From patchwork Wed Nov 6 18:17:08 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007646
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 01/15] aarch64: Make more use of TARGET_STREAMING_SME2
In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000")
Date: Wed, 06 Nov 2024 18:17:08 +0000

Some code was checking TARGET_STREAMING and TARGET_SME2 separately, but we
now have a macro to test both at once.

gcc/
	* config/aarch64/aarch64-sme.md: Use TARGET_STREAMING_SME2
	instead of separate TARGET_STREAMING and TARGET_SME2 tests.
	* config/aarch64/aarch64-sve2.md: Likewise.
	* config/aarch64/iterators.md: Likewise.
---
 gcc/config/aarch64/aarch64-sme.md  | 34 ++++++++++++------------------
 gcc/config/aarch64/aarch64-sve2.md |  6 +++---
 gcc/config/aarch64/iterators.md    |  8 +++----
 3 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
index 78ad2fc699f..9215f51b01f 100644
--- a/gcc/config/aarch64/aarch64-sme.md
+++ b/gcc/config/aarch64/aarch64-sme.md
@@ -1334,7 +1334,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:VNx8HI_ONLY 1 "register_operand" "w")
 	   (match_operand:VNx8HI_ONLY 2 "register_operand" "x")]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   "ll\tza.d[%w0, 0:3], %1.h, %2.h"
 )

@@ -1348,7 +1348,7 @@ (define_insn "*aarch64_sme__plus"
 	   (match_operand:VNx8HI_ONLY 2 "register_operand" "w")
 	   (match_operand:VNx8HI_ONLY 3 "register_operand" "x")]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   {
     operands[4] = GEN_INT (INTVAL (operands[1]) + 3);
     return "ll\tza.d[%w0, %1:%4], %2.h, %3.h";
   }

@@ -1364,7 +1364,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:SME_ZA_HIx24 1 "aligned_register_operand" "Uw")
 	   (match_operand:SME_ZA_HIx24 2 "aligned_register_operand" "Uw")]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   "ll\tza.d[%w0, 0:3, vgx], %1, %2"
 )

@@ -1378,7 +1378,7 @@ (define_insn "*aarch64_sme__plus"
 	   (match_operand:SME_ZA_HIx24 2 "aligned_register_operand" "Uw")
 	   (match_operand:SME_ZA_HIx24 3 "aligned_register_operand" "Uw")]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   {
     operands[4] = GEN_INT (INTVAL (operands[1]) + 3);
     return "ll\tza.d[%w0, %1:%4, vgx], %2, %3";
   }

@@ -1395,7 +1395,7 @@ (define_insn "@aarch64_sme_single_"
 	   (vec_duplicate:SME_ZA_HIx24
 	     (match_operand: 2 "register_operand" "x"))]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   "ll\tza.d[%w0, 0:3, vgx], %1, %2.h"
 )

@@ -1410,7 +1410,7 @@ (define_insn "*aarch64_sme_single__p
 	   (vec_duplicate:SME_ZA_HIx24
 	     (match_operand: 3 "register_operand" "x"))]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   {
     operands[4] = GEN_INT (INTVAL (operands[1]) + 3);
     return "ll\tza.d[%w0, %1:%4, vgx], %2, %3.h";
   }

@@ -1429,7 +1429,7 @@ (define_insn "@aarch64_sme_lane_"
 	      (match_operand:SI 3 "const_int_operand")]
 	     UNSPEC_SVE_LANE_SELECT)]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   "ll\tza.d[%w0, 0:3], %1, %2.h[%3]"
 )

@@ -1446,7 +1446,7 @@ (define_insn "*aarch64_sme_lane_"
 	      (match_operand:SI 4 "const_int_operand")]
 	     UNSPEC_SVE_LANE_SELECT)]
 	  SME_INT_TERNARY_SLICE))]
-  "TARGET_SME2 && TARGET_SME_I16I64 && TARGET_STREAMING_SME"
+  "TARGET_STREAMING_SME2 && TARGET_SME_I16I64"
   {
     operands[5] = GEN_INT (INTVAL (operands[1]) + 3);
     return "ll\tza.d[%w0, %1:%5], %2, %3.h[%4]";
   }

@@ -1642,8 +1642,7 @@ (define_insn "@aarch64_sme_"
"aligned_register_operand" "Uw") (match_operand:SME_ZA_SDFx24 2 "aligned_register_operand" "Uw")] SME_FP_TERNARY_SLICE))] - "TARGET_SME2 - && TARGET_STREAMING_SME + "TARGET_STREAMING_SME2 && == " "\tza.[%w0, 0, vgx], %1, %2" ) @@ -1658,8 +1657,7 @@ (define_insn "*aarch64_sme__plus" (match_operand:SME_ZA_SDFx24 2 "aligned_register_operand" "Uw") (match_operand:SME_ZA_SDFx24 3 "aligned_register_operand" "Uw")] SME_FP_TERNARY_SLICE))] - "TARGET_SME2 - && TARGET_STREAMING_SME + "TARGET_STREAMING_SME2 && == " "\tza.[%w0, %1, vgx], %2, %3" ) @@ -1674,8 +1672,7 @@ (define_insn "@aarch64_sme_single_ (vec_duplicate:SME_ZA_SDFx24 (match_operand: 2 "register_operand" "x"))] SME_FP_TERNARY_SLICE))] - "TARGET_SME2 - && TARGET_STREAMING_SME + "TARGET_STREAMING_SME2 && == " "\tza.[%w0, 0, vgx], %1, %2." ) @@ -1691,8 +1688,7 @@ (define_insn "*aarch64_sme_single_ (vec_duplicate:SME_ZA_SDFx24 (match_operand: 3 "register_operand" "x"))] SME_FP_TERNARY_SLICE))] - "TARGET_SME2 - && TARGET_STREAMING_SME + "TARGET_STREAMING_SME2 && == " "\tza.[%w0, %1, vgx], %2, %3." ) @@ -1709,8 +1705,7 @@ (define_insn "@aarch64_sme_lane_" (match_operand:SI 3 "const_int_operand")] UNSPEC_SVE_LANE_SELECT)] SME_FP_TERNARY_SLICE))] - "TARGET_SME2 - && TARGET_STREAMING_SME + "TARGET_STREAMING_SME2 && == " "\tza.[%w0, 0, vgx], %1, %2.[%3]" ) @@ -1728,8 +1723,7 @@ (define_insn "*aarch64_sme_lane_" (match_operand:SI 4 "const_int_operand")] UNSPEC_SVE_LANE_SELECT)] SME_FP_TERNARY_SLICE))] - "TARGET_SME2 - && TARGET_STREAMING_SME + "TARGET_STREAMING_SME2 && == " "\tza.[%w0, %1, vgx], %2, %3.[%4]" ) diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index ac27124fb74..38ecdd1ccc1 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -2213,7 +2213,7 @@ (define_insn "@aarch64_sve_" (unspec:VNx16QI_ONLY [(match_operand:VNx16SI_ONLY 1 "aligned_register_operand" "Uw")] SVE_QCVTxN))] - "TARGET_SME2 && TARGET_STREAMING" + "TARGET_STREAMING_SME2" "\t%0.b, %1" ) @@ -2222,7 +2222,7 @@ (define_insn "@aarch64_sve_" (unspec:VNx8HI_ONLY [(match_operand:VNx8SI_ONLY 1 "aligned_register_operand" "Uw")] SVE_QCVTxN))] - "TARGET_SME2 && TARGET_STREAMING" + "TARGET_STREAMING_SME2" "\t%0.h, %1" ) @@ -2231,7 +2231,7 @@ (define_insn "@aarch64_sve_" (unspec:VNx8HI_ONLY [(match_operand:VNx8DI_ONLY 1 "aligned_register_operand" "Uw")] SVE_QCVTxN))] - "TARGET_SME2 && TARGET_STREAMING" + "TARGET_STREAMING_SME2" "\t%0.h, %1" ) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 8269b0cdcd9..4942631aa95 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -3051,16 +3051,16 @@ (define_int_iterator SVE_BFLOAT_TERNARY_LONG [UNSPEC_BFDOT UNSPEC_BFMLALB UNSPEC_BFMLALT - (UNSPEC_BFMLSLB "TARGET_SME2 && TARGET_STREAMING_SME") - (UNSPEC_BFMLSLT "TARGET_SME2 && TARGET_STREAMING_SME") + (UNSPEC_BFMLSLB "TARGET_STREAMING_SME2") + (UNSPEC_BFMLSLT "TARGET_STREAMING_SME2") (UNSPEC_BFMMLA "TARGET_NON_STREAMING")]) (define_int_iterator SVE_BFLOAT_TERNARY_LONG_LANE [UNSPEC_BFDOT UNSPEC_BFMLALB UNSPEC_BFMLALT - (UNSPEC_BFMLSLB "TARGET_SME2 && TARGET_STREAMING_SME") - (UNSPEC_BFMLSLT "TARGET_SME2 && TARGET_STREAMING_SME")]) + (UNSPEC_BFMLSLB "TARGET_STREAMING_SME2") + (UNSPEC_BFMLSLT "TARGET_STREAMING_SME2")]) (define_int_iterator SVE_INT_REDUCTION [UNSPEC_ANDV UNSPEC_IORV From patchwork Wed Nov 6 18:17:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford 
From patchwork Wed Nov 6 18:17:33 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007647
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 02/15] aarch64: Test TARGET_STREAMING instead of TARGET_STREAMING_SME
In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000")
Date: Wed, 06 Nov 2024 18:17:33 +0000

g:ede97598e2c recorded separate ISA requirements for streaming
and non-streaming mode.
The premise there was that AARCH64_FL_SME should not be included in the
streaming mode requirements, since:

(a) an __arm_streaming_compatible function wouldn't be in streaming
    mode if SME wasn't available.

(b) __arm_streaming_compatible functions only allow things that are
    possible in non-streaming mode, so the non-streaming architecture
    is enough to assemble the code, even if +sme isn't enabled.

(c) we reject __arm_streaming if +sme isn't enabled, so don't need to
    test it for individual intrinsics as well.

Later patches lean into this further.

This patch applies the same reasoning to the .md constructs for base
streaming-only SME instructions, guarding them with TARGET_STREAMING
rather than TARGET_STREAMING_SME.

gcc/
	* config/aarch64/aarch64.h (TARGET_SME): Expand comment.
	(TARGET_STREAMING_SME): Delete.
	* config/aarch64/aarch64-sme.md: Use TARGET_STREAMING instead of
	TARGET_STREAMING_SME.
	* config/aarch64/aarch64-sve2.md: Likewise.
---
 gcc/config/aarch64/aarch64-sme.md  | 28 ++++++++++++++--------------
 gcc/config/aarch64/aarch64-sve2.md |  8 ++++----
 gcc/config/aarch64/aarch64.h       |  6 ++----
 3 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
index 9215f51b01f..8fca138314c 100644
--- a/gcc/config/aarch64/aarch64-sme.md
+++ b/gcc/config/aarch64/aarch64-sme.md
@@ -481,7 +481,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand: 2 "register_operand" "Upl")
 	   (match_operand:SME_ZA_I 3 "aarch64_sve_ldff1_operand" "Utf")]
 	  SME_LD1))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "ld1\t{ za%0.[%w1, 0] }, %2/z, %3"
 )

@@ -496,7 +496,7 @@ (define_insn "@aarch64_sme__plus"
 	   (match_operand: 3 "register_operand" "Upl")
 	   (match_operand:SME_ZA_I 4 "aarch64_sve_ldff1_operand" "Utf")]
 	  SME_LD1))]
-  "TARGET_STREAMING_SME
+  "TARGET_STREAMING
    && UINTVAL (operands[2]) < 128 / "
   "ld1\t{ za%0.[%w1, %2] }, %3/z, %4"
 )

@@ -583,7 +583,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:SI 2 "register_operand" "Ucj")
 	   (match_operand: 3 "register_operand" "Upl")]
 	  SME_ST1))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "st1\t{ za%1.[%w2, 0] }, %3, %0"
 )

@@ -598,7 +598,7 @@ (define_insn "@aarch64_sme__plus"
 	     (match_operand:SI 3 "const_int_operand"))
 	   (match_operand: 4 "register_operand" "Upl")]
 	  SME_ST1))]
-  "TARGET_STREAMING_SME
+  "TARGET_STREAMING
    && UINTVAL (operands[3]) < 128 / "
   "st1\t{ za%1.[%w2, %3] }, %4, %0"
 )

@@ -663,7 +663,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:DI 3 "const_int_operand")
 	   (match_operand:SI 4 "register_operand" "Ucj")]
 	  SME_READ))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "mova\t%0., %2/m, za%3.[%w4, 0]"
 )

@@ -678,7 +678,7 @@ (define_insn "*aarch64_sme__plus"
 	   (plus:SI (match_operand:SI 4 "register_operand" "Ucj")
 		    (match_operand:SI 5 "const_int_operand"))]
 	  SME_READ))]
-  "TARGET_STREAMING_SME
+  "TARGET_STREAMING
    && UINTVAL (operands[5]) < 128 / "
   "mova\t%0., %2/m, za%3.[%w4, %5]"
 )

@@ -693,7 +693,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:DI 3 "const_int_operand")
 	   (match_operand:SI 4 "register_operand" "Ucj")]
 	  SME_READ))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "mova\t%0.q, %2/m, za%3.q[%w4, 0]"
 )

@@ -707,7 +707,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand: 2 "register_operand" "Upl")
 	   (match_operand:SVE_FULL 3 "register_operand" "w")]
 	  SME_WRITE))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "mova\tza%0.[%w1, 0], %2/m, %3."
 )

@@ -722,7 +722,7 @@ (define_insn "*aarch64_sme__plus"
 	   (match_operand: 3 "register_operand" "Upl")
 	   (match_operand:SVE_FULL 4 "register_operand" "w")]
 	  SME_WRITE))]
-  "TARGET_STREAMING_SME
+  "TARGET_STREAMING
    && UINTVAL (operands[2]) < 128 / "
   "mova\tza%0.[%w1, %2], %3/m, %4."
 )

@@ -737,7 +737,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:VNx2BI 2 "register_operand" "Upl")
 	   (match_operand:SVE_FULL 3 "register_operand" "w")]
 	  SME_WRITE))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "mova\tza%0.q[%w1, 0], %2/m, %3.q"
 )

@@ -917,7 +917,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand: 2 "register_operand" "Upl")
 	   (match_operand:SME_ZA_SDI 3 "register_operand" "w")]
 	  SME_BINARY_SDI))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "\tza%0., %1/m, %2/m, %3."
 )

@@ -1479,7 +1479,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:VNx16QI_ONLY 3 "register_operand" "w")
 	   (match_operand:VNx16QI_ONLY 4 "register_operand" "w")]
 	  SME_INT_MOP))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   "\tza%0.s, %1/m, %2/m, %3.b, %4.b"
 )

@@ -1494,7 +1494,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:VNx8HI_ONLY 3 "register_operand" "w")
 	   (match_operand:VNx8HI_ONLY 4 "register_operand" "w")]
 	  SME_INT_MOP))]
-  "TARGET_STREAMING_SME && TARGET_SME_I16I64"
+  "TARGET_STREAMING && TARGET_SME_I16I64"
   "\tza%0.d, %1/m, %2/m, %3.h, %4.h"
 )

@@ -1887,7 +1887,7 @@ (define_insn "@aarch64_sme_"
 	   (match_operand:SME_MOP_HSDF 3 "register_operand" "w")
 	   (match_operand:SME_MOP_HSDF 4 "register_operand" "w")]
 	  SME_FP_MOP))]
-  "TARGET_STREAMING_SME
+  "TARGET_STREAMING
    && ( == 32) == ( <= 32)"
   "\tza%0., %1/m, %2/m, %3., %4."
 )

diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 38ecdd1ccc1..a7b29daeba4 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -560,7 +560,7 @@ (define_insn "@aarch64_sve_clamp"
 	    (match_operand:SVE_FULL_I 1 "register_operand")
 	    (match_operand:SVE_FULL_I 2 "register_operand"))
 	  (match_operand:SVE_FULL_I 3 "register_operand")))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   {@ [cons: =0, 1, 2, 3; attrs: movprfx]
      [ w, %0, w, w; * ] clamp\t%0., %2., %3.
      [ ?&w, w, w, w; yes ] movprfx\t%0, %1\;clamp\t%0., %2., %3.

@@ -580,7 +580,7 @@ (define_insn_and_split "*aarch64_sve_clamp_x"
 	      UNSPEC_PRED_X)
 	    (match_operand:SVE_FULL_I 3 "register_operand"))]
 	  UNSPEC_PRED_X))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   {@ [cons: =0, 1, 2, 3; attrs: movprfx]
     [ w, %0, w, w; * ] #
    [ ?&w, w, w, w; yes ] #

@@ -3182,7 +3182,7 @@ (define_insn "@aarch64_pred_"
 	     [(match_operand:SVE_FULL 2 "register_operand")]
 	     UNSPEC_REVD_ONLY)]
 	  UNSPEC_PRED_X))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   {@ [ cons: =0 , 1   , 2 ; attrs: movprfx ]
     [ w        , Upl , 0 ; *              ] revd\t%0.q, %1/m, %2.q
    [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;revd\t%0.q, %1/m, %2.q

@@ -3198,7 +3198,7 @@ (define_insn "@cond_"
 	     UNSPEC_REVD_ONLY)
 	   (match_operand:SVE_FULL 3 "register_operand")]
 	  UNSPEC_SEL))]
-  "TARGET_STREAMING_SME"
+  "TARGET_STREAMING"
   {@ [ cons: =0 , 1   , 2 , 3 ; attrs: movprfx ]
     [ w        , Upl , w , 0 ; *              ] revd\t%0.q, %1/m, %2.q
    [ ?&w      , Upl , w , w ; yes            ] movprfx\t%0, %3\;revd\t%0.q, %1/m, %2.q

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 593319fd472..d17f40ce22e 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -339,12 +339,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 #define TARGET_SVE2_SM4 (AARCH64_HAVE_ISA (SVE2_SM4) && TARGET_NON_STREAMING)
 /* SME instructions, enabled through +sme.  Note that this does not
-   imply anything about the state of PSTATE.SM.  */
+   imply anything about the state of PSTATE.SM; instructions that require
+   SME and streaming mode should use TARGET_STREAMING instead.  */
 #define TARGET_SME AARCH64_HAVE_ISA (SME)

-/* Same with streaming mode enabled.  */
-#define TARGET_STREAMING_SME (TARGET_STREAMING && TARGET_SME)
-
 /* The FEAT_SME_I16I64 extension to SME, enabled through +sme-i16i64.  */
 #define TARGET_SME_I16I64 AARCH64_HAVE_ISA (SME_I16I64)
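To illustrate point (c) above with a hedged sketch (hypothetical source,
not from the patch, assuming a compile without +sme): the front end
rejects the streaming attribute itself, so streaming-only patterns can
assume that TARGET_SME holds whenever TARGET_STREAMING does:

    /* Compiled with, say, -march=armv9-a (no +sme).  */
    void f (void) __arm_streaming;            /* rejected up front */
    void g (void) __arm_streaming_compatible; /* OK; must also work
                                                 without SME */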
From patchwork Wed Nov 6 18:17:53 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007652
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 03/15] aarch64: Tweak definition of all_data & co
In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000")
Date: Wed, 06 Nov 2024 18:17:53 +0000

Past extensions to SVE have required new subsets of all_data; the
SVE2.1 patches will add another.  This patch tries to make this more
scalable by defining the multi-size *_data macros to be unions of
single-size *_data macros.

gcc/
	* config/aarch64/aarch64-sve-builtins.cc (TYPES_all_data): Redefine
	in terms of single-size *_data definitions.
	(TYPES_bhs_data, TYPES_hs_data, TYPES_sd_data): Likewise.
	(TYPES_b_data, TYPES_h_data, TYPES_s_data): New macros.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc | 51 +++++++++++++---------
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 44b7f6edae5..c0b5115fdeb 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -231,12 +231,11 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_all_arith(S, D) \
   TYPES_all_float (S, D), TYPES_all_integer (S, D)

-/* _bf16
-   _f16 _f32 _f64
-   _s8 _s16 _s32 _s64
-   _u8 _u16 _u32 _u64.  */
 #define TYPES_all_data(S, D) \
-  S (bf16), TYPES_all_arith (S, D)
+  TYPES_b_data (S, D), \
+  TYPES_h_data (S, D), \
+  TYPES_s_data (S, D), \
+  TYPES_d_data (S, D)

 /* _b only.  */
 #define TYPES_b(S, D) \
@@ -255,6 +254,11 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_b_integer(S, D) \
   S (s8), TYPES_b_unsigned (S, D)

+/* _s8
+   _u8.  */
+#define TYPES_b_data(S, D) \
+  TYPES_b_integer (S, D)
+
 /* _s8 _s16
    _u8 _u16.  */
 #define TYPES_bh_integer(S, D) \
@@ -277,12 +281,10 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_bhs_integer(S, D) \
   TYPES_bhs_signed (S, D), TYPES_bhs_unsigned (S, D)

-/* _bf16
-   _f16 _f32
-   _s8 _s16 _s32
-   _u8 _u16 _u32.  */
 #define TYPES_bhs_data(S, D) \
-  S (bf16), S (f16), S (f32), TYPES_bhs_integer (S, D)
+  TYPES_b_data (S, D), \
+  TYPES_h_data (S, D), \
+  TYPES_s_data (S, D)

 /* _s16_s8 _s32_s16 _s64_s32
    _u16_u8 _u32_u16 _u64_u32.  */
@@ -295,6 +297,13 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_h_integer(S, D) \
   S (s16), S (u16)

+/* _bf16
+   _f16
+   _s16
+   _u16.  */
+#define TYPES_h_data(S, D) \
+  S (bf16), S (f16), TYPES_h_integer (S, D)
+
 /* _s16 _s32.  */
 #define TYPES_hs_signed(S, D) \
   S (s16), S (s32)
@@ -308,12 +317,9 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_hs_float(S, D) \
   S (f16), S (f32)

-/* _bf16
-   _f16 _f32
-   _s16 _s32
-   _u16 _u32.  */
 #define TYPES_hs_data(S, D) \
-  S (bf16), S (f16), S (f32), TYPES_hs_integer (S, D)
+  TYPES_h_data (S, D), \
+  TYPES_s_data (S, D)

 /* _u16 _u64.  */
 #define TYPES_hd_unsigned(S, D) \
@@ -352,10 +358,17 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_s_unsigned(S, D) \
   S (u32)

-/* _s32 _u32.  */
+/* _s32
+   _u32.  */
 #define TYPES_s_integer(S, D) \
   TYPES_s_signed (S, D), TYPES_s_unsigned (S, D)

+/* _f32
+   _s32
+   _u32.  */
+#define TYPES_s_data(S, D) \
+  TYPES_s_float (S, D), TYPES_s_integer (S, D)
+
 /* _s32 _s64.  */
 #define TYPES_sd_signed(S, D) \
   S (s32), S (s64)
@@ -369,11 +382,9 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_sd_integer(S, D) \
   TYPES_sd_signed (S, D), TYPES_sd_unsigned (S, D)

-/* _f32 _f64
-   _s32 _s64
-   _u32 _u64.  */
 #define TYPES_sd_data(S, D) \
-  S (f32), S (f64), TYPES_sd_integer (S, D)
+  TYPES_s_data (S, D), \
+  TYPES_d_data (S, D)

 /* _f16 _f32 _f64
    _s32 _s64
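As a sketch of how the new composition expands, using a stand-in
S (x) -> x callback rather than the real suffix-registration macros
(simplified; the real macros thread S and D through further helpers
such as TYPES_h_integer):

    #define TYPES_h_data(S, D)  S (bf16), S (f16), S (s16), S (u16)
    #define TYPES_s_data(S, D)  S (f32), S (s32), S (u32)
    #define TYPES_hs_data(S, D) TYPES_h_data (S, D), TYPES_s_data (S, D)

With S (x) defined as plain x, TYPES_hs_data (S, D) lists bf16, f16,
s16, u16, f32, s32, u32, the same set as the old hand-written
definition, and a future single-size set only has to be added in one
place.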
From patchwork Wed Nov 6 18:18:21 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007648
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 04/15] aarch64: Use braces in SVE TBL instructions
In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000")
Date: Wed, 06 Nov 2024 18:18:21 +0000

GCC previously used the older assembly syntax for SVE TBL, with no
braces around the second operand.  This patch switches to the newer,
official syntax, with braces around the operand.

The initial SVE binutils submission supported both syntaxes, so there
should be no issues with backwards compatibility.

gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_sve_tbl): Wrap the
	second operand in braces.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/dup_lane_bf16.c: Wrap the second
	TBL operand in braces.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_s16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_u64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_lane_u8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_bf16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_s16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_u64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/tbl_u8.c: Likewise.
	* gcc.target/aarch64/sve/slp_perm_6.c: Likewise.
	* gcc.target/aarch64/sve/slp_perm_7.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1_overrun.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_single_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_single_1.c: Likewise.
	* gcc.target/aarch64/sve/uzp1_1.c: Shorten the scan-assembler-nots
	to just "\ttbl\".
	* gcc.target/aarch64/sve/uzp2_1.c: Likewise.
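For concreteness, both forms assemble to the same encoding; a small
sketch (not from the patch, register choices approximate) using the
ACLE intrinsic that maps to this pattern:

    #include <arm_sve.h>

    svuint8_t
    permute (svuint8_t data, svuint8_t indices)
    {
      /* Old output (roughly):  tbl z0.b, z0.b, z1.b
         New output (roughly):  tbl z0.b, {z0.b}, z1.b  */
      return svtbl_u8 (data, indices);
    }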
---
 gcc/config/aarch64/aarch64-sve.md                  |  2 +-
 .../aarch64/sve/acle/asm/dup_lane_bf16.c           | 12 +++++------
 .../aarch64/sve/acle/asm/dup_lane_f16.c            | 12 +++++------
 .../aarch64/sve/acle/asm/dup_lane_f32.c            | 16 +++++++--------
 .../aarch64/sve/acle/asm/dup_lane_f64.c            | 18 ++++++++---------
 .../aarch64/sve/acle/asm/dup_lane_s16.c            | 12 +++++------
 .../aarch64/sve/acle/asm/dup_lane_s32.c            | 16 +++++++--------
 .../aarch64/sve/acle/asm/dup_lane_s64.c            | 20 +++++++++----------
 .../aarch64/sve/acle/asm/dup_lane_s8.c             |  8 ++++----
 .../aarch64/sve/acle/asm/dup_lane_u16.c            | 12 +++++------
 .../aarch64/sve/acle/asm/dup_lane_u32.c            | 16 +++++++--------
 .../aarch64/sve/acle/asm/dup_lane_u64.c            | 20 +++++++++----------
 .../aarch64/sve/acle/asm/dup_lane_u8.c             |  8 ++++----
 .../aarch64/sve/acle/asm/tbl_bf16.c                |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_f16.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_f32.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_f64.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_s16.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_s32.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_s64.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_s8.c       |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_u16.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_u32.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_u64.c      |  6 +++---
 .../gcc.target/aarch64/sve/acle/asm/tbl_u8.c       |  6 +++---
 .../gcc.target/aarch64/sve/slp_perm_6.c            |  2 +-
 .../gcc.target/aarch64/sve/slp_perm_7.c            |  2 +-
 gcc/testsuite/gcc.target/aarch64/sve/uzp1_1.c      |  8 ++++----
 gcc/testsuite/gcc.target/aarch64/sve/uzp2_1.c      |  8 ++++----
 .../gcc.target/aarch64/sve/vec_perm_1.c            |  8 ++++----
 .../gcc.target/aarch64/sve/vec_perm_const_1.c      |  8 ++++----
 .../aarch64/sve/vec_perm_const_1_overrun.c         |  8 ++++----
 .../aarch64/sve/vec_perm_const_single_1.c          |  8 ++++----
 .../aarch64/sve/vec_perm_single_1.c                |  8 ++++----
 34 files changed, 152 insertions(+), 152 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index 06bd3e4bb2c..0955a697680 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -9040,7 +9040,7 @@ (define_insn "@aarch64_sve_tbl"
 	   (match_operand: 2 "register_operand" "w")]
 	  UNSPEC_TBL))]
   "TARGET_SVE"
-  "tbl\t%0., %1., %2."
+  "tbl\t%0., {%1.}, %2."
 )

 ;; -------------------------------------------------------------------------
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_bf16.c
index d05ad5adbb8..f328df5f2bb 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_bf16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_bf16.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_bf16_tied1:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_bf16_tied1, svbfloat16_t, uint16_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_bf16_untied:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z1\.h, \1
+** tbl z0\.h, {z1\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_bf16_untied, svbfloat16_t, uint16_t,
@@ -70,7 +70,7 @@ TEST_UNIFORM_Z (dup_lane_31_bf16, svbfloat16_t,
 /*
 ** dup_lane_32_bf16:
 ** mov (z[0-9]+\.h), #32
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_bf16, svbfloat16_t,
@@ -80,7 +80,7 @@ TEST_UNIFORM_Z (dup_lane_32_bf16, svbfloat16_t,
 /*
 ** dup_lane_63_bf16:
 ** mov (z[0-9]+\.h), #63
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_bf16, svbfloat16_t,
@@ -90,7 +90,7 @@ TEST_UNIFORM_Z (dup_lane_63_bf16, svbfloat16_t,
 /*
 ** dup_lane_64_bf16:
 ** mov (z[0-9]+\.h), #64
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_bf16, svbfloat16_t,
@@ -100,7 +100,7 @@ TEST_UNIFORM_Z (dup_lane_64_bf16, svbfloat16_t,
 /*
 ** dup_lane_255_bf16:
 ** mov (z[0-9]+\.h), #255
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_bf16, svbfloat16_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f16.c
index 142afbb2452..82e882d4703 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f16.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_f16_tied1:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_f16_tied1, svfloat16_t, uint16_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_f16_untied:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z1\.h, \1
+** tbl z0\.h, {z1\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_f16_untied, svfloat16_t, uint16_t,
@@ -70,7 +70,7 @@ TEST_UNIFORM_Z (dup_lane_31_f16, svfloat16_t,
 /*
 ** dup_lane_32_f16:
 ** mov (z[0-9]+\.h), #32
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_f16, svfloat16_t,
@@ -80,7 +80,7 @@ TEST_UNIFORM_Z (dup_lane_32_f16, svfloat16_t,
 /*
 ** dup_lane_63_f16:
 ** mov (z[0-9]+\.h), #63
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_f16, svfloat16_t,
@@ -90,7 +90,7 @@ TEST_UNIFORM_Z (dup_lane_63_f16, svfloat16_t,
 /*
 ** dup_lane_64_f16:
 ** mov (z[0-9]+\.h), #64
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_f16, svfloat16_t,
@@ -100,7 +100,7 @@ TEST_UNIFORM_Z (dup_lane_64_f16, svfloat16_t,
 /*
 ** dup_lane_255_f16:
 ** mov (z[0-9]+\.h), #255
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_f16, svfloat16_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f32.c
index b32068a37d6..ad67aa88306 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f32.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_f32_tied1:
 ** mov (z[0-9]+\.s), w0
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_f32_tied1, svfloat32_t, uint32_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_f32_untied:
 ** mov (z[0-9]+\.s), w0
-** tbl z0\.s, z1\.s, \1
+** tbl z0\.s, {z1\.s}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_f32_untied, svfloat32_t, uint32_t,
@@ -52,7 +52,7 @@ TEST_UNIFORM_Z (dup_lane_15_f32, svfloat32_t,
 /*
 ** dup_lane_16_f32:
 ** mov (z[0-9]+\.s), #16
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_16_f32, svfloat32_t,
@@ -62,7 +62,7 @@ TEST_UNIFORM_Z (dup_lane_16_f32, svfloat32_t,
 /*
 ** dup_lane_31_f32:
 ** mov (z[0-9]+\.s), #31
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_31_f32, svfloat32_t,
@@ -72,7 +72,7 @@ TEST_UNIFORM_Z (dup_lane_31_f32, svfloat32_t,
 /*
 ** dup_lane_32_f32:
 ** mov (z[0-9]+\.s), #32
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_f32, svfloat32_t,
@@ -82,7 +82,7 @@ TEST_UNIFORM_Z (dup_lane_32_f32, svfloat32_t,
 /*
 ** dup_lane_63_f32:
 ** mov (z[0-9]+\.s), #63
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_f32, svfloat32_t,
@@ -92,7 +92,7 @@ TEST_UNIFORM_Z (dup_lane_63_f32, svfloat32_t,
 /*
 ** dup_lane_64_f32:
 ** mov (z[0-9]+\.s), #64
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_f32, svfloat32_t,
@@ -102,7 +102,7 @@ TEST_UNIFORM_Z (dup_lane_64_f32, svfloat32_t,
 /*
 ** dup_lane_255_f32:
 ** mov (z[0-9]+\.s), #255
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_f32, svfloat32_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f64.c
index 64af50d0c09..39f8e81eacb 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_f64.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_x0_f64_tied1:
 ** mov (z[0-9]+\.d), x0
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_x0_f64_tied1, svfloat64_t, uint64_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_x0_f64_untied:
 ** mov (z[0-9]+\.d), x0
-** tbl z0\.d, z1\.d, \1
+** tbl z0\.d, {z1\.d}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_x0_f64_untied, svfloat64_t, uint64_t,
@@ -43,7 +43,7 @@ TEST_UNIFORM_Z (dup_lane_0_f64_untied, svfloat64_t,
 /*
 ** dup_lane_15_f64:
 ** mov (z[0-9]+\.d), #15
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_15_f64, svfloat64_t,
@@ -53,7 +53,7 @@ TEST_UNIFORM_Z (dup_lane_15_f64, svfloat64_t,
 /*
 ** dup_lane_16_f64:
 ** mov (z[0-9]+\.d), #16
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_16_f64, svfloat64_t,
@@ -63,7 +63,7 @@ TEST_UNIFORM_Z (dup_lane_16_f64, svfloat64_t,
 /*
 ** dup_lane_31_f64:
 ** mov (z[0-9]+\.d), #31
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_31_f64, svfloat64_t,
@@ -73,7 +73,7 @@ TEST_UNIFORM_Z (dup_lane_31_f64, svfloat64_t,
 /*
 ** dup_lane_32_f64:
 ** mov (z[0-9]+\.d), #32
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_f64, svfloat64_t,
@@ -83,7 +83,7 @@ TEST_UNIFORM_Z (dup_lane_32_f64, svfloat64_t,
 /*
 ** dup_lane_63_f64:
 ** mov (z[0-9]+\.d), #63
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_f64, svfloat64_t,
@@ -93,7 +93,7 @@ TEST_UNIFORM_Z (dup_lane_63_f64, svfloat64_t,
 /*
 ** dup_lane_64_f64:
 ** mov (z[0-9]+\.d), #64
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_f64, svfloat64_t,
@@ -103,7 +103,7 @@ TEST_UNIFORM_Z (dup_lane_64_f64, svfloat64_t,
 /*
 ** dup_lane_255_f64:
 ** mov (z[0-9]+\.d), #255
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_f64, svfloat64_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s16.c
index 3b6f20696fa..2315d496979 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s16.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_s16_tied1:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_s16_tied1, svint16_t, uint16_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_s16_untied:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z1\.h, \1
+** tbl z0\.h, {z1\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_s16_untied, svint16_t, uint16_t,
@@ -88,7 +88,7 @@ TEST_UNIFORM_Z (dup_lane_31_s16, svint16_t,
 /*
 ** dup_lane_32_s16:
 ** mov (z[0-9]+\.h), #32
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_s16, svint16_t,
@@ -98,7 +98,7 @@ TEST_UNIFORM_Z (dup_lane_32_s16, svint16_t,
 /*
 ** dup_lane_63_s16:
 ** mov (z[0-9]+\.h), #63
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_s16, svint16_t,
@@ -108,7 +108,7 @@ TEST_UNIFORM_Z (dup_lane_63_s16, svint16_t,
 /*
 ** dup_lane_64_s16:
 ** mov (z[0-9]+\.h), #64
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_s16, svint16_t,
@@ -118,7 +118,7 @@ TEST_UNIFORM_Z (dup_lane_64_s16, svint16_t,
 /*
 ** dup_lane_255_s16:
 ** mov (z[0-9]+\.h), #255
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_s16, svint16_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s32.c
index bf597fdf66c..98b4ff052e2 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s32.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_s32_tied1:
 ** mov (z[0-9]+\.s), w0
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_s32_tied1, svint32_t, uint32_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_s32_untied:
 ** mov (z[0-9]+\.s), w0
-** tbl z0\.s, z1\.s, \1
+** tbl z0\.s, {z1\.s}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_s32_untied, svint32_t, uint32_t,
@@ -70,7 +70,7 @@ TEST_UNIFORM_Z (dup_lane_15_s32, svint32_t,
 /*
 ** dup_lane_16_s32:
 ** mov (z[0-9]+\.s), #16
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_16_s32, svint32_t,
@@ -80,7 +80,7 @@ TEST_UNIFORM_Z (dup_lane_16_s32, svint32_t,
 /*
 ** dup_lane_31_s32:
 ** mov (z[0-9]+\.s), #31
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_31_s32, svint32_t,
@@ -90,7 +90,7 @@ TEST_UNIFORM_Z (dup_lane_31_s32, svint32_t,
 /*
 ** dup_lane_32_s32:
 ** mov (z[0-9]+\.s), #32
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_s32, svint32_t,
@@ -100,7 +100,7 @@ TEST_UNIFORM_Z (dup_lane_32_s32, svint32_t,
 /*
 ** dup_lane_63_s32:
 ** mov (z[0-9]+\.s), #63
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_s32, svint32_t,
@@ -110,7 +110,7 @@ TEST_UNIFORM_Z (dup_lane_63_s32, svint32_t,
 /*
 ** dup_lane_64_s32:
 ** mov (z[0-9]+\.s), #64
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_s32, svint32_t,
@@ -120,7 +120,7 @@ TEST_UNIFORM_Z (dup_lane_64_s32, svint32_t,
 /*
 ** dup_lane_255_s32:
 ** mov (z[0-9]+\.s), #255
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_s32, svint32_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s64.c
index f2f3a1770cd..b9bf4ba1287 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s64.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_x0_s64_tied1:
 ** mov (z[0-9]+\.d), x0
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_x0_s64_tied1, svint64_t, uint64_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_x0_s64_untied:
 ** mov (z[0-9]+\.d), x0
-** tbl z0\.d, z1\.d, \1
+** tbl z0\.d, {z1\.d}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_x0_s64_untied, svint64_t, uint64_t,
@@ -52,7 +52,7 @@ TEST_UNIFORM_Z (dup_lane_7_s64, svint64_t,
 /*
 ** dup_lane_8_s64:
 ** mov (z[0-9]+\.d), #8
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_8_s64, svint64_t,
@@ -62,7 +62,7 @@ TEST_UNIFORM_Z (dup_lane_8_s64, svint64_t,
 /*
 ** dup_lane_15_s64:
 ** mov (z[0-9]+\.d), #15
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_15_s64, svint64_t,
@@ -72,7 +72,7 @@ TEST_UNIFORM_Z (dup_lane_15_s64, svint64_t,
 /*
 ** dup_lane_16_s64:
 ** mov (z[0-9]+\.d), #16
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_16_s64, svint64_t,
@@ -82,7 +82,7 @@ TEST_UNIFORM_Z (dup_lane_16_s64, svint64_t,
 /*
 ** dup_lane_31_s64:
 ** mov (z[0-9]+\.d), #31
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_31_s64, svint64_t,
@@ -92,7 +92,7 @@ TEST_UNIFORM_Z (dup_lane_31_s64, svint64_t,
 /*
 ** dup_lane_32_s64:
 ** mov (z[0-9]+\.d), #32
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_s64, svint64_t,
@@ -102,7 +102,7 @@ TEST_UNIFORM_Z (dup_lane_32_s64, svint64_t,
 /*
 ** dup_lane_63_s64:
 ** mov (z[0-9]+\.d), #63
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_s64, svint64_t,
@@ -112,7 +112,7 @@ TEST_UNIFORM_Z (dup_lane_63_s64, svint64_t,
 /*
 ** dup_lane_64_s64:
 ** mov (z[0-9]+\.d), #64
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_s64, svint64_t,
@@ -122,7 +122,7 @@ TEST_UNIFORM_Z (dup_lane_64_s64, svint64_t,
 /*
 ** dup_lane_255_s64:
 ** mov (z[0-9]+\.d), #255
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_s64, svint64_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s8.c
index f5a07e9f337..8de7e5597b7 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_s8.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_s8_tied1:
 ** mov (z[0-9]+\.b), w0
-** tbl z0\.b, z0\.b, \1
+** tbl z0\.b, {z0\.b}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_s8_tied1, svint8_t, uint8_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_s8_untied:
 ** mov (z[0-9]+\.b), w0
-** tbl z0\.b, z1\.b, \1
+** tbl z0\.b, {z1\.b}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_s8_untied, svint8_t, uint8_t,
@@ -106,7 +106,7 @@ TEST_UNIFORM_Z (dup_lane_63_s8, svint8_t,
 /*
 ** dup_lane_64_s8:
 ** mov (z[0-9]+\.b), #64
-** tbl z0\.b, z0\.b, \1
+** tbl z0\.b, {z0\.b}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_s8, svint8_t,
@@ -116,7 +116,7 @@ TEST_UNIFORM_Z (dup_lane_64_s8, svint8_t,
 /*
 ** dup_lane_255_s8:
 ** mov (z[0-9]+\.b), #-1
-** tbl z0\.b, z0\.b, \1
+** tbl z0\.b, {z0\.b}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_s8, svint8_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u16.c
index e5135caa545..408b18338a8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u16.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_u16_tied1:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_u16_tied1, svuint16_t, uint16_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_u16_untied:
 ** mov (z[0-9]+\.h), w0
-** tbl z0\.h, z1\.h, \1
+** tbl z0\.h, {z1\.h}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_u16_untied, svuint16_t, uint16_t,
@@ -88,7 +88,7 @@ TEST_UNIFORM_Z (dup_lane_31_u16, svuint16_t,
 /*
 ** dup_lane_32_u16:
 ** mov (z[0-9]+\.h), #32
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_u16, svuint16_t,
@@ -98,7 +98,7 @@ TEST_UNIFORM_Z (dup_lane_32_u16, svuint16_t,
 /*
 ** dup_lane_63_u16:
 ** mov (z[0-9]+\.h), #63
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_u16, svuint16_t,
@@ -108,7 +108,7 @@ TEST_UNIFORM_Z (dup_lane_63_u16, svuint16_t,
 /*
 ** dup_lane_64_u16:
 ** mov (z[0-9]+\.h), #64
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_u16, svuint16_t,
@@ -118,7 +118,7 @@ TEST_UNIFORM_Z (dup_lane_64_u16, svuint16_t,
 /*
 ** dup_lane_255_u16:
 ** mov (z[0-9]+\.h), #255
-** tbl z0\.h, z0\.h, \1
+** tbl z0\.h, {z0\.h}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_u16, svuint16_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u32.c
index 7e972aca70a..d53cf056e96 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u32.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_u32_tied1:
 ** mov (z[0-9]+\.s), w0
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_u32_tied1, svuint32_t, uint32_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_u32_untied:
 ** mov (z[0-9]+\.s), w0
-** tbl z0\.s, z1\.s, \1
+** tbl z0\.s, {z1\.s}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_u32_untied, svuint32_t, uint32_t,
@@ -70,7 +70,7 @@ TEST_UNIFORM_Z (dup_lane_15_u32, svuint32_t,
 /*
 ** dup_lane_16_u32:
 ** mov (z[0-9]+\.s), #16
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_16_u32, svuint32_t,
@@ -80,7 +80,7 @@ TEST_UNIFORM_Z (dup_lane_16_u32, svuint32_t,
 /*
 ** dup_lane_31_u32:
 ** mov (z[0-9]+\.s), #31
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_31_u32, svuint32_t,
@@ -90,7 +90,7 @@ TEST_UNIFORM_Z (dup_lane_31_u32, svuint32_t,
 /*
 ** dup_lane_32_u32:
 ** mov (z[0-9]+\.s), #32
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_u32, svuint32_t,
@@ -100,7 +100,7 @@ TEST_UNIFORM_Z (dup_lane_32_u32, svuint32_t,
 /*
 ** dup_lane_63_u32:
 ** mov (z[0-9]+\.s), #63
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_u32, svuint32_t,
@@ -110,7 +110,7 @@ TEST_UNIFORM_Z (dup_lane_63_u32, svuint32_t,
 /*
 ** dup_lane_64_u32:
 ** mov (z[0-9]+\.s), #64
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_u32, svuint32_t,
@@ -120,7 +120,7 @@ TEST_UNIFORM_Z (dup_lane_64_u32, svuint32_t,
 /*
 ** dup_lane_255_u32:
 ** mov (z[0-9]+\.s), #255
-** tbl z0\.s, z0\.s, \1
+** tbl z0\.s, {z0\.s}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_u32, svuint32_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u64.c
index 5097b7e9673..c6c0e886247 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u64.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_x0_u64_tied1:
 ** mov (z[0-9]+\.d), x0
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_x0_u64_tied1, svuint64_t, uint64_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_x0_u64_untied:
 ** mov (z[0-9]+\.d), x0
-** tbl z0\.d, z1\.d, \1
+** tbl z0\.d, {z1\.d}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_x0_u64_untied, svuint64_t, uint64_t,
@@ -52,7 +52,7 @@ TEST_UNIFORM_Z (dup_lane_7_u64, svuint64_t,
 /*
 ** dup_lane_8_u64:
 ** mov (z[0-9]+\.d), #8
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_8_u64, svuint64_t,
@@ -62,7 +62,7 @@ TEST_UNIFORM_Z (dup_lane_8_u64, svuint64_t,
 /*
 ** dup_lane_15_u64:
 ** mov (z[0-9]+\.d), #15
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_15_u64, svuint64_t,
@@ -72,7 +72,7 @@ TEST_UNIFORM_Z (dup_lane_15_u64, svuint64_t,
 /*
 ** dup_lane_16_u64:
 ** mov (z[0-9]+\.d), #16
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_16_u64, svuint64_t,
@@ -82,7 +82,7 @@ TEST_UNIFORM_Z (dup_lane_16_u64, svuint64_t,
 /*
 ** dup_lane_31_u64:
 ** mov (z[0-9]+\.d), #31
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_31_u64, svuint64_t,
@@ -92,7 +92,7 @@ TEST_UNIFORM_Z (dup_lane_31_u64, svuint64_t,
 /*
 ** dup_lane_32_u64:
 ** mov (z[0-9]+\.d), #32
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_32_u64, svuint64_t,
@@ -102,7 +102,7 @@ TEST_UNIFORM_Z (dup_lane_32_u64, svuint64_t,
 /*
 ** dup_lane_63_u64:
 ** mov (z[0-9]+\.d), #63
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_63_u64, svuint64_t,
@@ -112,7 +112,7 @@ TEST_UNIFORM_Z (dup_lane_63_u64, svuint64_t,
 /*
 ** dup_lane_64_u64:
 ** mov (z[0-9]+\.d), #64
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_u64, svuint64_t,
@@ -122,7 +122,7 @@ TEST_UNIFORM_Z (dup_lane_64_u64, svuint64_t,
 /*
 ** dup_lane_255_u64:
 ** mov (z[0-9]+\.d), #255
-** tbl z0\.d, z0\.d, \1
+** tbl z0\.d, {z0\.d}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_u64, svuint64_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u8.c
index 25fdf0acb4a..58709f5edb8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_u8.c
@@ -5,7 +5,7 @@
 /*
 ** dup_lane_w0_u8_tied1:
 ** mov (z[0-9]+\.b), w0
-** tbl z0\.b, z0\.b, \1
+** tbl z0\.b, {z0\.b}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_u8_tied1, svuint8_t, uint8_t,
@@ -15,7 +15,7 @@
 /*
 ** dup_lane_w0_u8_untied:
 ** mov (z[0-9]+\.b), w0
-** tbl z0\.b, z1\.b, \1
+** tbl z0\.b, {z1\.b}, \1
 ** ret
 */
 TEST_UNIFORM_ZX (dup_lane_w0_u8_untied, svuint8_t, uint8_t,
@@ -106,7 +106,7 @@ TEST_UNIFORM_Z (dup_lane_63_u8, svuint8_t,
 /*
 ** dup_lane_64_u8:
 ** mov (z[0-9]+\.b), #64
-** tbl z0\.b, z0\.b, \1
+** tbl z0\.b, {z0\.b}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_64_u8, svuint8_t,
@@ -116,7 +116,7 @@ TEST_UNIFORM_Z (dup_lane_64_u8, svuint8_t,
 /*
 ** dup_lane_255_u8:
 ** mov (z[0-9]+\.b), #-1
-** tbl z0\.b, z0\.b, \1
+** tbl z0\.b, {z0\.b}, \1
 ** ret
 */
 TEST_UNIFORM_Z (dup_lane_255_u8, svuint8_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_bf16.c
index 8c077d11897..379c890e9ec 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_bf16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_bf16.c
@@ -4,7 +4,7 @@
 /*
 ** tbl_bf16_tied1:
-** tbl z0\.h, z0\.h, z4\.h
+** tbl z0\.h, {z0\.h}, z4\.h
 ** ret
 */
 TEST_DUAL_Z (tbl_bf16_tied1, svbfloat16_t, svuint16_t,
@@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_bf16_tied1, svbfloat16_t, svuint16_t,
 /*
 ** tbl_bf16_tied2:
-** tbl z0\.h, z4\.h, z0\.h
+** tbl z0\.h, {z4\.h}, z0\.h
 ** ret
 */
 TEST_DUAL_Z_REV (tbl_bf16_tied2, svbfloat16_t, svuint16_t,
@@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_bf16_tied2, svbfloat16_t, svuint16_t,
 /*
 ** tbl_bf16_untied:
-** tbl z0\.h, z1\.h, z4\.h
+** tbl z0\.h, {z1\.h}, z4\.h
 ** ret
 */
 TEST_DUAL_Z (tbl_bf16_untied, svbfloat16_t, svuint16_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f16.c
index 94b6104123d..270d9d35622 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f16.c
@@ -4,7 +4,7 @@
 /*
 ** tbl_f16_tied1:
-** tbl z0\.h, z0\.h, z4\.h
+** tbl z0\.h, {z0\.h}, z4\.h
 ** ret
 */
 TEST_DUAL_Z (tbl_f16_tied1, svfloat16_t, svuint16_t,
@@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_f16_tied1, svfloat16_t, svuint16_t,
 /*
 ** tbl_f16_tied2:
-** tbl z0\.h, z4\.h, z0\.h
+** tbl z0\.h, {z4\.h}, z0\.h
 ** ret
 */
 TEST_DUAL_Z_REV (tbl_f16_tied2, svfloat16_t, svuint16_t,
@@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_f16_tied2, svfloat16_t, svuint16_t,
 /*
 ** tbl_f16_untied:
-** tbl z0\.h, z1\.h, z4\.h
+** tbl z0\.h, {z1\.h}, z4\.h
 ** ret
 */
 TEST_DUAL_Z (tbl_f16_untied, svfloat16_t, svuint16_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f32.c
index 741d3bdcf72..f3d32745788 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f32.c
@@ -4,7 +4,7 @@
 /*
 ** tbl_f32_tied1:
-** tbl z0\.s, z0\.s, z4\.s
+** tbl z0\.s, {z0\.s}, z4\.s
 ** ret
 */
 TEST_DUAL_Z (tbl_f32_tied1, svfloat32_t, svuint32_t,
@@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_f32_tied1, svfloat32_t, svuint32_t,
 /*
 ** tbl_f32_tied2:
-** tbl z0\.s, z4\.s, z0\.s
+** tbl z0\.s, {z4\.s}, z0\.s
 ** ret
 */
 TEST_DUAL_Z_REV (tbl_f32_tied2, svfloat32_t, svuint32_t,
@@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_f32_tied2, svfloat32_t, svuint32_t,
 /*
 ** tbl_f32_untied:
-** tbl z0\.s, z1\.s, z4\.s
+** tbl z0\.s, {z1\.s}, z4\.s
 ** ret
 */
 TEST_DUAL_Z (tbl_f32_untied, svfloat32_t, svuint32_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f64.c
index 3c24e9a59e0..a7f81be7ca1 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_f64.c
@@ -4,7 +4,7 @@
 /*
 ** tbl_f64_tied1:
-** tbl z0\.d, z0\.d, z4\.d
+** tbl z0\.d, {z0\.d}, z4\.d
 ** ret
 */
 TEST_DUAL_Z (tbl_f64_tied1, svfloat64_t, svuint64_t,
@@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_f64_tied1, svfloat64_t, svuint64_t,
 /*
 ** tbl_f64_tied2:
-** tbl z0\.d, z4\.d, z0\.d
+** tbl z0\.d, {z4\.d}, z0\.d
 ** ret
 */
 TEST_DUAL_Z_REV (tbl_f64_tied2, svfloat64_t, svuint64_t,
@@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_f64_tied2, svfloat64_t, svuint64_t,
 /*
 ** tbl_f64_untied:
-** tbl z0\.d, z1\.d, z4\.d
+** tbl z0\.d, {z1\.d}, z4\.d
 ** ret
 */
 TEST_DUAL_Z (tbl_f64_untied, svfloat64_t, svuint64_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s16.c
index 2ec9c389a01..9dbf4bbd307 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s16.c
@@ -4,7 +4,7 @@
 /*
 ** tbl_s16_tied1:
-** tbl z0\.h, z0\.h, z4\.h
+** tbl z0\.h, {z0\.h}, z4\.h
 ** ret
 */
 TEST_DUAL_Z (tbl_s16_tied1, svint16_t, svuint16_t,
@@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_s16_tied1, svint16_t, svuint16_t,
 /*
 ** tbl_s16_tied2:
-** tbl z0\.h, z4\.h, z0\.h
+** tbl z0\.h, {z4\.h}, z0\.h
 ** ret
 */
 TEST_DUAL_Z_REV (tbl_s16_tied2, svint16_t, svuint16_t,
@@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_s16_tied2, svint16_t, svuint16_t,
 /*
 ** tbl_s16_untied:
-** tbl z0\.h, z1\.h, z4\.h
+** tbl z0\.h, {z1\.h}, z4\.h
 ** ret
 */
 TEST_DUAL_Z (tbl_s16_untied, svint16_t, svuint16_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s32.c
index 98b2d8d8bc0..da7fa65ecb0 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s32.c
@@ -4,7 +4,7 @@
 /*
 ** tbl_s32_tied1:
-** tbl z0\.s, z0\.s, z4\.s
+** tbl z0\.s, {z0\.s}, z4\.s
 ** ret
 */
 TEST_DUAL_Z (tbl_s32_tied1, svint32_t, svuint32_t,
@@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_s32_tied1, svint32_t, svuint32_t,
 /*
 ** tbl_s32_tied2:
-** tbl z0\.s, z4\.s, z0\.s
+** tbl z0\.s, {z4\.s}, z0\.s
 ** ret
 */
 TEST_DUAL_Z_REV (tbl_s32_tied2, svint32_t, svuint32_t,
@@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_s32_tied2, svint32_t, svuint32_t,
 /*
 ** tbl_s32_untied:
-** tbl z0\.s, z1\.s, z4\.s
+** tbl z0\.s, {z1\.s}, z4\.s
 ** ret
 */
 TEST_DUAL_Z (tbl_s32_untied, svint32_t, svuint32_t,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s64.c
index 0138a80d2e2..feffce38345 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s64.c
@@ -4,7 +4,7 @@
 /*
 ** tbl_s64_tied1:
-** tbl z0\.d, z0\.d, z4\.d
+** tbl z0\.d, {z0\.d}, z4\.d
 ** ret
 */
 TEST_DUAL_Z (tbl_s64_tied1, svint64_t, svuint64_t,
@@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_s64_tied1, svint64_t, svuint64_t,
 /*
 **
tbl_s64_tied2: -** tbl z0\.d, z4\.d, z0\.d +** tbl z0\.d, {z4\.d}, z0\.d ** ret */ TEST_DUAL_Z_REV (tbl_s64_tied2, svint64_t, svuint64_t, @@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_s64_tied2, svint64_t, svuint64_t, /* ** tbl_s64_untied: -** tbl z0\.d, z1\.d, z4\.d +** tbl z0\.d, {z1\.d}, z4\.d ** ret */ TEST_DUAL_Z (tbl_s64_untied, svint64_t, svuint64_t, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s8.c index 7818d1b6d58..bd974e73550 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_s8.c @@ -4,7 +4,7 @@ /* ** tbl_s8_tied1: -** tbl z0\.b, z0\.b, z4\.b +** tbl z0\.b, {z0\.b}, z4\.b ** ret */ TEST_DUAL_Z (tbl_s8_tied1, svint8_t, svuint8_t, @@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_s8_tied1, svint8_t, svuint8_t, /* ** tbl_s8_tied2: -** tbl z0\.b, z4\.b, z0\.b +** tbl z0\.b, {z4\.b}, z0\.b ** ret */ TEST_DUAL_Z_REV (tbl_s8_tied2, svint8_t, svuint8_t, @@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_s8_tied2, svint8_t, svuint8_t, /* ** tbl_s8_untied: -** tbl z0\.b, z1\.b, z4\.b +** tbl z0\.b, {z1\.b}, z4\.b ** ret */ TEST_DUAL_Z (tbl_s8_untied, svint8_t, svuint8_t, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u16.c index f15da921162..088a2491d62 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u16.c @@ -4,7 +4,7 @@ /* ** tbl_u16_tied1: -** tbl z0\.h, z0\.h, z4\.h +** tbl z0\.h, {z0\.h}, z4\.h ** ret */ TEST_DUAL_Z (tbl_u16_tied1, svuint16_t, svuint16_t, @@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_u16_tied1, svuint16_t, svuint16_t, /* ** tbl_u16_tied2: -** tbl z0\.h, z4\.h, z0\.h +** tbl z0\.h, {z4\.h}, z0\.h ** ret */ TEST_DUAL_Z_REV (tbl_u16_tied2, svuint16_t, svuint16_t, @@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_u16_tied2, svuint16_t, svuint16_t, /* ** tbl_u16_untied: -** tbl z0\.h, z1\.h, z4\.h +** tbl z0\.h, {z1\.h}, z4\.h ** ret */ TEST_DUAL_Z (tbl_u16_untied, svuint16_t, svuint16_t, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u32.c index 494300436f1..d7450004cf6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u32.c @@ -4,7 +4,7 @@ /* ** tbl_u32_tied1: -** tbl z0\.s, z0\.s, z4\.s +** tbl z0\.s, {z0\.s}, z4\.s ** ret */ TEST_DUAL_Z (tbl_u32_tied1, svuint32_t, svuint32_t, @@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_u32_tied1, svuint32_t, svuint32_t, /* ** tbl_u32_tied2: -** tbl z0\.s, z4\.s, z0\.s +** tbl z0\.s, {z4\.s}, z0\.s ** ret */ TEST_DUAL_Z_REV (tbl_u32_tied2, svuint32_t, svuint32_t, @@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_u32_tied2, svuint32_t, svuint32_t, /* ** tbl_u32_untied: -** tbl z0\.s, z1\.s, z4\.s +** tbl z0\.s, {z1\.s}, z4\.s ** ret */ TEST_DUAL_Z (tbl_u32_untied, svuint32_t, svuint32_t, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u64.c index 158990e12c0..66e41250ac2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u64.c @@ -4,7 +4,7 @@ /* ** tbl_u64_tied1: -** tbl z0\.d, z0\.d, z4\.d +** tbl z0\.d, {z0\.d}, z4\.d ** ret */ TEST_DUAL_Z (tbl_u64_tied1, svuint64_t, svuint64_t, @@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_u64_tied1, svuint64_t, svuint64_t, /* ** tbl_u64_tied2: -** tbl z0\.d, z4\.d, z0\.d +** tbl z0\.d, {z4\.d}, z0\.d 
** ret */ TEST_DUAL_Z_REV (tbl_u64_tied2, svuint64_t, svuint64_t, @@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_u64_tied2, svuint64_t, svuint64_t, /* ** tbl_u64_untied: -** tbl z0\.d, z1\.d, z4\.d +** tbl z0\.d, {z1\.d}, z4\.d ** ret */ TEST_DUAL_Z (tbl_u64_untied, svuint64_t, svuint64_t, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u8.c index a46309a95f1..f1a23413785 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_u8.c @@ -4,7 +4,7 @@ /* ** tbl_u8_tied1: -** tbl z0\.b, z0\.b, z4\.b +** tbl z0\.b, {z0\.b}, z4\.b ** ret */ TEST_DUAL_Z (tbl_u8_tied1, svuint8_t, svuint8_t, @@ -13,7 +13,7 @@ TEST_DUAL_Z (tbl_u8_tied1, svuint8_t, svuint8_t, /* ** tbl_u8_tied2: -** tbl z0\.b, z4\.b, z0\.b +** tbl z0\.b, {z4\.b}, z0\.b ** ret */ TEST_DUAL_Z_REV (tbl_u8_tied2, svuint8_t, svuint8_t, @@ -22,7 +22,7 @@ TEST_DUAL_Z_REV (tbl_u8_tied2, svuint8_t, svuint8_t, /* ** tbl_u8_untied: -** tbl z0\.b, z1\.b, z4\.b +** tbl z0\.b, {z1\.b}, z4\.b ** ret */ TEST_DUAL_Z (tbl_u8_untied, svuint8_t, svuint8_t, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_6.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_6.c index 28824611be8..1a411292409 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_6.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_6.c @@ -19,4 +19,4 @@ f (uint8_t *restrict a, uint8_t *restrict b) } } -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, {z[0-9]+\.b}, z[0-9]+\.b\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_7.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_7.c index da9e0a271a0..8dc7f1faba0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_7.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_perm_7.c @@ -19,4 +19,4 @@ f (uint8_t *restrict a, uint8_t *restrict b) } } -/* { dg-final { scan-assembler {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} } } */ +/* { dg-final { scan-assembler {\ttbl\tz[0-9]+\.b, {z[0-9]+\.b}, z[0-9]+\.b\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/uzp1_1.c b/gcc/testsuite/gcc.target/aarch64/sve/uzp1_1.c index 789fb0c28e0..84c6c6f1c60 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/uzp1_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/uzp1_1.c @@ -31,10 +31,10 @@ UZP1 (vnx4sf, ((vnx4si) { 0, 2, 4, 6, 8, 10, 12, 14 })); UZP1 (vnx8hf, ((vnx8hi) { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 })); -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} } } */ -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} } } */ -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} } } */ -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ /* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ /* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/uzp2_1.c b/gcc/testsuite/gcc.target/aarch64/sve/uzp2_1.c index def490daa12..1336cafc5c7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/uzp2_1.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/uzp2_1.c @@ -30,10 +30,10 @@ UZP2 (vnx4sf, ((vnx4si) { 1, 3, 5, 7, 9, 11, 13, 15 })); UZP2 (vnx8hf, ((vnx8hi) { 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 })); -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} } } */ -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} } } */ -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} } } */ -/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ +/* { dg-final { scan-assembler-not {\ttbl\t} } } */ /* { dg-final { scan-assembler-times {\tuzp2\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */ /* { dg-final { scan-assembler-times {\tuzp2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_1.c b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_1.c index 74a48bfdd60..6b60eb02f98 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_1.c @@ -26,7 +26,7 @@ VEC_PERM (vnx2df, vnx2di); VEC_PERM (vnx4sf, vnx4si); VEC_PERM (vnx8hf, vnx8hi); -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */ -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */ -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */ -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, {z[0-9]+\.d}, z[0-9]+\.d\n} 4 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, {z[0-9]+\.s}, z[0-9]+\.s\n} 4 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, {z[0-9]+\.h}, z[0-9]+\.h\n} 4 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, {z[0-9]+\.b}, z[0-9]+\.b\n} 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1.c b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1.c index 3194342f280..548cb49200c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1.c @@ -31,7 +31,7 @@ VEC_PERM_CONST (vnx4sf, ((vnx4si) { 1, 9, 13, 11, 2, 5, 4, 2 })); VEC_PERM_CONST (vnx8hf, ((vnx8hi) { 8, 27, 5, 4, 21, 12, 13, 0, 22, 1, 8, 9, 3, 24, 15, 1 })); -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */ -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */ -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */ -/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, {z[0-9]+\.d}, z[0-9]+\.d\n} 4 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, {z[0-9]+\.s}, z[0-9]+\.s\n} 4 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, {z[0-9]+\.h}, z[0-9]+\.h\n} 4 } } */ +/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, {z[0-9]+\.b}, z[0-9]+\.b\n} 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1_overrun.c b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1_overrun.c index b0732d0cc77..34e34696ed2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1_overrun.c 
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_1_overrun.c
@@ -62,7 +62,7 @@ VEC_PERM_CONST_OVERRUN (vnx8hf, ((vnx8hi) { 8 + (32 * 3), 27 + (32 * 1),
 					    3 + (32 * 2), 24 + (32 * 2),
 					    15 + (32 * 1), 1 + (32 * 1) }));
 
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, {z[0-9]+\.d}, z[0-9]+\.d\n} 4 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, {z[0-9]+\.s}, z[0-9]+\.s\n} 4 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, {z[0-9]+\.h}, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, {z[0-9]+\.b}, z[0-9]+\.b\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_single_1.c b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_single_1.c
index 61122ba11b8..cdbd9ef8506 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_single_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_const_single_1.c
@@ -30,7 +30,7 @@ VEC_PERM_SINGLE (vnx4sf, ((vnx4si) { 4, 5, 6, 0, 2, 7, 4, 2 }));
 VEC_PERM_SINGLE (vnx8hf, ((vnx8hi) { 8, 7, 5, 4, 11, 12, 13, 0,
				      1, 1, 8, 9, 3, 14, 15, 1 }));
 
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, {z[0-9]+\.d}, z[0-9]+\.d\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, {z[0-9]+\.s}, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, {z[0-9]+\.h}, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, {z[0-9]+\.b}, z[0-9]+\.b\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_single_1.c b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_single_1.c
index 41646d3c20d..40d6da39896 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_single_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vec_perm_single_1.c
@@ -25,7 +25,7 @@ VEC_PERM (vnx2df, vnx2di)
 VEC_PERM (vnx4sf, vnx4si)
 VEC_PERM (vnx8hf, vnx8hi)
 
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, {z[0-9]+\.d}, z[0-9]+\.d\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, {z[0-9]+\.s}, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, {z[0-9]+\.h}, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, {z[0-9]+\.b}, z[0-9]+\.b\n} 1 } } */
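The testsuite changes above all follow the same pattern: the TBL table
operand is now matched as a one-register list.  As a minimal
illustration (the C function below is ours, purely for exposition, and
is not part of the testsuite), a plain svtbl call now assembles to the
braced form:

#include <arm_sve.h>

/* Illustrative example only: svtbl selects elements of DATA according
   to the byte indices in INDICES.  With this series applied, GCC emits
   the one-register-list form of TBL for this function:

	tbl	z0.b, {z0.b}, z1.b
	ret  */
svuint8_t
select_bytes (svuint8_t data, svuint8_t indices)
{
  return svtbl_u8 (data, indices);
}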
From patchwork Wed Nov 6 18:18:48 2024
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 05/15] aarch64: Add an abstraction for vector base addresses
Date: Wed, 06 Nov 2024 18:18:48 +0000

In the upcoming SVE2.1 svld1q and svst1q intrinsics, the relationship
between the base vector and the data vector differs from existing
gather/scatter intrinsics.  This patch adds a new abstraction to handle
the difference.
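For a sense of how the hook generalises (an illustrative sketch only;
the shape name below is ours and is not part of this patch), a shape
for a hypothetical quadword gather such as svld1q could override it so
that the base vector always has 64-bit unsigned elements, whatever the
width of the data elements:

/* Illustrative sketch only: a shape whose gathers always take a vector
   of 64-bit unsigned base addresses, regardless of the data element
   size.  */
struct load_gather128_base : public overloaded_base<0>
{
  type_suffix_index
  vector_base_type (type_suffix_index) const override
  {
    return TYPE_SUFFIX_u64;
  }
};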
gcc/
	* config/aarch64/aarch64-sve-builtins.h
	(function_shape::vector_base_type): New member function.
	* config/aarch64/aarch64-sve-builtins.cc
	(function_shape::vector_base_type): Likewise.
	(function_resolver::resolve_sv_displacement): Use it.
	(function_resolver::resolve_gather_address): Likewise.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc | 24 ++++++++++++++++------
 gcc/config/aarch64/aarch64-sve-builtins.h  |  2 ++
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index c0b5115fdeb..a259f637a29 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -1176,6 +1176,21 @@ aarch64_const_binop (enum tree_code code, tree arg1, tree arg2)
   return NULL_TREE;
 }
 
+/* Return the type that a vector base should have in a gather load or
+   scatter store involving vectors of type TYPE.  In an extending load,
+   TYPE is the result of the extension; in a truncating store, it is the
+   input to the truncation.
+
+   Index vectors have the same width as base vectors, but can be either
+   signed or unsigned.  */
+type_suffix_index
+function_shape::vector_base_type (type_suffix_index type) const
+{
+  unsigned int required_bits = type_suffixes[type].element_bits;
+  gcc_assert (required_bits == 32 || required_bits == 64);
+  return required_bits == 32 ? TYPE_SUFFIX_u32 : TYPE_SUFFIX_u64;
+}
+
 /* Return a hash code for a function_instance.  */
 hashval_t
 function_instance::hash () const
@@ -2750,7 +2765,8 @@ function_resolver::resolve_sv_displacement (unsigned int argno,
       return mode;
     }
 
-  unsigned int required_bits = type_suffixes[type].element_bits;
+  auto base_type = shape->vector_base_type (type);
+  unsigned int required_bits = type_suffixes[base_type].element_bits;
   if (required_bits == 32
       && displacement_units () == UNITS_elements
       && !lookup_form (MODE_s32index, type)
@@ -2900,11 +2916,7 @@ function_resolver::resolve_gather_address (unsigned int argno,
     return MODE_none;
 
   /* Check whether the type is the right one.  */
-  unsigned int required_bits = type_suffixes[type].element_bits;
-  gcc_assert (required_bits == 32 || required_bits == 64);
-  type_suffix_index required_type = (required_bits == 32
-				     ? TYPE_SUFFIX_u32
-				     : TYPE_SUFFIX_u64);
+  auto required_type = shape->vector_base_type (type);
   if (required_type != base_type)
     {
       error_at (location, "passing %qT to argument %d of %qE,"
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index d5cc6e0a40d..1fb7abe132f 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -784,6 +784,8 @@ public:
      more common than false, so provide a default definition.  */
   virtual bool explicit_group_suffix_p () const { return true; }
 
+  virtual type_suffix_index vector_base_type (type_suffix_index) const;
+
   /* Define all functions associated with the given group.  */
   virtual void build (function_builder &,
		       const function_group_info &) const = 0;

From patchwork Wed Nov 6 18:19:09 2024
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 06/15] aarch64: Add an abstraction for scatter store type inference
Date: Wed, 06 Nov 2024 18:19:09 +0000

Until now, all data arguments to a scatter store needed to have 32-bit
or 64-bit elements.  This isn't true for the upcoming SVE2.1 svst1q
scatter intrinsic, so this patch adds an abstraction around the
restriction.
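As an illustrative sketch of the intended use (this is our example, not
code from the patch), a shape for something like svst1q could override
the new infer_vector_type hook to accept any vector type, reusing the
general function_resolver::infer_vector_type:

/* Illustrative sketch only: accept data vectors of any element size
   rather than requiring 32-bit or 64-bit elements.  */
type_suffix_index
infer_vector_type (function_resolver &r, unsigned int argno) const override
{
  return r.infer_vector_type (argno);
}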
gcc/
	* config/aarch64/aarch64-sve-builtins-shapes.cc
	(store_scatter_base::infer_vector_type): New virtual member
	function.
	(store_scatter_base::resolve): Use it.
---
 gcc/config/aarch64/aarch64-sve-builtins-shapes.cc | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index f190770250f..e1204c283b6 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -994,12 +994,18 @@ struct store_scatter_base : public overloaded_base<0>
     mode_suffix_index mode;
     type_suffix_index type;
     if (!r.check_gp_argument (has_displacement_p ? 3 : 2, i, nargs)
-	|| (type = r.infer_sd_vector_type (nargs - 1)) == NUM_TYPE_SUFFIXES
+	|| (type = infer_vector_type (r, nargs - 1)) == NUM_TYPE_SUFFIXES
	|| (mode = r.resolve_gather_address (i, type, false)) == MODE_none)
       return error_mark_node;
 
     return r.resolve_to (mode, type);
   }
+
+  virtual type_suffix_index
+  infer_vector_type (function_resolver &r, unsigned int argno) const
+  {
+    return r.infer_sd_vector_type (argno);
+  }
 };
 
 /* Base class for ternary operations in which the final argument is an

From patchwork Wed Nov 6 18:19:32 2024
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 07/15] aarch64: Parameterise SVE pointer type inference
Date: Wed, 06 Nov 2024 18:19:32 +0000

All extending gather load intrinsics encode the source type in their
name (e.g. svld1sb for an extending load from signed bytes).  The type
of the extension result has to be specified using an explicit type
suffix; it isn't something that can be inferred from the arguments,
since there are multiple valid choices for the same arguments.

This meant that type inference for gather loads was only needed for
non-extending loads, in which case the pointer target had to be a
32-bit or 64-bit element type.  The gather_scatter_p argument to
function_resolver::infer_pointer_type therefore controlled two things:
how we should react to vector base addresses, and whether we should
require a minimum element size of 32.

The element size restriction doesn't apply to the upcoming SVE2.1
svld1q intrinsic, so this patch adds a separate argument for the
minimum element size requirement.
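As an illustrative sketch (not code from this patch), a load shape that
places no size requirement on the pointer target could override the new
hook like so:

/* Illustrative sketch only: allow pointers to elements of any size.  */
function_resolver::target_type_restrictions
get_target_type_restrictions (const function_instance &) const override
{
  return function_resolver::TARGET_ANY;
}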
gcc/
	* config/aarch64/aarch64-sve-builtins.h
	(function_resolver::target_type_restrictions): New enum.
	(function_resolver::infer_pointer_type): Add an extra argument
	that specifies what the target type can be.
	* config/aarch64/aarch64-sve-builtins.cc
	(function_resolver::infer_pointer_type): Likewise.
	* config/aarch64/aarch64-sve-builtins-shapes.cc
	(load_gather_sv_base::get_target_type_restrictions): New virtual
	member function.
	(load_gather_sv_base::resolve): Use it.  Update call to
	infer_pointer_type.
---
 gcc/config/aarch64/aarch64-sve-builtins-shapes.cc | 10 +++++++++-
 gcc/config/aarch64/aarch64-sve-builtins.cc        |  8 +++++---
 gcc/config/aarch64/aarch64-sve-builtins.h         |  4 +++-
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index e1204c283b6..cf321540b60 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -815,14 +815,22 @@ struct load_gather_sv_base : public overloaded_base<0>
     unsigned int i, nargs;
     mode_suffix_index mode;
     type_suffix_index type;
+    auto restrictions = get_target_type_restrictions (r);
     if (!r.check_gp_argument (2, i, nargs)
-	|| (type = r.infer_pointer_type (i, true)) == NUM_TYPE_SUFFIXES
+	|| (type = r.infer_pointer_type (i, true,
+					 restrictions)) == NUM_TYPE_SUFFIXES
	|| (mode = r.resolve_sv_displacement (i + 1, type, true),
	    mode == MODE_none))
       return error_mark_node;
 
     return r.resolve_to (mode, type);
   }
+
+  virtual function_resolver::target_type_restrictions
+  get_target_type_restrictions (const function_instance &) const
+  {
+    return function_resolver::TARGET_32_64;
+  }
 };
 
 /* Base class for load_ext_gather_index and load_ext_gather_offset,
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index a259f637a29..9fb0d6fd416 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -1998,10 +1998,12 @@ function_resolver::infer_64bit_scalar_integer_pair (unsigned int argno)
    corresponding type suffix.  Return that type suffix on success,
    otherwise report an error and return NUM_TYPE_SUFFIXES.
    GATHER_SCATTER_P is true if the function is a gather/scatter
-   operation, and so requires a pointer to 32-bit or 64-bit data.  */
+   operation.  RESTRICTIONS describes any additional restrictions
+   on the target type.  */
 type_suffix_index
 function_resolver::infer_pointer_type (unsigned int argno,
-				       bool gather_scatter_p)
+				       bool gather_scatter_p,
+				       target_type_restrictions restrictions)
 {
   tree actual = get_argument_type (argno);
   if (actual == error_mark_node)
@@ -2027,7 +2029,7 @@ function_resolver::infer_pointer_type (unsigned int argno,
       return NUM_TYPE_SUFFIXES;
     }
   unsigned int bits = type_suffixes[type].element_bits;
-  if (gather_scatter_p && bits != 32 && bits != 64)
+  if (restrictions == TARGET_32_64 && bits != 32 && bits != 64)
     {
       error_at (location, "passing %qT to argument %d of %qE, which"
		" expects a pointer to 32-bit or 64-bit elements",
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index 1fb7abe132f..5bd9b88d117 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -488,6 +488,7 @@ public:
 class function_resolver : public function_call_info
 {
 public:
+  enum target_type_restrictions { TARGET_ANY, TARGET_32_64 };
   enum { SAME_SIZE = 256, HALF_SIZE, QUARTER_SIZE };
   static const type_class_index SAME_TYPE_CLASS = NUM_TYPE_CLASSES;
 
@@ -518,7 +519,8 @@ public:
   vector_type_index infer_predicate_type (unsigned int);
   type_suffix_index infer_integer_scalar_type (unsigned int);
   type_suffix_index infer_64bit_scalar_integer_pair (unsigned int);
-  type_suffix_index infer_pointer_type (unsigned int, bool = false);
+  type_suffix_index infer_pointer_type (unsigned int, bool = false,
+					target_type_restrictions = TARGET_ANY);
   sve_type infer_sve_type (unsigned int);
   sve_type infer_vector_or_tuple_type (unsigned int, unsigned int);
   type_suffix_index infer_vector_type (unsigned int);

From patchwork Wed Nov 6 18:20:03 2024
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 08/15] aarch64: Factor out part of the SVE ext_def class
Date: Wed, 06 Nov 2024 18:20:03 +0000

This patch factors out some of ext_def into a base class, so that it
can be reused for the SVE2.1 svextq intrinsic.
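To illustrate the intended reuse (a sketch only; the real svextq shape
comes later in the series, and the name and bound below are ours), a
new shape can now inherit build and resolve from ext_base and provide
just its own immediate check:

/* Illustrative sketch only: an ext-style shape whose immediate indexes
   within a 128-bit granule, so the scaled byte offset must be in
   [0, 15] rather than [0, 255].  */
struct extq_def : public ext_base
{
  bool
  check (function_checker &c) const override
  {
    unsigned int bytes = c.type_suffix (0).element_bytes;
    return c.require_immediate_range (2, 0, 16 / bytes - 1);
  }
};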
gcc/
	* config/aarch64/aarch64-sve-builtins-shapes.cc (ext_base): New base
	class, extracted from...
	(ext_def): ...here.
---
 .../aarch64/aarch64-sve-builtins-shapes.cc | 32 +++++++++++--------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index cf321540b60..62277afaeff 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -735,6 +735,23 @@ struct binary_za_slice_opt_single_base : public overloaded_base<1>
   }
 };
 
+/* Base class for ext. */
+struct ext_base : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group) const override
+  {
+    b.add_overloaded_functions (group, MODE_none);
+    build_all (b, "v0,v0,v0,su64", group, MODE_none);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    return r.resolve_uniform (2, 1);
+  }
+};
+
 /* Base class for inc_dec and inc_dec_pat. */
 struct inc_dec_base : public overloaded_base<0>
 {
@@ -2413,21 +2430,8 @@ SHAPE (dupq)
 
    where the final argument is an integer constant expression that when
    multiplied by the number of bytes in t0 is in the range [0, 255].  */
-struct ext_def : public overloaded_base<0>
+struct ext_def : public ext_base
 {
-  void
-  build (function_builder &b, const function_group_info &group) const override
-  {
-    b.add_overloaded_functions (group, MODE_none);
-    build_all (b, "v0,v0,v0,su64", group, MODE_none);
-  }
-
-  tree
-  resolve (function_resolver &r) const override
-  {
-    return r.resolve_uniform (2, 1);
-  }
-
   bool
   check (function_checker &c) const override
   {

From patchwork Wed Nov 6 18:20:26 2024
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 09/15] aarch64: Sort some SVE2 lists alphabetically
Date: Wed, 06 Nov 2024 18:20:26 +0000

gcc/
	* config/aarch64/aarch64-sve-builtins-sve2.def: Sort entries
	alphabetically.
	* config/aarch64/aarch64-sve-builtins-sve2.h: Likewise.
	* config/aarch64/aarch64-sve-builtins-sve2.cc: Likewise.
---
 .../aarch64/aarch64-sve-builtins-sve2.cc  | 24 +++++++-------
 .../aarch64/aarch64-sve-builtins-sve2.def | 32 +++++++++----------
 .../aarch64/aarch64-sve-builtins-sve2.h   | 14 ++++----
 3 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc index f0ab7400ef5..24e95afd6eb 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc @@ -589,20 +589,20 @@ FUNCTION (svabalb, unspec_based_add_function, (UNSPEC_SABDLB, UNSPEC_UABDLB, -1)) FUNCTION (svabalt, unspec_based_add_function, (UNSPEC_SABDLT, UNSPEC_UABDLT, -1)) +FUNCTION (svabdlb, unspec_based_function, (UNSPEC_SABDLB, UNSPEC_UABDLB, -1)) +FUNCTION (svabdlt, unspec_based_function, (UNSPEC_SABDLT, UNSPEC_UABDLT, -1)) +FUNCTION (svadalp, unspec_based_function, (UNSPEC_SADALP, UNSPEC_UADALP, -1)) FUNCTION (svadclb, unspec_based_function, (-1, UNSPEC_ADCLB, -1)) FUNCTION (svadclt, unspec_based_function, (-1, UNSPEC_ADCLT, -1)) FUNCTION (svaddhnb, unspec_based_function, (UNSPEC_ADDHNB, UNSPEC_ADDHNB, -1)) FUNCTION (svaddhnt, unspec_based_function, (UNSPEC_ADDHNT, UNSPEC_ADDHNT, -1)) -FUNCTION (svabdlb, unspec_based_function, (UNSPEC_SABDLB, UNSPEC_UABDLB, -1)) -FUNCTION (svabdlt, unspec_based_function, (UNSPEC_SABDLT, UNSPEC_UABDLT, -1)) -FUNCTION (svadalp, unspec_based_function, (UNSPEC_SADALP, UNSPEC_UADALP, -1)) FUNCTION (svaddlb, unspec_based_function, (UNSPEC_SADDLB, UNSPEC_UADDLB, -1)) FUNCTION (svaddlbt, unspec_based_function, (UNSPEC_SADDLBT, -1, -1)) FUNCTION (svaddlt, unspec_based_function, (UNSPEC_SADDLT, UNSPEC_UADDLT, -1)) -FUNCTION (svaddwb, unspec_based_function, (UNSPEC_SADDWB, UNSPEC_UADDWB, -1)) -FUNCTION (svaddwt, unspec_based_function, (UNSPEC_SADDWT, UNSPEC_UADDWT, -1)) FUNCTION (svaddp, unspec_based_pred_function, (UNSPEC_ADDP, UNSPEC_ADDP, UNSPEC_FADDP)) +FUNCTION (svaddwb, unspec_based_function, (UNSPEC_SADDWB, UNSPEC_UADDWB, -1)) +FUNCTION (svaddwt, unspec_based_function, (UNSPEC_SADDWT, UNSPEC_UADDWT, -1)) FUNCTION (svaesd, fixed_insn_function, (CODE_FOR_aarch64_sve2_aesd)) FUNCTION (svaese, fixed_insn_function, (CODE_FOR_aarch64_sve2_aese)) FUNCTION (svaesimc, fixed_insn_function, (CODE_FOR_aarch64_sve2_aesimc)) @@ -649,12 +649,12 @@ FUNCTION (svldnt1uh_gather, svldnt1_gather_extend_impl, (TYPE_SUFFIX_u16)) FUNCTION (svldnt1uw_gather, svldnt1_gather_extend_impl, (TYPE_SUFFIX_u32)) FUNCTION (svlogb, unspec_based_function, (-1, -1, UNSPEC_COND_FLOGB)) FUNCTION (svmatch, svmatch_svnmatch_impl, (UNSPEC_MATCH)) +FUNCTION (svmaxnmp, unspec_based_pred_function, (-1, -1, UNSPEC_FMAXNMP)) FUNCTION (svmaxp, unspec_based_pred_function, (UNSPEC_SMAXP, UNSPEC_UMAXP, UNSPEC_FMAXP)) -FUNCTION (svmaxnmp, unspec_based_pred_function, (-1, -1, UNSPEC_FMAXNMP)) +FUNCTION (svminnmp, unspec_based_pred_function, (-1, -1, UNSPEC_FMINNMP)) FUNCTION (svminp, unspec_based_pred_function, (UNSPEC_SMINP,
UNSPEC_UMINP, UNSPEC_FMINP)) -FUNCTION (svminnmp, unspec_based_pred_function, (-1, -1, UNSPEC_FMINNMP)) FUNCTION (svmlalb, unspec_based_mla_function, (UNSPEC_SMULLB, UNSPEC_UMULLB, UNSPEC_FMLALB)) FUNCTION (svmlalb_lane, unspec_based_mla_lane_function, (UNSPEC_SMULLB, @@ -723,15 +723,15 @@ FUNCTION (svqdmullt_lane, unspec_based_lane_function, (UNSPEC_SQDMULLT, FUNCTION (svqneg, rtx_code_function, (SS_NEG, UNKNOWN, UNKNOWN)) FUNCTION (svqrdcmlah, svqrdcmlah_impl,) FUNCTION (svqrdcmlah_lane, svqrdcmlah_lane_impl,) -FUNCTION (svqrdmulh, unspec_based_function, (UNSPEC_SQRDMULH, -1, -1)) -FUNCTION (svqrdmulh_lane, unspec_based_lane_function, (UNSPEC_SQRDMULH, - -1, -1)) FUNCTION (svqrdmlah, unspec_based_function, (UNSPEC_SQRDMLAH, -1, -1)) FUNCTION (svqrdmlah_lane, unspec_based_lane_function, (UNSPEC_SQRDMLAH, -1, -1)) FUNCTION (svqrdmlsh, unspec_based_function, (UNSPEC_SQRDMLSH, -1, -1)) FUNCTION (svqrdmlsh_lane, unspec_based_lane_function, (UNSPEC_SQRDMLSH, -1, -1)) +FUNCTION (svqrdmulh, unspec_based_function, (UNSPEC_SQRDMULH, -1, -1)) +FUNCTION (svqrdmulh_lane, unspec_based_lane_function, (UNSPEC_SQRDMULH, + -1, -1)) FUNCTION (svqrshl, svqrshl_impl,) FUNCTION (svqrshr, unspec_based_uncond_function, (UNSPEC_SQRSHR, UNSPEC_UQRSHR, -1, 1)) @@ -805,12 +805,12 @@ FUNCTION (svunpk, svunpk_impl,) FUNCTION (svuqadd, svuqadd_impl,) FUNCTION (svuzp, multireg_permute, (UNSPEC_UZP)) FUNCTION (svuzpq, multireg_permute, (UNSPEC_UZPQ)) -FUNCTION (svzip, multireg_permute, (UNSPEC_ZIP)) -FUNCTION (svzipq, multireg_permute, (UNSPEC_ZIPQ)) FUNCTION (svwhilege, while_comparison, (UNSPEC_WHILEGE, UNSPEC_WHILEHS)) FUNCTION (svwhilegt, while_comparison, (UNSPEC_WHILEGT, UNSPEC_WHILEHI)) FUNCTION (svwhilerw, svwhilerw_svwhilewr_impl, (UNSPEC_WHILERW)) FUNCTION (svwhilewr, svwhilerw_svwhilewr_impl, (UNSPEC_WHILEWR)) FUNCTION (svxar, svxar_impl,) +FUNCTION (svzip, multireg_permute, (UNSPEC_ZIP)) +FUNCTION (svzipq, multireg_permute, (UNSPEC_ZIPQ)) } /* end namespace aarch64_sve */ diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index e4021559f36..12548fe39cb 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -21,13 +21,13 @@ DEF_SVE_FUNCTION (svaba, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (svabalb, ternary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svabalt, ternary_long_opt_n, hsd_integer, none) +DEF_SVE_FUNCTION (svabdlb, binary_long_opt_n, hsd_integer, none) +DEF_SVE_FUNCTION (svabdlt, binary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svadalp, binary_wide, hsd_integer, mxz) DEF_SVE_FUNCTION (svadclb, ternary_opt_n, sd_unsigned, none) DEF_SVE_FUNCTION (svadclt, ternary_opt_n, sd_unsigned, none) DEF_SVE_FUNCTION (svaddhnb, binary_narrowb_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svaddhnt, binary_narrowt_opt_n, hsd_integer, none) -DEF_SVE_FUNCTION (svabdlb, binary_long_opt_n, hsd_integer, none) -DEF_SVE_FUNCTION (svabdlt, binary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svaddlb, binary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svaddlbt, binary_long_opt_n, hsd_signed, none) DEF_SVE_FUNCTION (svaddlt, binary_long_opt_n, hsd_integer, none) @@ -54,8 +54,10 @@ DEF_SVE_FUNCTION (svhadd, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svhsub, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svhsubr, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svlogb, unary_to_int, all_float, mxz) -DEF_SVE_FUNCTION (svmaxp, binary, all_arith, mx) DEF_SVE_FUNCTION 
(svmaxnmp, binary, all_float, mx) +DEF_SVE_FUNCTION (svmaxp, binary, all_arith, mx) +DEF_SVE_FUNCTION (svminnmp, binary, all_float, mx) +DEF_SVE_FUNCTION (svminp, binary, all_arith, mx) DEF_SVE_FUNCTION (svmla_lane, ternary_lane, hsd_integer, none) DEF_SVE_FUNCTION (svmlalb, ternary_long_opt_n, s_float_hsd_integer, none) DEF_SVE_FUNCTION (svmlalb_lane, ternary_long_lane, s_float_sd_integer, none) @@ -66,8 +68,6 @@ DEF_SVE_FUNCTION (svmlslb, ternary_long_opt_n, s_float_hsd_integer, none) DEF_SVE_FUNCTION (svmlslb_lane, ternary_long_lane, s_float_sd_integer, none) DEF_SVE_FUNCTION (svmlslt, ternary_long_opt_n, s_float_hsd_integer, none) DEF_SVE_FUNCTION (svmlslt_lane, ternary_long_lane, s_float_sd_integer, none) -DEF_SVE_FUNCTION (svminp, binary, all_arith, mx) -DEF_SVE_FUNCTION (svminnmp, binary, all_float, mx) DEF_SVE_FUNCTION (svmovlb, unary_long, hsd_integer, none) DEF_SVE_FUNCTION (svmovlt, unary_long, hsd_integer, none) DEF_SVE_FUNCTION (svmul_lane, binary_lane, hsd_integer, none) @@ -101,14 +101,15 @@ DEF_SVE_FUNCTION (svqdmullb_lane, binary_long_lane, sd_signed, none) DEF_SVE_FUNCTION (svqdmullt, binary_long_opt_n, hsd_signed, none) DEF_SVE_FUNCTION (svqdmullt_lane, binary_long_lane, sd_signed, none) DEF_SVE_FUNCTION (svqneg, unary, all_signed, mxz) -DEF_SVE_FUNCTION (svqrdmulh, binary_opt_n, all_signed, none) -DEF_SVE_FUNCTION (svqrdmulh_lane, binary_lane, hsd_signed, none) +DEF_SVE_FUNCTION (svqrdcmlah, ternary_rotate, all_signed, none) +DEF_SVE_FUNCTION (svqrdcmlah_lane, ternary_lane_rotate, hs_signed, none) DEF_SVE_FUNCTION (svqrdmlah, ternary_opt_n, all_signed, none) DEF_SVE_FUNCTION (svqrdmlah_lane, ternary_lane, hsd_signed, none) DEF_SVE_FUNCTION (svqrdmlsh, ternary_opt_n, all_signed, none) DEF_SVE_FUNCTION (svqrdmlsh_lane, ternary_lane, hsd_signed, none) -DEF_SVE_FUNCTION (svqrdcmlah, ternary_rotate, all_signed, none) -DEF_SVE_FUNCTION (svqrdcmlah_lane, ternary_lane_rotate, hs_signed, none) +DEF_SVE_FUNCTION (svqrdmulh, binary_opt_n, all_signed, none) +DEF_SVE_FUNCTION (svqrdmulh_lane, binary_lane, hsd_signed, none) +DEF_SVE_FUNCTION (svqrshl, binary_int_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svqrshrnb, shift_right_imm_narrowb, hsd_integer, none) DEF_SVE_FUNCTION (svqrshrnt, shift_right_imm_narrowt, hsd_integer, none) DEF_SVE_FUNCTION (svqrshrunb, shift_right_imm_narrowb_to_uint, hsd_signed, none) @@ -119,7 +120,6 @@ DEF_SVE_FUNCTION (svqshrnb, shift_right_imm_narrowb, hsd_integer, none) DEF_SVE_FUNCTION (svqshrnt, shift_right_imm_narrowt, hsd_integer, none) DEF_SVE_FUNCTION (svqshrunb, shift_right_imm_narrowb_to_uint, hsd_signed, none) DEF_SVE_FUNCTION (svqshrunt, shift_right_imm_narrowt_to_uint, hsd_signed, none) -DEF_SVE_FUNCTION (svqrshl, binary_int_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svqsub, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svqsubr, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svqxtnb, unary_narrowb, hsd_integer, none) @@ -130,11 +130,11 @@ DEF_SVE_FUNCTION (svraddhnb, binary_narrowb_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svraddhnt, binary_narrowt_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svrecpe, unary, s_unsigned, mxz) DEF_SVE_FUNCTION (svrhadd, binary_opt_n, all_integer, mxz) -DEF_SVE_FUNCTION (svrsqrte, unary, s_unsigned, mxz) DEF_SVE_FUNCTION (svrshl, binary_int_opt_single_n, all_integer, mxz) DEF_SVE_FUNCTION (svrshr, shift_right_imm, all_integer, mxz) DEF_SVE_FUNCTION (svrshrnb, shift_right_imm_narrowb, hsd_integer, none) DEF_SVE_FUNCTION (svrshrnt, shift_right_imm_narrowt, hsd_integer, none) +DEF_SVE_FUNCTION (svrsqrte, 
unary, s_unsigned, mxz) DEF_SVE_FUNCTION (svrsra, ternary_shift_right_imm, all_integer, none) DEF_SVE_FUNCTION (svrsubhnb, binary_narrowb_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svrsubhnt, binary_narrowt_opt_n, hsd_integer, none) @@ -172,15 +172,15 @@ DEF_SVE_FUNCTION (svhistseg, binary_to_uint, b_integer, none) DEF_SVE_FUNCTION (svldnt1_gather, load_gather_sv_restricted, sd_data, implicit) DEF_SVE_FUNCTION (svldnt1_gather, load_gather_vs, sd_data, implicit) DEF_SVE_FUNCTION (svldnt1sb_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_offset_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_offset_restricted, d_integer, implicit) DEF_SVE_FUNCTION (svldnt1ub_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_offset_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_offset_restricted, d_integer, implicit) DEF_SVE_FUNCTION (svmatch, compare, bh_integer, implicit) DEF_SVE_FUNCTION (svnmatch, compare, bh_integer, implicit) DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_index_restricted, sd_data, implicit) @@ -196,8 +196,8 @@ DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_offset_restricted, d_integer, | AARCH64_FL_SVE2_AES) DEF_SVE_FUNCTION (svaesd, binary, b_unsigned, none) DEF_SVE_FUNCTION (svaese, binary, b_unsigned, none) -DEF_SVE_FUNCTION (svaesmc, unary, b_unsigned, none) DEF_SVE_FUNCTION (svaesimc, unary, b_unsigned, none) +DEF_SVE_FUNCTION (svaesmc, unary, b_unsigned, none) DEF_SVE_FUNCTION (svpmullb_pair, binary_opt_n, d_unsigned, none) DEF_SVE_FUNCTION (svpmullt_pair, binary_opt_n, d_unsigned, none) #undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h index 013a9dfc5fa..d58190280a8 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h @@ -80,8 +80,10 @@ namespace aarch64_sve extern const function_base *const svldnt1uw_gather; extern const function_base *const svlogb; extern const function_base *const svmatch; - extern const function_base *const svmaxp; extern const function_base *const svmaxnmp; + extern const function_base *const svmaxp; + extern const function_base *const svminnmp; + extern const function_base *const svminp; extern const function_base *const svmlalb; extern const function_base *const svmlalb_lane; extern const function_base *const svmlalt; @@ -90,8 +92,6 @@ namespace aarch64_sve extern const function_base *const svmlslb_lane; extern const function_base *const svmlslt; extern const function_base *const svmlslt_lane; - extern const function_base *const svminp; - extern const function_base 
*const svminnmp;
  extern const function_base *const svmovlb;
  extern const function_base *const svmovlt;
  extern const function_base *const svmullb;
@@ -130,12 +130,12 @@ namespace aarch64_sve
  extern const function_base *const svqneg;
  extern const function_base *const svqrdcmlah;
  extern const function_base *const svqrdcmlah_lane;
-  extern const function_base *const svqrdmulh;
-  extern const function_base *const svqrdmulh_lane;
  extern const function_base *const svqrdmlah;
  extern const function_base *const svqrdmlah_lane;
  extern const function_base *const svqrdmlsh;
  extern const function_base *const svqrdmlsh_lane;
+  extern const function_base *const svqrdmulh;
+  extern const function_base *const svqrdmulh_lane;
  extern const function_base *const svqrshl;
  extern const function_base *const svqrshr;
  extern const function_base *const svqrshrn;
@@ -198,13 +198,13 @@ namespace aarch64_sve
  extern const function_base *const svuqadd;
  extern const function_base *const svuzp;
  extern const function_base *const svuzpq;
-  extern const function_base *const svzip;
-  extern const function_base *const svzipq;
  extern const function_base *const svwhilege;
  extern const function_base *const svwhilegt;
  extern const function_base *const svwhilerw;
  extern const function_base *const svwhilewr;
  extern const function_base *const svxar;
+  extern const function_base *const svzip;
+  extern const function_base *const svzipq;
  }
}

From patchwork Wed Nov 6 18:20:46 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007655
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 10/15] aarch64: Add svboolx4_t
Date: Wed, 06 Nov 2024 18:20:46 +0000

This patch adds an svboolx4_t type, to go alongside the existing
svboolx2_t type.  It doesn't require any special ISA support beyond
SVE itself, and it currently has no associated instructions.

gcc/
	* config/aarch64/aarch64-modes.def (VNx64BI): New mode.
	* config/aarch64/aarch64-protos.h (aarch64_split_double_move):
	Generalize to...
	(aarch64_split_move): ...this.
	* config/aarch64/aarch64-sve-builtins-base.def (svcreate4, svget4)
	(svset4, svundef4): Add bool variants.
	* config/aarch64/aarch64-sve-builtins.cc (handle_arm_sve_h): Add
	svboolx4_t.
	* config/aarch64/iterators.md (SVE_STRUCT_BI): New mode iterator.
	* config/aarch64/aarch64-sve.md (movvnx32bi): Generalize to...
	(mov<mode>): ...this.
	* config/aarch64/aarch64.cc (pure_scalable_type_info::piece::get_rtx):
	Allow num_prs to be 4.
	(aarch64_classify_vector_mode): Handle VNx64BI.
	(aarch64_hard_regno_nregs): Likewise.
	(aarch64_class_max_nregs): Likewise.
	(aarch64_array_mode): Use VNx64BI for arrays of 4 svbool_ts.
	(aarch64_split_double_move): Generalize to...
	(aarch64_split_move): ...this.
	(aarch64_split_128bit_move): Update call accordingly.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general-c/create_5.c: Expect svcreate4
	to succeed for svbool_ts.
	* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h (TEST_UNDEF_B): New
	macro.
	* gcc.target/aarch64/sve/acle/asm/create4_1.c: Test _b form.
	* gcc.target/aarch64/sve/acle/asm/undef2_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/undef4_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/get4_b.c: New test.
	* gcc.target/aarch64/sve/acle/asm/set4_b.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/svboolx4_1.c: Likewise.
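[Editorial note: as a usage sketch, not part of the patch (the function and
variable names here are illustrative only), the new tuple type combines with
the svcreate4/svset4/svget4 forms that this patch enables for predicates:

#include <arm_sve.h>

/* Pack four predicates into an svboolx4_t, replace element 2, and read
   it back.  The element index must be a constant in the range [0, 3].  */
svbool_t
shuffle_preds (svbool_t a, svbool_t b, svbool_t c, svbool_t d)
{
  svboolx4_t t = svcreate4_b (a, b, c, d);
  t = svset4_b (t, 2, d);
  return svget4_b (t, 2);
}

As the new svboolx4_1.c test below suggests, such a tuple lives in four
consecutive predicate registers, so getting or setting one element should
cost at most a predicate move.]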
--- gcc/config/aarch64/aarch64-modes.def | 3 + gcc/config/aarch64/aarch64-protos.h | 2 +- .../aarch64/aarch64-sve-builtins-base.def | 4 + gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +- gcc/config/aarch64/aarch64-sve.md | 8 +- gcc/config/aarch64/aarch64.cc | 50 ++++---- gcc/config/aarch64/iterators.md | 2 + .../aarch64/sve/acle/asm/create4_1.c | 10 ++ .../gcc.target/aarch64/sve/acle/asm/get4_b.c | 73 +++++++++++ .../gcc.target/aarch64/sve/acle/asm/set4_b.c | 87 +++++++++++++ .../aarch64/sve/acle/asm/test_sve_acle.h | 8 ++ .../aarch64/sve/acle/asm/undef2_1.c | 7 ++ .../aarch64/sve/acle/asm/undef4_1.c | 7 ++ .../aarch64/sve/acle/general-c/create_5.c | 2 +- .../aarch64/sve/acle/general-c/svboolx4_1.c | 117 ++++++++++++++++++ 15 files changed, 351 insertions(+), 31 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_b.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_b.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svboolx4_1.c diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def index 25a22c1195e..813421e1e39 100644 --- a/gcc/config/aarch64/aarch64-modes.def +++ b/gcc/config/aarch64/aarch64-modes.def @@ -48,18 +48,21 @@ ADJUST_FLOAT_FORMAT (HF, &ieee_half_format); /* Vector modes. */ +VECTOR_BOOL_MODE (VNx64BI, 64, BI, 8); VECTOR_BOOL_MODE (VNx32BI, 32, BI, 4); VECTOR_BOOL_MODE (VNx16BI, 16, BI, 2); VECTOR_BOOL_MODE (VNx8BI, 8, BI, 2); VECTOR_BOOL_MODE (VNx4BI, 4, BI, 2); VECTOR_BOOL_MODE (VNx2BI, 2, BI, 2); +ADJUST_NUNITS (VNx64BI, aarch64_sve_vg * 32); ADJUST_NUNITS (VNx32BI, aarch64_sve_vg * 16); ADJUST_NUNITS (VNx16BI, aarch64_sve_vg * 8); ADJUST_NUNITS (VNx8BI, aarch64_sve_vg * 4); ADJUST_NUNITS (VNx4BI, aarch64_sve_vg * 2); ADJUST_NUNITS (VNx2BI, aarch64_sve_vg); +ADJUST_ALIGNMENT (VNx64BI, 2); ADJUST_ALIGNMENT (VNx32BI, 2); ADJUST_ALIGNMENT (VNx16BI, 2); ADJUST_ALIGNMENT (VNx8BI, 2); diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index e8588e1cb17..660e335bf34 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -1045,7 +1045,7 @@ rtx aarch64_simd_expand_builtin (int, tree, rtx); void aarch64_simd_lane_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT, const_tree); rtx aarch64_endian_lane_rtx (machine_mode, unsigned int); -void aarch64_split_double_move (rtx, rtx, machine_mode); +void aarch64_split_move (rtx, rtx, machine_mode); void aarch64_split_128bit_move (rtx, rtx); bool aarch64_split_128bit_move_p (rtx, rtx); diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def index da2a0e41aa5..0353f56e705 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def @@ -74,6 +74,7 @@ DEF_SVE_FUNCTION (svcreate2, create, all_data, none) DEF_SVE_FUNCTION (svcreate2, create, b, none) DEF_SVE_FUNCTION (svcreate3, create, all_data, none) DEF_SVE_FUNCTION (svcreate4, create, all_data, none) +DEF_SVE_FUNCTION (svcreate4, create, b, none) DEF_SVE_FUNCTION (svcvt, unary_convertxn, cvt, mxz) DEF_SVE_FUNCTION (svdiv, binary_opt_n, all_float_and_sd_integer, mxz) DEF_SVE_FUNCTION (svdivr, binary_opt_n, all_float_and_sd_integer, mxz) @@ -96,6 +97,7 @@ DEF_SVE_FUNCTION (svget2, get, all_data, none) DEF_SVE_FUNCTION (svget2, get, b, none) DEF_SVE_FUNCTION (svget3, get, all_data, none) DEF_SVE_FUNCTION (svget4, get, all_data, none) +DEF_SVE_FUNCTION (svget4, get, b, none) DEF_SVE_FUNCTION (svindex, 
binary_scalar, all_integer, none) DEF_SVE_FUNCTION (svinsr, binary_n, all_data, none) DEF_SVE_FUNCTION (svlasta, reduction, all_data, implicit) @@ -223,6 +225,7 @@ DEF_SVE_FUNCTION (svset2, set, all_data, none) DEF_SVE_FUNCTION (svset2, set, b, none) DEF_SVE_FUNCTION (svset3, set, all_data, none) DEF_SVE_FUNCTION (svset4, set, all_data, none) +DEF_SVE_FUNCTION (svset4, set, b, none) DEF_SVE_FUNCTION (svsplice, binary, all_data, implicit) DEF_SVE_FUNCTION (svsqrt, unary, all_float, mxz) DEF_SVE_FUNCTION (svst1, storexn, all_data, implicit) @@ -245,6 +248,7 @@ DEF_SVE_FUNCTION (svundef2, inherent, all_data, none) DEF_SVE_FUNCTION (svundef2, inherent, b, none) DEF_SVE_FUNCTION (svundef3, inherent, all_data, none) DEF_SVE_FUNCTION (svundef4, inherent, all_data, none) +DEF_SVE_FUNCTION (svundef4, inherent, b, none) DEF_SVE_FUNCTION (svunpkhi, unary_widen, hsd_integer, none) DEF_SVE_FUNCTION (svunpkhi, unary_widen, b, none) DEF_SVE_FUNCTION (svunpklo, unary_widen, hsd_integer, none) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 9fb0d6fd416..259e7b7975c 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -4697,7 +4697,7 @@ handle_arm_sve_h (bool function_nulls_p) register_vector_type (type); if (type != VECTOR_TYPE_svcount_t) for (unsigned int count = 2; count <= MAX_TUPLE_SIZE; ++count) - if (type != VECTOR_TYPE_svbool_t || count == 2) + if (type != VECTOR_TYPE_svbool_t || count == 2 || count == 4) register_tuple_type (count, type); } diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 0955a697680..3d92a2a454f 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -1074,9 +1074,9 @@ (define_insn_and_rewrite "*aarch64_sve_ptrue_ptest" ;; ---- Moves of multiple predicates ;; ------------------------------------------------------------------------- -(define_insn_and_split "movvnx32bi" - [(set (match_operand:VNx32BI 0 "nonimmediate_operand") - (match_operand:VNx32BI 1 "aarch64_mov_operand"))] +(define_insn_and_split "mov" + [(set (match_operand:SVE_STRUCT_BI 0 "nonimmediate_operand") + (match_operand:SVE_STRUCT_BI 1 "aarch64_mov_operand"))] "TARGET_SVE" {@ [ cons: =0 , 1 ] [ Upa , Upa ] # @@ -1086,7 +1086,7 @@ (define_insn_and_split "movvnx32bi" "&& reload_completed" [(const_int 0)] { - aarch64_split_double_move (operands[0], operands[1], VNx16BImode); + aarch64_split_move (operands[0], operands[1], VNx16BImode); DONE; } ) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 9347e06f0e9..e306f86f514 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -969,7 +969,7 @@ pure_scalable_type_info::piece::get_rtx (unsigned int first_zr, if (num_zr > 0 && num_pr == 0) return gen_rtx_REG (mode, first_zr); - if (num_zr == 0 && num_pr <= 2) + if (num_zr == 0 && num_pr > 0) return gen_rtx_REG (mode, first_pr); gcc_unreachable (); @@ -1684,6 +1684,7 @@ aarch64_classify_vector_mode (machine_mode mode, bool any_target_p = false) return (TARGET_FLOAT || any_target_p) ? VEC_ADVSIMD : 0; case E_VNx32BImode: + case E_VNx64BImode: return TARGET_SVE ? VEC_SVE_PRED | VEC_STRUCT : 0; default: @@ -1815,13 +1816,15 @@ aarch64_array_mode (machine_mode mode, unsigned HOST_WIDE_INT nelems) { if (TARGET_SVE && GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) { - /* Use VNx32BI for pairs of predicates, but explicitly reject giving - a mode to other array sizes. 
Using integer modes requires a round
-	 trip through memory and generates terrible code.  */
+      /* Use VNx32BI and VNx64BI for tuples of predicates, but explicitly
+	 reject giving a mode to other array sizes.  Using integer modes
+	 requires a round trip through memory and generates terrible code.  */
       if (nelems == 1)
 	return mode;
       if (mode == VNx16BImode && nelems == 2)
 	return VNx32BImode;
+      if (mode == VNx16BImode && nelems == 4)
+	return VNx64BImode;
       return BLKmode;
     }
 
@@ -2094,7 +2097,7 @@ aarch64_hard_regno_nregs (unsigned regno, machine_mode mode)
     case PR_REGS:
     case PR_LO_REGS:
     case PR_HI_REGS:
-      return mode == VNx32BImode ? 2 : 1;
+      return mode == VNx64BImode ? 4 : mode == VNx32BImode ? 2 : 1;
 
     case MOVEABLE_SYSREGS:
     case FFR_REGS:
@@ -3270,31 +3273,30 @@ aarch64_emit_binop (rtx dest, optab binoptab, rtx op0, rtx op1)
   emit_move_insn (dest, tmp);
 }
 
-/* Split a move from SRC to DST into two moves of mode SINGLE_MODE.  */
+/* Split a move from SRC to DST into multiple moves of mode SINGLE_MODE.  */
 
 void
-aarch64_split_double_move (rtx dst, rtx src, machine_mode single_mode)
+aarch64_split_move (rtx dst, rtx src, machine_mode single_mode)
 {
   machine_mode mode = GET_MODE (dst);
+  auto npieces = exact_div (GET_MODE_SIZE (mode),
+			    GET_MODE_SIZE (single_mode)).to_constant ();
+  auto_vec<rtx, 4> dst_pieces, src_pieces;
 
-  rtx dst0 = simplify_gen_subreg (single_mode, dst, mode, 0);
-  rtx dst1 = simplify_gen_subreg (single_mode, dst, mode,
-				  GET_MODE_SIZE (single_mode));
-  rtx src0 = simplify_gen_subreg (single_mode, src, mode, 0);
-  rtx src1 = simplify_gen_subreg (single_mode, src, mode,
-				  GET_MODE_SIZE (single_mode));
-
-  /* At most one pairing may overlap.  */
-  if (reg_overlap_mentioned_p (dst0, src1))
+  for (unsigned int i = 0; i < npieces; ++i)
     {
-      aarch64_emit_move (dst1, src1);
-      aarch64_emit_move (dst0, src0);
+      auto off = i * GET_MODE_SIZE (single_mode);
+      dst_pieces.safe_push (simplify_gen_subreg (single_mode, dst, mode, off));
+      src_pieces.safe_push (simplify_gen_subreg (single_mode, src, mode, off));
     }
+
+  /* At most one pairing may overlap.  */
+  if (reg_overlap_mentioned_p (dst_pieces[0], src))
+    for (unsigned int i = npieces; i-- > 0;)
+      aarch64_emit_move (dst_pieces[i], src_pieces[i]);
   else
-    {
-      aarch64_emit_move (dst0, src0);
-      aarch64_emit_move (dst1, src1);
-    }
+    for (unsigned int i = 0; i < npieces; ++i)
+      aarch64_emit_move (dst_pieces[i], src_pieces[i]);
 }
 
 /* Split a 128-bit move operation into two 64-bit move operations,
@@ -3338,7 +3340,7 @@ aarch64_split_128bit_move (rtx dst, rtx src)
 	}
     }
 
-  aarch64_split_double_move (dst, src, word_mode);
+  aarch64_split_move (dst, src, word_mode);
 }
 
 /* Return true if we should split a move from 128-bit value SRC
@@ -13172,7 +13174,7 @@ aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode)
     case PR_REGS:
     case PR_LO_REGS:
     case PR_HI_REGS:
-      return mode == VNx32BImode ? 2 : 1;
+      return mode == VNx64BImode ? 4 : mode == VNx32BImode ? 2 : 1;
 
     case MOVEABLE_SYSREGS:
     case STACK_REG:
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 4942631aa95..b8924cdc74b 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -556,6 +556,8 @@ (define_mode_iterator SVE_FULLx24 [SVE_FULLx2 SVE_FULLx4])
 ;; All SVE vector structure modes.
 (define_mode_iterator SVE_STRUCT [SVE_FULLx2 SVE_FULLx3 SVE_FULLx4])
 
+(define_mode_iterator SVE_STRUCT_BI [VNx32BI VNx64BI])
+
 ;; All SVE vector and structure modes.
(define_mode_iterator SVE_ALL_STRUCT [SVE_ALL SVE_STRUCT]) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c index b5ffd4e6aaf..1d2ff4e871d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c @@ -145,3 +145,13 @@ TEST_CREATE (create4_u64, svuint64x4_t, svuint64_t, TEST_CREATE (create4_f64, svfloat64x4_t, svfloat64_t, z0 = svcreate4_f64 (z5, z4, z7, z6), z0 = svcreate4 (z5, z4, z7, z6)) + +/* This is awkward to code-generate, so don't match a particular output. */ +TEST_CREATE_B (create4_b_0, svboolx4_t, + p0_res = svcreate4_b (p0, p1, p2, p3), + p0_res = svcreate4 (p0, p1, p2, p3)) + +/* This is awkward to code-generate, so don't match a particular output. */ +TEST_CREATE_B (create4_b_1, svboolx4_t, + p0_res = svcreate4_b (p3, p2, p1, p0), + p0_res = svcreate4 (p3, p2, p1, p0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_b.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_b.c new file mode 100644 index 00000000000..146253aac3b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_b.c @@ -0,0 +1,73 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** get4_b_p0_0: +** mov p0\.b, p4\.b +** ret +*/ +TEST_GET_B (get4_b_p0_0, svboolx4_t, + p0 = svget4_b (p4, 0), + p0 = svget4 (p4, 0)) + +/* +** get4_b_p0_1: +** mov p0\.b, p5\.b +** ret +*/ +TEST_GET_B (get4_b_p0_1, svboolx4_t, + p0 = svget4_b (p4, 1), + p0 = svget4 (p4, 1)) + +/* +** get4_b_p0_2: +** mov p0\.b, p6\.b +** ret +*/ +TEST_GET_B (get4_b_p0_2, svboolx4_t, + p0 = svget4_b (p4, 2), + p0 = svget4 (p4, 2)) + +/* +** get4_b_p0_3: +** mov p0\.b, p7\.b +** ret +*/ +TEST_GET_B (get4_b_p0_3, svboolx4_t, + p0 = svget4_b (p4, 3), + p0 = svget4 (p4, 3)) + +/* +** get4_b_p4_0: +** ret +*/ +TEST_GET_B (get4_b_p4_0, svboolx4_t, + p4_res = svget4_b (p4, 0), + p4_res = svget4 (p4, 0)) + +/* +** get4_b_p4_3: +** mov p4\.b, p7\.b +** ret +*/ +TEST_GET_B (get4_b_p4_3, svboolx4_t, + p4_res = svget4_b (p4, 3), + p4_res = svget4 (p4, 3)) + +/* +** get4_b_p5_0: +** mov p5\.b, p4\.b +** ret +*/ +TEST_GET_B (get4_b_p5_0, svboolx4_t, + p5_res = svget4_b (p4, 0), + p5_res = svget4 (p4, 0)) + +/* +** get4_b_p5_1: +** ret +*/ +TEST_GET_B (get4_b_p5_1, svboolx4_t, + p5_res = svget4_b (p4, 1), + p5_res = svget4 (p4, 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_b.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_b.c new file mode 100644 index 00000000000..13efdf9bc2e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_b.c @@ -0,0 +1,87 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** set4_b_p8_0: +** mov [^\n]+ +** mov [^\n]+ +** mov [^\n]+ +** mov p8\.b, p0\.b +** ret +*/ +TEST_SET_B (set4_b_p8_0, svboolx4_t, + p8 = svset4_b (p4, 0, p0), + p8 = svset4 (p4, 0, p0)) + +/* +** set4_b_p8_1: +** mov [^\n]+ +** mov [^\n]+ +** mov [^\n]+ +** mov p9\.b, p0\.b +** ret +*/ +TEST_SET_B (set4_b_p8_1, svboolx4_t, + p8 = svset4_b (p4, 1, p0), + p8 = svset4 (p4, 1, p0)) + +/* +** set4_b_p8_2: +** mov [^\n]+ +** mov [^\n]+ +** mov [^\n]+ +** mov p10\.b, p0\.b +** ret +*/ +TEST_SET_B (set4_b_p8_2, svboolx4_t, + p8 = svset4_b (p4, 2, p0), + p8 = svset4 (p4, 2, p0)) + +/* +** set4_b_p8_3: +** mov [^\n]+ +** mov [^\n]+ +** mov [^\n]+ +** mov p11\.b, p0\.b +** ret +*/ +TEST_SET_B (set4_b_p8_3, svboolx4_t, + p8 = svset4_b 
(p4, 3, p0), + p8 = svset4 (p4, 3, p0)) + +/* +** set4_b_p4_0: +** mov p4\.b, p12\.b +** ret +*/ +TEST_SET_B (set4_b_p4_0, svboolx4_t, + p4 = svset4_b (p4, 0, p12), + p4 = svset4 (p4, 0, p12)) + +/* +** set4_b_p4_1: +** mov p5\.b, p13\.b +** ret +*/ +TEST_SET_B (set4_b_p4_1, svboolx4_t, + p4 = svset4_b (p4, 1, p13), + p4 = svset4 (p4, 1, p13)) + +/* +** set4_b_p4_2: +** mov p6\.b, p12\.b +** ret +*/ +TEST_SET_B (set4_b_p4_2, svboolx4_t, + p4 = svset4_b (p4, 2, p12), + p4 = svset4 (p4, 2, p12)) + +/* +** set4_b_p4_3: +** mov p7\.b, p13\.b +** ret +*/ +TEST_SET_B (set4_b_p4_3, svboolx4_t, + p4 = svset4_b (p4, 3, p13), + p4 = svset4 (p4, 3, p13)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h index 367024be863..6c966a188de 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h @@ -543,6 +543,14 @@ return z0; \ } +#define TEST_UNDEF_B(NAME, TYPE, CODE) \ + PROTO (NAME, TYPE, (void)) \ + { \ + TYPE p0; \ + CODE; \ + return p0; \ + } + #define TEST_CREATE(NAME, TTYPE, ZTYPE, CODE1, CODE2) \ PROTO (NAME, TTYPE, (ZTYPE unused0, ZTYPE unused1, \ ZTYPE unused2, ZTYPE unused3, \ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c index fe6c4c7c7d5..2c520df99a3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c @@ -85,3 +85,10 @@ TEST_UNDEF (uint64, svuint64x2_t, */ TEST_UNDEF (float64, svfloat64x2_t, z0 = svundef2_f64 ()) + +/* +** bools: +** ret +*/ +TEST_UNDEF_B (bools, svboolx2_t, + p0 = svundef2_b ()) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c index 4d6b86b04b5..9bda4d66e89 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c @@ -85,3 +85,10 @@ TEST_UNDEF (uint64, svuint64x4_t, */ TEST_UNDEF (float64, svfloat64x4_t, z0 = svundef4_f64 ()) + +/* +** bools: +** ret +*/ +TEST_UNDEF_B (bools, svboolx4_t, + p0 = svundef4_b ()) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c index bf3dd5d7514..687327d7173 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c @@ -17,7 +17,7 @@ f1 (svint32x4_t *ptr, svbool_t pg, svint32_t s32, svfloat64_t f64, *ptr = svcreate4 (s32, x, s32, s32); /* { dg-error {passing 'int' to argument 2 of 'svcreate4', which expects an SVE type rather than a scalar} } */ *ptr = svcreate4 (x, s32, s32, s32); /* { dg-error {passing 'int' to argument 1 of 'svcreate4', which expects an SVE type rather than a scalar} } */ *ptr = svcreate4 (pg, s32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svcreate4', but argument 1 had type 'svbool_t'} } */ - *ptr = svcreate4 (pg, pg, pg, pg); /* { dg-error {'svcreate4' has no form that takes 'svbool_t' arguments} } */ + *ptr = svcreate4 (pg, pg, pg, pg); /* { dg-error {incompatible types when assigning to type 'svint32x4_t' from type 'svboolx4_t'} } */ *ptr = svcreate4 (s32, s32, s32, s32); *ptr = svcreate4 (f64, f64, f64, f64); /* { dg-error {incompatible types when assigning to type 'svint32x4_t' from type 'svfloat64x4_t'} } */ } diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svboolx4_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svboolx4_1.c
new file mode 100644
index 00000000000..498c0fa40a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svboolx4_1.c
@@ -0,0 +1,117 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** ret_p0:
+**	ret
+*/
+svboolx4_t
+ret_p0 (svboolx4_t p0)
+{
+  return p0;
+}
+
+/*
+** ret_p1:
+**	addvl	sp, sp, #-1
+**	str	p4, \[sp\]
+**	mov	p0\.b, p1\.b
+**	mov	p1\.b, p2\.b
+**	mov	p2\.b, p3\.b
+**	mov	p3\.b, p4\.b
+**	ldr	p4, \[sp\]
+**	addvl	sp, sp, #1
+**	ret
+*/
+svboolx4_t
+ret_p1 (void)
+{
+  register svboolx4_t p1 asm ("p1");
+  asm volatile ("" : "=Upa" (p1));
+  return p1;
+}
+
+/*
+** ret_mem:
+** (
+**	ldr	p0, \[x0\]
+**	ldr	p1, \[x0, #1, mul vl\]
+**	ldr	p2, \[x0, #2, mul vl\]
+**	ldr	p3, \[x0, #3, mul vl\]
+** |
+**	ldr	p3, \[x0, #3, mul vl\]
+**	ldr	p2, \[x0, #2, mul vl\]
+**	ldr	p1, \[x0, #1, mul vl\]
+**	ldr	p0, \[x0\]
+** )
+**	ret
+*/
+svboolx4_t
+ret_mem (svboolx4_t p0, svboolx4_t mem)
+{
+  return mem;
+}
+
+/*
+** load:
+** (
+**	ldr	p0, \[x0\]
+**	ldr	p1, \[x0, #1, mul vl\]
+**	ldr	p2, \[x0, #2, mul vl\]
+**	ldr	p3, \[x0, #3, mul vl\]
+** |
+**	ldr	p3, \[x0, #3, mul vl\]
+**	ldr	p2, \[x0, #2, mul vl\]
+**	ldr	p1, \[x0, #1, mul vl\]
+**	ldr	p0, \[x0\]
+** )
+**	ret
+*/
+svboolx4_t
+load (svboolx4_t *ptr)
+{
+  return *ptr;
+}
+
+/*
+** store:
+** (
+**	str	p0, \[x0\]
+**	str	p1, \[x0, #1, mul vl\]
+**	str	p2, \[x0, #2, mul vl\]
+**	str	p3, \[x0, #3, mul vl\]
+** |
+**	str	p3, \[x0, #3, mul vl\]
+**	str	p2, \[x0, #2, mul vl\]
+**	str	p1, \[x0, #1, mul vl\]
+**	str	p0, \[x0\]
+** )
+**	ret
+*/
+void
+store (svboolx4_t p0, svboolx4_t *ptr)
+{
+  *ptr = p0;
+}
+
+/*
+** p0_to_p1:
+**	addvl	sp, sp, #-1
+**	str	p4, \[sp\]
+**	mov	p4\.b, p3\.b
+**	mov	p3\.b, p2\.b
+**	mov	p2\.b, p1\.b
+**	mov	p1\.b, p0\.b
+**	ldr	p4, \[sp\]
+**	addvl	sp, sp, #1
+**	ret
+*/
+void
+p0_to_p1 (svboolx4_t p0)
+{
+  register svboolx4_t p1 asm ("p1") = p0;
+  asm volatile ("" :: "Upa" (p1));
+}

From patchwork Wed Nov 6 18:21:04 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007656
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 11/15] aarch64: Define arm_neon.h types in arm_sve.h too
Date: Wed, 06 Nov 2024 18:21:04 +0000

This patch moves the scalar and single-vector Advanced SIMD types from
arm_neon.h into a private header, so that they can be defined by
arm_sve.h as well.  This is needed for the upcoming SVE2.1 hybrid-VLA
reductions, which return 128-bit Advanced SIMD vectors.

The approach follows Claudio's patch for FP8.

gcc/
	* config.gcc (extra_headers): Add arm_private_neon_types.h.
	* config/aarch64/arm_private_neon_types.h: New file, split out
	from...
	* config/aarch64/arm_neon.h: ...here.
* config/aarch64/arm_sve.h: Include arm_private_neon_types.h --- gcc/config.gcc | 2 +- gcc/config/aarch64/arm_neon.h | 49 +------------ gcc/config/aarch64/arm_private_neon_types.h | 79 +++++++++++++++++++++ gcc/config/aarch64/arm_sve.h | 5 +- 4 files changed, 84 insertions(+), 51 deletions(-) create mode 100644 gcc/config/aarch64/arm_private_neon_types.h diff --git a/gcc/config.gcc b/gcc/config.gcc index 1b0637d7ff8..7e0108e2154 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -347,7 +347,7 @@ m32c*-*-*) ;; aarch64*-*-*) cpu_type=aarch64 - extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h arm_sme.h arm_neon_sve_bridge.h arm_private_fp8.h" + extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h arm_sme.h arm_neon_sve_bridge.h arm_private_fp8.h arm_private_neon_types.h" c_target_objs="aarch64-c.o" cxx_target_objs="aarch64-c.o" d_target_objs="aarch64-d.o" diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index d3533f3ee6f..c727302ac75 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -30,58 +30,15 @@ #pragma GCC push_options #pragma GCC target ("+nothing+simd") +#include #include -#pragma GCC aarch64 "arm_neon.h" +#include -#include +#pragma GCC aarch64 "arm_neon.h" #define __AARCH64_UINT64_C(__C) ((uint64_t) __C) #define __AARCH64_INT64_C(__C) ((int64_t) __C) -typedef __Int8x8_t int8x8_t; -typedef __Int16x4_t int16x4_t; -typedef __Int32x2_t int32x2_t; -typedef __Int64x1_t int64x1_t; -typedef __Float16x4_t float16x4_t; -typedef __Float32x2_t float32x2_t; -typedef __Poly8x8_t poly8x8_t; -typedef __Poly16x4_t poly16x4_t; -typedef __Uint8x8_t uint8x8_t; -typedef __Uint16x4_t uint16x4_t; -typedef __Uint32x2_t uint32x2_t; -typedef __Float64x1_t float64x1_t; -typedef __Uint64x1_t uint64x1_t; -typedef __Int8x16_t int8x16_t; -typedef __Int16x8_t int16x8_t; -typedef __Int32x4_t int32x4_t; -typedef __Int64x2_t int64x2_t; -typedef __Float16x8_t float16x8_t; -typedef __Float32x4_t float32x4_t; -typedef __Float64x2_t float64x2_t; -typedef __Poly8x16_t poly8x16_t; -typedef __Poly16x8_t poly16x8_t; -typedef __Poly64x2_t poly64x2_t; -typedef __Poly64x1_t poly64x1_t; -typedef __Uint8x16_t uint8x16_t; -typedef __Uint16x8_t uint16x8_t; -typedef __Uint32x4_t uint32x4_t; -typedef __Uint64x2_t uint64x2_t; - -typedef __Poly8_t poly8_t; -typedef __Poly16_t poly16_t; -typedef __Poly64_t poly64_t; -typedef __Poly128_t poly128_t; - -typedef __Mfloat8x8_t mfloat8x8_t; -typedef __Mfloat8x16_t mfloat8x16_t; - -typedef __fp16 float16_t; -typedef float float32_t; -typedef double float64_t; - -typedef __Bfloat16x4_t bfloat16x4_t; -typedef __Bfloat16x8_t bfloat16x8_t; - /* __aarch64_vdup_lane internal macros. */ #define __aarch64_vdup_lane_any(__size, __q, __a, __b) \ vdup##__q##_n_##__size (__aarch64_vget_lane_any (__a, __b)) diff --git a/gcc/config/aarch64/arm_private_neon_types.h b/gcc/config/aarch64/arm_private_neon_types.h new file mode 100644 index 00000000000..0f588f026b7 --- /dev/null +++ b/gcc/config/aarch64/arm_private_neon_types.h @@ -0,0 +1,79 @@ +/* AArch64 type definitions for arm_neon.h + Do not include this file directly. Use one of arm_neon.h, arm_sme.h, + or arm_sve.h instead. + + Copyright (C) 2024 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. 
+ + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +#ifndef _GCC_ARM_PRIVATE_NEON_TYPES_H +#define _GCC_ARM_PRIVATE_NEON_TYPES_H + +#if !defined(_AARCH64_NEON_H_) && !defined(_ARM_SVE_H_) +#error "This file should not be used standalone. Please include one of arm_neon.h arm_sve.h arm_sme.h instead." +#endif + +typedef __Int8x8_t int8x8_t; +typedef __Int16x4_t int16x4_t; +typedef __Int32x2_t int32x2_t; +typedef __Int64x1_t int64x1_t; +typedef __Float16x4_t float16x4_t; +typedef __Float32x2_t float32x2_t; +typedef __Poly8x8_t poly8x8_t; +typedef __Poly16x4_t poly16x4_t; +typedef __Uint8x8_t uint8x8_t; +typedef __Uint16x4_t uint16x4_t; +typedef __Uint32x2_t uint32x2_t; +typedef __Float64x1_t float64x1_t; +typedef __Uint64x1_t uint64x1_t; +typedef __Int8x16_t int8x16_t; +typedef __Int16x8_t int16x8_t; +typedef __Int32x4_t int32x4_t; +typedef __Int64x2_t int64x2_t; +typedef __Float16x8_t float16x8_t; +typedef __Float32x4_t float32x4_t; +typedef __Float64x2_t float64x2_t; +typedef __Poly8x16_t poly8x16_t; +typedef __Poly16x8_t poly16x8_t; +typedef __Poly64x2_t poly64x2_t; +typedef __Poly64x1_t poly64x1_t; +typedef __Uint8x16_t uint8x16_t; +typedef __Uint16x8_t uint16x8_t; +typedef __Uint32x4_t uint32x4_t; +typedef __Uint64x2_t uint64x2_t; + +typedef __Poly8_t poly8_t; +typedef __Poly16_t poly16_t; +typedef __Poly64_t poly64_t; +typedef __Poly128_t poly128_t; + +typedef __Mfloat8x8_t mfloat8x8_t; +typedef __Mfloat8x16_t mfloat8x16_t; + +typedef __fp16 float16_t; +typedef float float32_t; +typedef double float64_t; + +typedef __Bfloat16x4_t bfloat16x4_t; +typedef __Bfloat16x8_t bfloat16x8_t; + +#endif diff --git a/gcc/config/aarch64/arm_sve.h b/gcc/config/aarch64/arm_sve.h index aa0bd9909f9..a887c0f2f45 100644 --- a/gcc/config/aarch64/arm_sve.h +++ b/gcc/config/aarch64/arm_sve.h @@ -27,12 +27,9 @@ #include #include +#include #include -typedef __fp16 float16_t; -typedef float float32_t; -typedef double float64_t; - /* NOTE: This implementation of arm_sve.h is intentionally short. 
It does not define the SVE types and intrinsic functions directly in C and C++
   code, but instead uses the following pragma to tell GCC to insert the

From patchwork Wed Nov 6 18:22:30 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007667
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 12/15] aarch64: Add common subset of SVE2p1 and SME
Date: Wed, 06 Nov 2024 18:22:30 +0000

Some instructions that were previously restricted to streaming mode
can also be used in non-streaming mode with SVE2.1.  This patch adds
support for those, as well as the usual new-extension boilerplate.
A later patch will add the feature macro.

gcc/
	* config/aarch64/aarch64-option-extensions.def (sve2p1): New
	extension.
	* config/aarch64/aarch64-sve-builtins-sve2.def: Mark instructions
	that are common to both SVE2p1 and SME.
	* config/aarch64/aarch64.h (TARGET_SVE2p1): New macro.
	(TARGET_SVE2p1_OR_SME): Likewise.
	* config/aarch64/aarch64-sve2.md (@aarch64_sve_psel): Require
	TARGET_SVE2p1_OR_SME instead of TARGET_STREAMING.
	(*aarch64_sve_psel_plus): Likewise.
	(@aarch64_sve_clamp): Likewise.
	(*aarch64_sve_clamp_x): Likewise.
	(@aarch64_pred_): Likewise.
	(@cond_): Likewise.

gcc/testsuite/
	* lib/target-supports.exp
	(check_effective_target_aarch64_asm_sve2p1_ok): New procedure.
	* gcc.target/aarch64/sve/clamp_1.c: New test.
	* gcc.target/aarch64/sve/clamp_2.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_s16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_s32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_s64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_s8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_u16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_u32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_u64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_u8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_b16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_b32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_b64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_b8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_c16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_c32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_c64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/psel_lane_c8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_f16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_f32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_f64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_s16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_s32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_s64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_s8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_u16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_u32.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_u64.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/revd_u8.c: Likewise.
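[Editorial note: as a sketch of what this enables, not part of the patch
(the overload choices assume the usual ACLE generic forms), intrinsics
such as svclamp and svrevd can now appear in non-streaming functions once
+sve2p1 is available:

#include <arm_sve.h>

#pragma GCC target "+sve2p1"

/* Non-streaming code: before this patch, both intrinsics required
   streaming mode and hence SME.  */
svint32_t
clamp_then_revd (svbool_t pg, svint32_t x, svint32_t lo, svint32_t hi)
{
  svint32_t c = svclamp (x, lo, hi);	/* sclamp */
  return svrevd_x (pg, c);		/* revd */
}]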
--- .../aarch64/aarch64-option-extensions.def | 2 + .../aarch64/aarch64-sve-builtins-sve2.def | 2 +- gcc/config/aarch64/aarch64-sve2.md | 12 +-- gcc/config/aarch64/aarch64.h | 9 ++ .../gcc.target/aarch64/sve/clamp_1.c | 40 ++++++++ .../gcc.target/aarch64/sve/clamp_2.c | 34 +++++++ .../aarch64/sve2/acle/asm/clamp_s16.c | 46 +++++++++ .../aarch64/sve2/acle/asm/clamp_s32.c | 46 +++++++++ .../aarch64/sve2/acle/asm/clamp_s64.c | 46 +++++++++ .../aarch64/sve2/acle/asm/clamp_s8.c | 46 +++++++++ .../aarch64/sve2/acle/asm/clamp_u16.c | 46 +++++++++ .../aarch64/sve2/acle/asm/clamp_u32.c | 46 +++++++++ .../aarch64/sve2/acle/asm/clamp_u64.c | 46 +++++++++ .../aarch64/sve2/acle/asm/clamp_u8.c | 46 +++++++++ .../aarch64/sve2/acle/asm/psel_lane_b16.c | 93 +++++++++++++++++++ .../aarch64/sve2/acle/asm/psel_lane_b32.c | 93 +++++++++++++++++++ .../aarch64/sve2/acle/asm/psel_lane_b64.c | 84 +++++++++++++++++ .../aarch64/sve2/acle/asm/psel_lane_b8.c | 93 +++++++++++++++++++ .../aarch64/sve2/acle/asm/psel_lane_c16.c | 93 +++++++++++++++++++ .../aarch64/sve2/acle/asm/psel_lane_c32.c | 93 +++++++++++++++++++ .../aarch64/sve2/acle/asm/psel_lane_c64.c | 84 +++++++++++++++++ .../aarch64/sve2/acle/asm/psel_lane_c8.c | 93 +++++++++++++++++++ .../aarch64/sve2/acle/asm/revd_bf16.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_f16.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_f32.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_f64.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_s16.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_s32.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_s64.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_s8.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_u16.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_u32.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_u64.c | 80 ++++++++++++++++ .../aarch64/sve2/acle/asm/revd_u8.c | 80 ++++++++++++++++ gcc/testsuite/lib/target-supports.exp | 10 ++ 35 files changed, 2156 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/clamp_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/clamp_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_b16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_b32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_b64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_b8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/psel_lane_c8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_bf16.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_u8.c diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def index 8279f5a76ea..c9d419afc8f 100644 --- a/gcc/config/aarch64/aarch64-option-extensions.def +++ b/gcc/config/aarch64/aarch64-option-extensions.def @@ -192,6 +192,8 @@ AARCH64_OPT_EXTENSION("sve2-sm4", SVE2_SM4, (SVE2, SM4), (), (), "svesm4") AARCH64_FMV_FEATURE("sve2-sm4", SVE_SM4, (SVE2_SM4)) +AARCH64_OPT_EXTENSION("sve2p1", SVE2p1, (SVE2), (), (), "") + AARCH64_OPT_FMV_EXTENSION("sme", SME, (BF16, SVE2), (), (), "sme") AARCH64_OPT_EXTENSION("memtag", MEMTAG, (), (), (), "") diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index 12548fe39cb..5cc32aa8871 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -220,7 +220,7 @@ DEF_SVE_FUNCTION (svsm4e, binary, s_unsigned, none) DEF_SVE_FUNCTION (svsm4ekey, binary, s_unsigned, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS streaming_only (0) +#define REQUIRED_EXTENSIONS sve_and_sme (AARCH64_FL_SVE2p1, 0) DEF_SVE_FUNCTION (svclamp, clamp, all_integer, none) DEF_SVE_FUNCTION (svpsel_lane, select_pred, all_pred_count, none) DEF_SVE_FUNCTION (svrevd, unary, all_data, mxz) diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index a7b29daeba4..fd4bd42b6d9 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -418,7 +418,7 @@ (define_insn "@aarch64_sve_psel" (match_operand:SI 3 "register_operand" "Ucj") (const_int BHSD_BITS)] UNSPEC_PSEL))] - "TARGET_STREAMING" + "TARGET_SVE2p1_OR_SME" "psel\t%0, %1, %2.[%w3, 0]" ) @@ -432,7 +432,7 @@ (define_insn "*aarch64_sve_psel_plus" (match_operand:SI 4 "const_int_operand")) (const_int BHSD_BITS)] UNSPEC_PSEL))] - "TARGET_STREAMING + "TARGET_SVE2p1_OR_SME && UINTVAL (operands[4]) < 128 / " "psel\t%0, %1, %2.[%w3, %4]" ) @@ -560,7 +560,7 @@ (define_insn "@aarch64_sve_clamp" (match_operand:SVE_FULL_I 1 "register_operand") (match_operand:SVE_FULL_I 2 "register_operand")) (match_operand:SVE_FULL_I 3 "register_operand")))] - "TARGET_STREAMING" + "TARGET_SVE2p1_OR_SME" {@ [cons: =0, 1, 2, 3; attrs: movprfx] [ w, %0, w, w; * ] clamp\t%0., %2., %3. [ ?&w, w, w, w; yes ] movprfx\t%0, %1\;clamp\t%0., %2., %3. 
@@ -580,7 +580,7 @@ (define_insn_and_split "*aarch64_sve_clamp_x" UNSPEC_PRED_X) (match_operand:SVE_FULL_I 3 "register_operand"))] UNSPEC_PRED_X))] - "TARGET_STREAMING" + "TARGET_SVE2p1_OR_SME" {@ [cons: =0, 1, 2, 3; attrs: movprfx] [ w, %0, w, w; * ] # [ ?&w, w, w, w; yes ] # @@ -3182,7 +3182,7 @@ (define_insn "@aarch64_pred_" [(match_operand:SVE_FULL 2 "register_operand")] UNSPEC_REVD_ONLY)] UNSPEC_PRED_X))] - "TARGET_STREAMING" + "TARGET_SVE2p1_OR_SME" {@ [ cons: =0 , 1 , 2 ; attrs: movprfx ] [ w , Upl , 0 ; * ] revd\t%0.q, %1/m, %2.q [ ?&w , Upl , w ; yes ] movprfx\t%0, %2\;revd\t%0.q, %1/m, %2.q @@ -3198,7 +3198,7 @@ (define_insn "@cond_" UNSPEC_REVD_ONLY) (match_operand:SVE_FULL 3 "register_operand")] UNSPEC_SEL))] - "TARGET_STREAMING" + "TARGET_SVE2p1_OR_SME" {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] [ w , Upl , w , 0 ; * ] revd\t%0.q, %1/m, %2.q [ ?&w , Upl , w , w ; yes ] movprfx\t%0, %3\;revd\t%0.q, %1/m, %2.q diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index d17f40ce22e..404efa16c28 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -338,6 +338,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED /* SVE2 SM4 instructions, enabled through +sve2-sm4. */ #define TARGET_SVE2_SM4 (AARCH64_HAVE_ISA (SVE2_SM4) && TARGET_NON_STREAMING) +/* SVE2p1 instructions, enabled through +sve2p1. */ +#define TARGET_SVE2p1 AARCH64_HAVE_ISA (SVE2p1) + /* SME instructions, enabled through +sme. Note that this does not imply anything about the state of PSTATE.SM; instructions that require SME and streaming mode should use TARGET_STREAMING instead. */ @@ -481,6 +484,12 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED /* fp8 instructions are enabled through +fp8. */ #define TARGET_FP8 AARCH64_HAVE_ISA (FP8) +/* Combinatorial tests. */ + +/* There's no need to check TARGET_SME for streaming or streaming-compatible + functions, since streaming mode itself implies SME. */ +#define TARGET_SVE2p1_OR_SME (TARGET_SVE2p1 || TARGET_STREAMING) + /* Standard register usage. 
*/ /* 31 64-bit general purpose registers R0-R30: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clamp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/clamp_1.c new file mode 100644 index 00000000000..92fef098865 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/clamp_1.c @@ -0,0 +1,40 @@ +// { dg-options "-O" } + +#include + +#pragma GCC target "+sve2p1" + +#define TEST(TYPE) \ + TYPE \ + tied1_##TYPE(TYPE a, TYPE b, TYPE c) \ + { \ + return svmin_x(svptrue_b8(), svmax_x(svptrue_b8(), a, b), c); \ + } \ + \ + TYPE \ + tied2_##TYPE(TYPE a, TYPE b, TYPE c) \ + { \ + return svmin_x(svptrue_b8(), svmax_x(svptrue_b8(), b, a), c); \ + } + +TEST(svint8_t) +TEST(svint16_t) +TEST(svint32_t) +TEST(svint64_t) + +TEST(svuint8_t) +TEST(svuint16_t) +TEST(svuint32_t) +TEST(svuint64_t) + +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.b, z1\.b, z2\.b\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.h, z1\.h, z2\.h\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.s, z1\.s, z2\.s\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.d, z1\.d, z2\.d\n} 2 } } */ + +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.b, z1\.b, z2\.b\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.h, z1\.h, z2\.h\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.s, z1\.s, z2\.s\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.d, z1\.d, z2\.d\n} 2 } } */ + +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clamp_2.c b/gcc/testsuite/gcc.target/aarch64/sve/clamp_2.c new file mode 100644 index 00000000000..f96c0046465 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/clamp_2.c @@ -0,0 +1,34 @@ +// { dg-options "-O" } + +#include + +#pragma GCC target "+sve2p1" + +#define TEST(TYPE) \ + TYPE \ + untied_##TYPE(TYPE a, TYPE b, TYPE c, TYPE d) \ + { \ + return svmin_x(svptrue_b8(), svmax_x(svptrue_b8(), b, c), d); \ + } + +TEST(svint8_t) +TEST(svint16_t) +TEST(svint32_t) +TEST(svint64_t) + +TEST(svuint8_t) +TEST(svuint16_t) +TEST(svuint32_t) +TEST(svuint64_t) + +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.b, z2\.b, z3\.b\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.h, z2\.h, z3\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.s, z2\.s, z3\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tsclamp\tz0\.d, z2\.d, z3\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.b, z2\.b, z3\.b\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.h, z2\.h, z3\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.s, z2\.s, z3\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tuclamp\tz0\.d, z2\.d, z3\.d\n} 1 } } */ + +/* { dg-final { scan-assembler-times {\tmovprfx\tz0, z1\n} 8 } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 75703ddca60..a8833d585c6 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -12100,6 +12100,16 @@ foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve" }] } +proc check_effective_target_aarch64_asm_sve2p1_ok { } { + if { [istarget aarch64*-*-*] } { + return [check_no_compiler_messages aarch64_sve2p1_assembler object { + __asm__ (".arch_extension sve2p1; ld1w {z0.q},p7/z,[x0]"); + } "-march=armv8-a+sve2p1"] + } else { + return 0 + } +} + proc check_effective_target_aarch64_small { } { if { [istarget aarch64*-*-*] } { return 
[check_no_compiler_messages aarch64_small object { From patchwork Wed Nov 6 18:23:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 2007670 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XkDB54cFQz1xyS for ; Thu, 7 Nov 2024 05:24:13 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A95423858CD1 for ; Wed, 6 Nov 2024 18:24:11 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 569463858D34; Wed, 6 Nov 2024 18:23:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 569463858D34 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 569463858D34 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1730917430; cv=none; b=kFHZHKcM76SyG8mZDRdMo1rvH4ujOvcOAO5mvWxHIGXFofzOviPXIIN+k5W+HceHd4XVt04RdArHFeQMY6WrpCaYnag386JWy3x4j1QHIXlz25OKEacJMhAsliwQWpA663RC+zgi4uHlwfLf9WxC0Nj4slNqpMlGiEx0eR8cOVI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1730917430; c=relaxed/simple; bh=Pr8MEAy2VYjOORqWK6oEvm5kC9mmdUzel3ECjfiGSzw=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=BtOIcRaIiue1VXHWtvioWPxdLmtlN+4QkWYO1szjFes0joURH63ORpuAuDc04YFRoyWLR1zO9tu5Q8oJwbT1/CaZ08epplSbiNenAsoiKLUSyNpK7v0HfgmL9npWyABf8Fgb3LfKQALclFpVmAlZmW4mLEf+UjFyvAOeqMxE+iI= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id F0879497; Wed, 6 Nov 2024 10:24:10 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 06E693F66E; Wed, 6 Nov 2024 10:23:39 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org,richard.earnshaw@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org Subject: [PATCH 13/15] aarch64: Add common subset of SVE2p1 and SME2 In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000") References: Date: Wed, 06 Nov 2024 18:23:38 +0000 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch handles the SVE2p1 instructions that are shared with SME2. This includes the consecutive-register forms of the 2-register and 4-register loads and stores, but not the strided-register forms. gcc/ * config/aarch64/aarch64.h (TARGET_SVE2p1_OR_SME2): New macro. * config/aarch64/aarch64-early-ra.cc (is_stride_candidate): Require TARGET_STREAMING_SME2 (early_ra::maybe_convert_to_strided_access): Likewise. * config/aarch64/aarch64-sve-builtins-sve2.def: Mark instructions that are common to both SVE2p1 and SME2. * config/aarch64/aarch64-sve.md (@aarch64_dot_prod_lane): Test TARGET_SVE2p1_OR_SME2 instead of TARGET_STREAMING_SME2. (@aarch64_sve_vnx4sf): Move TARGET_SVE_BF16 condition into SVE_BFLOAT_TERNARY_LONG. (@aarch64_sve__lanevnx4sf): Likewise SVE_BFLOAT_TERNARY_LONG_LANE. * config/aarch64/aarch64-sve2.md (@aarch64_): Require TARGET_SVE2p1_OR_SME2 instead of TARGET_STREAMING_SME2. (@aarch64_): Likewise. (@aarch64_sve_ptrue_c): Likewise. (@aarch64_sve_pext): Likewise. (@aarch64_sve_pextx2): Likewise. (@aarch64_sve_cntp_c): Likewise. (@aarch64_sve_fclamp): Likewise. (*aarch64_sve_fclamp_x): Likewise. (dot_prodvnx4sivnx8hi): Likewise. (aarch64_sve_fdotvnx4sfvnx8hf): Likewise. (aarch64_fdot_prod_lanevnx4sfvnx8hf): Likewise. (@aarch64_sve_while_b_x2): Likewise. (@aarch64_sve_while_c): Likewise. (@aarch64_sve_): Move TARGET_STREAMING_SME2 condition into SVE_QCVTxN. (@aarch64_sve_): Likewise SVE2_INT_SHIFT_IMM_NARROWxN, but also require TARGET_STREAMING_SME2 for the 4-register forms. * config/aarch64/iterators.md (SVE_BFLOAT_TERNARY_LONG): Require TARGET_SVE2p1_OR_SME2 rather than TARGET_STREAMING_SME2 for UNSPEC_BFMLSLB and UNSPEC_BFMLSLT. Require TARGET_SVE_BF16 for the others. (SVE_BFLOAT_TERNARY_LONG_LANE): Likewise. (SVE2_INT_SHIFT_IMM_NARROWxN): Require TARGET_SVE2p1_OR_SME2 for the interleaving forms and TARGET_STREAMING_SME2 for the rest. (SVE_QCVTxN): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/clamp_3.c: New test. * gcc.target/aarch64/sve/clamp_4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bfmlslb_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bfmlslb_lane_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bfmlslt_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bfmlslt_lane_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/clamp_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/clamp_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/clamp_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cntp_c16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cntp_c32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cntp_c64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cntp_c8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dot_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dot_lane_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dot_lane_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dot_lane_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dot_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dot_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_bf16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_bf16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_f16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_f16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_f32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_f32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_f64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_f64_x4.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/ld1_s16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_s16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_s32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_s32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_s64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_s64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_s8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_s8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1_u8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_bf16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_bf16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_f16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_f16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_f32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_f32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_f64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_f64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_s8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_u8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pext_lane_c8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ptrue_c16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ptrue_c32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ptrue_c64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ptrue_c8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/qcvtn_s16_s32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/qcvtn_u16_s32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/qcvtn_u16_u32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/qrshrn_s16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/qrshrn_u16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/qrshrun_u16_x2.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/st1_bf16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_bf16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_f16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_f16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_f32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_f32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_f64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_f64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_s8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1_u8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_bf16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_bf16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_f16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_f16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_f32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_f32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_f64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_f64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_s8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u16_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u32_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u64_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_u8_x4.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_b16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_b32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_b64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_b8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_c16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_c32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_c64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilege_c8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilegt_b16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilegt_b32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilegt_b64_x2.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/whilegt_b8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilegt_c16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilegt_c32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilegt_c64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilegt_c8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_b16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_b32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_b64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_b8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_c16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_c32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_c64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilele_c8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_b16_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_b32_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_b64_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_b8_x2.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_c16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_c32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_c64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/whilelt_c8.c: Likewise. --- gcc/config/aarch64/aarch64-early-ra.cc | 9 +- .../aarch64/aarch64-sve-builtins-sve2.def | 45 ++- gcc/config/aarch64/aarch64-sve.md | 10 +- gcc/config/aarch64/aarch64-sve2.md | 38 +- gcc/config/aarch64/aarch64.h | 4 + gcc/config/aarch64/iterators.md | 44 ++- .../gcc.target/aarch64/sve/clamp_3.c | 28 ++ .../gcc.target/aarch64/sve/clamp_4.c | 22 ++ .../aarch64/sve2/acle/asm/bfmlslb_f32.c | 72 ++++ .../aarch64/sve2/acle/asm/bfmlslb_lane_f32.c | 91 +++++ .../aarch64/sve2/acle/asm/bfmlslt_f32.c | 72 ++++ .../aarch64/sve2/acle/asm/bfmlslt_lane_f32.c | 91 +++++ .../aarch64/sve2/acle/asm/clamp_f16.c | 49 +++ .../aarch64/sve2/acle/asm/clamp_f32.c | 49 +++ .../aarch64/sve2/acle/asm/clamp_f64.c | 49 +++ .../aarch64/sve2/acle/asm/cntp_c16.c | 46 +++ .../aarch64/sve2/acle/asm/cntp_c32.c | 46 +++ .../aarch64/sve2/acle/asm/cntp_c64.c | 46 +++ .../aarch64/sve2/acle/asm/cntp_c8.c | 46 +++ .../aarch64/sve2/acle/asm/dot_f32.c | 51 +++ .../aarch64/sve2/acle/asm/dot_lane_f32.c | 100 +++++ .../aarch64/sve2/acle/asm/dot_lane_s32.c | 100 +++++ .../aarch64/sve2/acle/asm/dot_lane_u32.c | 100 +++++ .../aarch64/sve2/acle/asm/dot_s32.c | 51 +++ .../aarch64/sve2/acle/asm/dot_u32.c | 51 +++ .../aarch64/sve2/acle/asm/ld1_bf16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_bf16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_f16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_f16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_f32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_f32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_f64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_f64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_s16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_s16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_s32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_s32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_s64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_s64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_s8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_s8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_u16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_u16_x4.c | 361 
++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_u32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_u32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_u64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_u64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld1_u8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ld1_u8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_bf16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_bf16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_f16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_f16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_f32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_f32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_f64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_f64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_s8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/ldnt1_u8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/pext_lane_c16.c | 57 +++ .../aarch64/sve2/acle/asm/pext_lane_c16_x2.c | 61 +++ .../aarch64/sve2/acle/asm/pext_lane_c32.c | 57 +++ .../aarch64/sve2/acle/asm/pext_lane_c32_x2.c | 61 +++ .../aarch64/sve2/acle/asm/pext_lane_c64.c | 57 +++ .../aarch64/sve2/acle/asm/pext_lane_c64_x2.c | 61 +++ .../aarch64/sve2/acle/asm/pext_lane_c8.c | 57 +++ .../aarch64/sve2/acle/asm/pext_lane_c8_x2.c | 61 +++ .../aarch64/sve2/acle/asm/ptrue_c16.c | 48 +++ .../aarch64/sve2/acle/asm/ptrue_c32.c | 48 +++ .../aarch64/sve2/acle/asm/ptrue_c64.c | 48 +++ .../aarch64/sve2/acle/asm/ptrue_c8.c | 48 +++ .../aarch64/sve2/acle/asm/qcvtn_s16_s32_x2.c | 57 +++ .../aarch64/sve2/acle/asm/qcvtn_u16_s32_x2.c | 57 +++ .../aarch64/sve2/acle/asm/qcvtn_u16_u32_x2.c | 57 +++ .../aarch64/sve2/acle/asm/qrshrn_s16_x2.c | 57 +++ .../aarch64/sve2/acle/asm/qrshrn_u16_x2.c | 57 +++ .../aarch64/sve2/acle/asm/qrshrun_u16_x2.c | 57 +++ .../aarch64/sve2/acle/asm/st1_bf16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_bf16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_f16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_f16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_f32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_f32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_f64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_f64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_s16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_s16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_s32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_s32_x4.c | 361 
++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_s64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_s64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_s8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_s8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_u16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_u16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_u32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_u32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_u64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_u64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st1_u8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/st1_u8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_bf16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_bf16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_f16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_f16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_f32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_f32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_f64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_f64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_s8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u16_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u16_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u32_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u32_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u64_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u64_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u8_x2.c | 269 +++++++++++++ .../aarch64/sve2/acle/asm/stnt1_u8_x4.c | 361 ++++++++++++++++++ .../aarch64/sve2/acle/asm/whilege_b16_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilege_b32_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilege_b64_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilege_b8_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilege_c16.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilege_c32.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilege_c64.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilege_c8.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilegt_b16_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilegt_b32_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilegt_b64_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilegt_b8_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilegt_c16.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilegt_c32.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilegt_c64.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilegt_c8.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilele_b16_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilele_b32_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilele_b64_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilele_b8_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilele_c16.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilele_c32.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilele_c64.c | 124 ++++++ 
.../aarch64/sve2/acle/asm/whilele_c8.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilelt_b16_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilelt_b32_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilelt_b64_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilelt_b8_x2.c | 126 ++++++ .../aarch64/sve2/acle/asm/whilelt_c16.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilelt_c32.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilelt_c64.c | 124 ++++++ .../aarch64/sve2/acle/asm/whilelt_c8.c | 124 ++++++ 171 files changed, 36491 insertions(+), 65 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/clamp_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/clamp_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bfmlslb_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bfmlslb_lane_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bfmlslt_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bfmlslt_lane_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cntp_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cntp_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cntp_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cntp_c8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_lane_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_lane_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_lane_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_bf16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_bf16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_f16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_f16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_f32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_f32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_f64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_f64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_s8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u32_x2.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_u8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_bf16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_bf16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_f16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_f16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_f32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_f32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_f64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_f64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_s8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_u8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pext_lane_c8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ptrue_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ptrue_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ptrue_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ptrue_c8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/qcvtn_s16_s32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/qcvtn_u16_s32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/qcvtn_u16_u32_x2.c create 
mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/qrshrn_s16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/qrshrn_u16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/qrshrun_u16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_bf16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_bf16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_f16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_f16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_f32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_f32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_f64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_f64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_s8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1_u8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_bf16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_bf16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_f16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_f16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_f32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_f32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_f64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_f64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_s8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u16_x2.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u16_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u32_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u64_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_u8_x4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_b16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_b32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_b64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_b8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilege_c8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_b16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_b32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_b64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_b8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilegt_c8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_b16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_b32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_b64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_b8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilele_c8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_b16_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_b32_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_b64_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_b8_x2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_c16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_c32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_c64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilelt_c8.c diff --git a/gcc/config/aarch64/aarch64-early-ra.cc b/gcc/config/aarch64/aarch64-early-ra.cc index bbd84686e13..0db8ea24389 100644 --- a/gcc/config/aarch64/aarch64-early-ra.cc +++ b/gcc/config/aarch64/aarch64-early-ra.cc @@ -1062,8 +1062,9 @@ is_stride_candidate (rtx_insn *insn) return false; auto stride_type = get_attr_stride_type (insn); - return (stride_type == STRIDE_TYPE_LD1_CONSECUTIVE - || 
stride_type == STRIDE_TYPE_ST1_CONSECUTIVE); + return (TARGET_STREAMING_SME2 + && (stride_type == STRIDE_TYPE_LD1_CONSECUTIVE + || stride_type == STRIDE_TYPE_ST1_CONSECUTIVE)); } // Go through the constraints of INSN, which has already been extracted, @@ -3213,9 +3214,9 @@ early_ra::maybe_convert_to_strided_access (rtx_insn *insn) auto stride_type = get_attr_stride_type (insn); rtx pat = PATTERN (insn); rtx op; - if (stride_type == STRIDE_TYPE_LD1_CONSECUTIVE) + if (TARGET_STREAMING_SME2 && stride_type == STRIDE_TYPE_LD1_CONSECUTIVE) op = SET_DEST (pat); - else if (stride_type == STRIDE_TYPE_ST1_CONSECUTIVE) + else if (TARGET_STREAMING_SME2 && stride_type == STRIDE_TYPE_ST1_CONSECUTIVE) op = XVECEXP (SET_SRC (pat), 0, 1); else return false; diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index 5cc32aa8871..9e8aad957d5 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -226,40 +226,53 @@ DEF_SVE_FUNCTION (svpsel_lane, select_pred, all_pred_count, none) DEF_SVE_FUNCTION (svrevd, unary, all_data, mxz) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS streaming_only (AARCH64_FL_SME2) -DEF_SVE_FUNCTION_GS (svadd, binary_single, all_integer, x24, none) +#define REQUIRED_EXTENSIONS sve_and_sme (AARCH64_FL_SVE2p1, AARCH64_FL_SME2) DEF_SVE_FUNCTION (svbfmlslb, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfmlslb_lane, ternary_bfloat_lane, s_float, none) DEF_SVE_FUNCTION (svbfmlslt, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfmlslt_lane, ternary_bfloat_lane, s_float, none) DEF_SVE_FUNCTION (svclamp, clamp, all_float, none) -DEF_SVE_FUNCTION_GS (svclamp, clamp, all_arith, x24, none) DEF_SVE_FUNCTION (svcntp, count_pred_c, all_count, none) -DEF_SVE_FUNCTION_GS (svcvt, unary_convertxn, cvt_h_s_float, x2, none) -DEF_SVE_FUNCTION_GS (svcvt, unary_convertxn, cvt_s_s, x24, none) -DEF_SVE_FUNCTION_GS (svcvtn, unary_convertxn, cvt_h_s_float, x2, none) DEF_SVE_FUNCTION (svdot, ternary_qq_opt_n_or_011, s_narrow_fsu, none) DEF_SVE_FUNCTION (svdot_lane, ternary_qq_or_011_lane, s_narrow_fsu, none) DEF_SVE_FUNCTION_GS (svld1, load, all_data, x24, implicit) DEF_SVE_FUNCTION_GS (svldnt1, load, all_data, x24, implicit) +DEF_SVE_FUNCTION_GS (svpext_lane, extract_pred, all_count, x12, none) +DEF_SVE_FUNCTION (svptrue, inherent, all_count, none) +DEF_SVE_FUNCTION_GS (svqcvtn, unary_convertxn, qcvt_x2, x2, none) +DEF_SVE_FUNCTION_GS (svqrshrn, shift_right_imm_narrowxn, qrshr_x2, x2, none) +DEF_SVE_FUNCTION_GS (svqrshrun, shift_right_imm_narrowxn, qrshru_x2, x2, none) +DEF_SVE_FUNCTION_GS (svst1, storexn, all_data, x24, implicit) +DEF_SVE_FUNCTION_GS (svstnt1, storexn, all_data, x24, implicit) +DEF_SVE_FUNCTION_GS (svwhilege, compare_scalar, while_x, x2, none) +DEF_SVE_FUNCTION (svwhilege, compare_scalar_count, while_x_c, none) +DEF_SVE_FUNCTION_GS (svwhilegt, compare_scalar, while_x, x2, none) +DEF_SVE_FUNCTION (svwhilegt, compare_scalar_count, while_x_c, none) +DEF_SVE_FUNCTION_GS (svwhilele, compare_scalar, while_x, x2, none) +DEF_SVE_FUNCTION (svwhilele, compare_scalar_count, while_x_c, none) +DEF_SVE_FUNCTION_GS (svwhilelt, compare_scalar, while_x, x2, none) +DEF_SVE_FUNCTION (svwhilelt, compare_scalar_count, while_x_c, none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS streaming_only (AARCH64_FL_SME2) +DEF_SVE_FUNCTION_GS (svadd, binary_single, all_integer, x24, none) +DEF_SVE_FUNCTION_GS (svclamp, clamp, all_arith, x24, none) 
+DEF_SVE_FUNCTION_GS (svcvt, unary_convertxn, cvt_h_s_float, x2, none) +DEF_SVE_FUNCTION_GS (svcvt, unary_convertxn, cvt_s_s, x24, none) +DEF_SVE_FUNCTION_GS (svcvtn, unary_convertxn, cvt_h_s_float, x2, none) DEF_SVE_FUNCTION_GS (svmax, binary_opt_single_n, all_arith, x24, none) DEF_SVE_FUNCTION_GS (svmaxnm, binary_opt_single_n, all_float, x24, none) DEF_SVE_FUNCTION_GS (svmin, binary_opt_single_n, all_arith, x24, none) DEF_SVE_FUNCTION_GS (svminnm, binary_opt_single_n, all_float, x24, none) -DEF_SVE_FUNCTION_GS (svpext_lane, extract_pred, all_count, x12, none) -DEF_SVE_FUNCTION (svptrue, inherent, all_count, none) DEF_SVE_FUNCTION_GS (svqcvt, unary_convertxn, qcvt_x2, x2, none) DEF_SVE_FUNCTION_GS (svqcvt, unary_convertxn, qcvt_x4, x4, none) -DEF_SVE_FUNCTION_GS (svqcvtn, unary_convertxn, qcvt_x2, x2, none) DEF_SVE_FUNCTION_GS (svqcvtn, unary_convertxn, qcvt_x4, x4, none) DEF_SVE_FUNCTION_GS (svqdmulh, binary_opt_single_n, all_signed, x24, none) DEF_SVE_FUNCTION_GS (svqrshr, shift_right_imm_narrowxn, qrshr_x2, x2, none) DEF_SVE_FUNCTION_GS (svqrshr, shift_right_imm_narrowxn, qrshr_x4, x4, none) -DEF_SVE_FUNCTION_GS (svqrshrn, shift_right_imm_narrowxn, qrshr_x2, x2, none) DEF_SVE_FUNCTION_GS (svqrshrn, shift_right_imm_narrowxn, qrshr_x4, x4, none) DEF_SVE_FUNCTION_GS (svqrshru, shift_right_imm_narrowxn, qrshru_x2, x2, none) DEF_SVE_FUNCTION_GS (svqrshru, shift_right_imm_narrowxn, qrshru_x4, x4, none) -DEF_SVE_FUNCTION_GS (svqrshrun, shift_right_imm_narrowxn, qrshru_x2, x2, none) DEF_SVE_FUNCTION_GS (svqrshrun, shift_right_imm_narrowxn, qrshru_x4, x4, none) DEF_SVE_FUNCTION_GS (svrinta, unaryxn, s_float, x24, none) DEF_SVE_FUNCTION_GS (svrintm, unaryxn, s_float, x24, none) @@ -267,19 +280,9 @@ DEF_SVE_FUNCTION_GS (svrintn, unaryxn, s_float, x24, none) DEF_SVE_FUNCTION_GS (svrintp, unaryxn, s_float, x24, none) DEF_SVE_FUNCTION_GS (svrshl, binary_int_opt_single_n, all_integer, x24, none) DEF_SVE_FUNCTION_GS (svsel, binaryxn, all_data, x24, implicit) -DEF_SVE_FUNCTION_GS (svst1, storexn, all_data, x24, implicit) -DEF_SVE_FUNCTION_GS (svstnt1, storexn, all_data, x24, implicit) DEF_SVE_FUNCTION_GS (svunpk, unary_convertxn, bhs_widen, x24, none) DEF_SVE_FUNCTION_GS (svuzp, unaryxn, all_data, x24, none) DEF_SVE_FUNCTION_GS (svuzpq, unaryxn, all_data, x24, none) -DEF_SVE_FUNCTION_GS (svwhilege, compare_scalar, while_x, x2, none) -DEF_SVE_FUNCTION (svwhilege, compare_scalar_count, while_x_c, none) -DEF_SVE_FUNCTION_GS (svwhilegt, compare_scalar, while_x, x2, none) -DEF_SVE_FUNCTION (svwhilegt, compare_scalar_count, while_x_c, none) -DEF_SVE_FUNCTION_GS (svwhilele, compare_scalar, while_x, x2, none) -DEF_SVE_FUNCTION (svwhilele, compare_scalar_count, while_x_c, none) -DEF_SVE_FUNCTION_GS (svwhilelt, compare_scalar, while_x, x2, none) -DEF_SVE_FUNCTION (svwhilelt, compare_scalar_count, while_x_c, none) DEF_SVE_FUNCTION_GS (svzip, unaryxn, all_data, x24, none) DEF_SVE_FUNCTION_GS (svzipq, unaryxn, all_data, x24, none) #undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 3d92a2a454f..f89036c35f7 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -7222,7 +7222,7 @@ (define_insn "@aarch64_dot_prod_lane" (match_operand:SVE_FULL_SDI 4 "register_operand")))] "TARGET_SVE && ( == * 4 - || (TARGET_STREAMING_SME2 + || (TARGET_SVE2p1_OR_SME2 && == 32 && == 16))" {@ [ cons: =0 , 1 , 2 , 4 ; attrs: movprfx ] @@ -7839,8 +7839,8 @@ (define_insn "@aarch64_sve_tmad" ;; - BFDOT (BF16) ;; - BFMLALB (BF16) ;; - BFMLALT 
(BF16) -;; - BFMLSLB (SME2) -;; - BFMLSLT (SME2) +;; - BFMLSLB (SVE2p1, SME2) +;; - BFMLSLT (SVE2p1, SME2) ;; - BFMMLA (BF16) ;; ------------------------------------------------------------------------- @@ -7851,7 +7851,7 @@ (define_insn "@aarch64_sve_vnx4sf" (match_operand:VNx8BF 2 "register_operand") (match_operand:VNx8BF 3 "register_operand")] SVE_BFLOAT_TERNARY_LONG))] - "TARGET_SVE_BF16" + "" {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] [ w , 0 , w , w ; * ] \t%0.s, %2.h, %3.h [ ?&w , w , w , w ; yes ] movprfx\t%0, %1\;\t%0.s, %2.h, %3.h @@ -7867,7 +7867,7 @@ (define_insn "@aarch64_sve__lanevnx4sf" (match_operand:VNx8BF 3 "register_operand") (match_operand:SI 4 "const_int_operand")] SVE_BFLOAT_TERNARY_LONG_LANE))] - "TARGET_SVE_BF16" + "" {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] [ w , 0 , w , y ; * ] \t%0.s, %2.h, %3.h[%4] [ ?&w , w , w , y ; yes ] movprfx\t%0, %1\;\t%0.s, %2.h, %3.h[%4] diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index fd4bd42b6d9..61bae64955f 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -140,7 +140,7 @@ (define_insn "@aarch64_" [(match_operand:VNx16BI 2 "register_operand" "Uph") (match_operand:SVE_FULLx24 1 "memory_operand" "m")] LD1_COUNT))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "\t%0, %K2/z, %1" [(set_attr "stride_type" "ld1_consecutive")] ) @@ -276,7 +276,7 @@ (define_insn "@aarch64_" (match_operand:SVE_FULLx24 1 "aligned_register_operand" "Uw") (match_dup 0)] ST1_COUNT))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "\t%1, %K2, %0" [(set_attr "stride_type" "st1_consecutive")] ) @@ -370,7 +370,7 @@ (define_insn "@aarch64_scatter_stnt_" (define_insn "@aarch64_sve_ptrue_c" [(set (match_operand:VNx16BI 0 "register_operand" "=Uph") (unspec:VNx16BI [(const_int BHSD_BITS)] UNSPEC_PTRUE_C))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "ptrue\t%K0." ) @@ -388,7 +388,7 @@ (define_insn "@aarch64_sve_pext" (match_operand:DI 2 "const_int_operand") (const_int BHSD_BITS)] UNSPEC_PEXT))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "pext\t%0., %K1[%2]" ) @@ -399,7 +399,7 @@ (define_insn "@aarch64_sve_pextx2" (match_operand:DI 2 "const_int_operand") (const_int BHSD_BITS)] UNSPEC_PEXTx2))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "pext\t{%S0., %T0.}, %K1[%2]" ) @@ -451,7 +451,7 @@ (define_insn "@aarch64_sve_cntp_c" (match_operand:DI 2 "const_int_operand") (const_int BHSD_BITS)] UNSPEC_CNTP_C))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "cntp\t%x0, %K1., vlx%2" ) @@ -1117,7 +1117,7 @@ (define_insn "@aarch64_sve_fclamp" UNSPEC_FMAXNM) (match_operand:SVE_FULL_F 3 "register_operand")] UNSPEC_FMINNM))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" {@ [cons: =0, 1, 2, 3; attrs: movprfx] [ w, %0, w, w; * ] fclamp\t%0., %2., %3. [ ?&w, w, w, w; yes ] movprfx\t%0, %1\;fclamp\t%0., %2., %3. 
@@ -1137,7 +1137,7 @@ (define_insn_and_split "*aarch64_sve_fclamp_x" UNSPEC_COND_FMAXNM) (match_operand:SVE_FULL_F 3 "register_operand")] UNSPEC_COND_FMINNM))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" {@ [cons: =0, 1, 2, 3; attrs: movprfx] [ w, %0, w, w; * ] # [ ?&w, w, w, w; yes ] # @@ -2039,7 +2039,7 @@ (define_insn "dot_prodvnx4sivnx8hi" (match_operand:VNx8HI 2 "register_operand")] DOTPROD) (match_operand:VNx4SI 3 "register_operand")))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] [ w , w , w , 0 ; * ] dot\t%0.s, %1.h, %2.h [ ?&w , w , w , w ; yes ] movprfx\t%0, %3\;dot\t%0.s, %1.h, %2.h @@ -2137,7 +2137,7 @@ (define_insn "aarch64_sve_fdotvnx4sfvnx8hf" (match_operand:VNx8HF 2 "register_operand")] UNSPEC_FDOT) (match_operand:VNx4SF 3 "register_operand")))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] [ w , w , w , 0 ; * ] fdot\t%0.s, %1.h, %2.h [ ?&w , w , w , w ; yes ] movprfx\t%0, %3\;fdot\t%0.s, %1.h, %2.h @@ -2155,7 +2155,7 @@ (define_insn "aarch64_fdot_prod_lanevnx4sfvnx8hf" UNSPEC_SVE_LANE_SELECT)] UNSPEC_FDOT) (match_operand:VNx4SF 4 "register_operand")))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" {@ [ cons: =0 , 1 , 2 , 4 ; attrs: movprfx ] [ w , w , y , 0 ; * ] fdot\t%0.s, %1.h, %2.h[%3] [ ?&w , w , y , w ; yes ] movprfx\t%0, %4\;fdot\t%0.s, %1.h, %2.h[%3] @@ -2222,7 +2222,7 @@ (define_insn "@aarch64_sve_" (unspec:VNx8HI_ONLY [(match_operand:VNx8SI_ONLY 1 "aligned_register_operand" "Uw")] SVE_QCVTxN))] - "TARGET_STREAMING_SME2" + "" "\t%0.h, %1" ) @@ -2336,6 +2336,14 @@ (define_insn "@aarch64_sve_" ;; ------------------------------------------------------------------------- ;; ---- [INT] Multi-vector narrowing right shifts ;; ------------------------------------------------------------------------- +;; Includes: +;; - SQRSHR +;; - SQRSHRN +;; - SQRSHRU +;; - SQRSHRUN +;; - UQRSHR +;; - UQRSHRN +;; ------------------------------------------------------------------------- (define_insn "@aarch64_sve_" [(set (match_operand: 0 "register_operand" "=w") @@ -2343,7 +2351,7 @@ (define_insn "@aarch64_sve_" [(match_operand:SVE_FULL_SIx2_SDIx4 1 "register_operand" "Uw") (match_operand:DI 2 "const_int_operand")] SVE2_INT_SHIFT_IMM_NARROWxN))] - "TARGET_STREAMING_SME2" + "(mode == VNx8SImode || TARGET_STREAMING_SME2)" "\t%0., %1, #%2" ) @@ -3145,7 +3153,7 @@ (define_insn "@aarch64_sve_while_b_x2" (const_int BHSD_BITS)] SVE_WHILE_ORDER)) (clobber (reg:CC_NZC CC_REGNUM))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "while\t{%S0., %T0.}, %x1, %x2" ) @@ -3159,7 +3167,7 @@ (define_insn "@aarch64_sve_while_c" (match_operand:DI 3 "const_int_operand")] SVE_WHILE_ORDER)) (clobber (reg:CC_NZC CC_REGNUM))] - "TARGET_STREAMING_SME2" + "TARGET_SVE2p1_OR_SME2" "while\t%K0., %x1, %x2, vlx%3" ) diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 404efa16c28..f07b2c49f0d 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -490,6 +490,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED functions, since streaming mode itself implies SME. */ #define TARGET_SVE2p1_OR_SME (TARGET_SVE2p1 || TARGET_STREAMING) +#define TARGET_SVE2p1_OR_SME2 \ + ((TARGET_SVE2p1 || TARGET_STREAMING) \ + && (TARGET_SME2 || TARGET_NON_STREAMING)) + /* Standard register usage. 
*/ /* 31 64-bit general purpose registers R0-R30: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index b8924cdc74b..73d674816f1 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -3050,19 +3050,19 @@ (define_int_iterator SVE_FP_BINARY_MULTI [UNSPEC_FMAX UNSPEC_FMAXNM UNSPEC_FMIN UNSPEC_FMINNM]) (define_int_iterator SVE_BFLOAT_TERNARY_LONG - [UNSPEC_BFDOT - UNSPEC_BFMLALB - UNSPEC_BFMLALT - (UNSPEC_BFMLSLB "TARGET_STREAMING_SME2") - (UNSPEC_BFMLSLT "TARGET_STREAMING_SME2") - (UNSPEC_BFMMLA "TARGET_NON_STREAMING")]) + [(UNSPEC_BFDOT "TARGET_SVE_BF16") + (UNSPEC_BFMLALB "TARGET_SVE_BF16") + (UNSPEC_BFMLALT "TARGET_SVE_BF16") + (UNSPEC_BFMLSLB "TARGET_SVE2p1_OR_SME2") + (UNSPEC_BFMLSLT "TARGET_SVE2p1_OR_SME2") + (UNSPEC_BFMMLA "TARGET_SVE_BF16 && TARGET_NON_STREAMING")]) (define_int_iterator SVE_BFLOAT_TERNARY_LONG_LANE - [UNSPEC_BFDOT - UNSPEC_BFMLALB - UNSPEC_BFMLALT - (UNSPEC_BFMLSLB "TARGET_STREAMING_SME2") - (UNSPEC_BFMLSLT "TARGET_STREAMING_SME2")]) + [(UNSPEC_BFDOT "TARGET_SVE_BF16") + (UNSPEC_BFMLALB "TARGET_SVE_BF16") + (UNSPEC_BFMLALT "TARGET_SVE_BF16") + (UNSPEC_BFMLSLB "TARGET_SVE2p1_OR_SME2") + (UNSPEC_BFMLSLT "TARGET_SVE2p1_OR_SME2")]) (define_int_iterator SVE_INT_REDUCTION [UNSPEC_ANDV UNSPEC_IORV @@ -3338,12 +3338,13 @@ (define_int_iterator SVE2_INT_SHIFT_IMM_NARROWT [UNSPEC_RSHRNT UNSPEC_UQRSHRNT UNSPEC_UQSHRNT]) -(define_int_iterator SVE2_INT_SHIFT_IMM_NARROWxN [UNSPEC_SQRSHR - UNSPEC_SQRSHRN - UNSPEC_SQRSHRU - UNSPEC_SQRSHRUN - UNSPEC_UQRSHR - UNSPEC_UQRSHRN]) +(define_int_iterator SVE2_INT_SHIFT_IMM_NARROWxN + [(UNSPEC_SQRSHR "TARGET_STREAMING_SME2") + (UNSPEC_SQRSHRN "TARGET_SVE2p1_OR_SME2") + (UNSPEC_SQRSHRU "TARGET_STREAMING_SME2") + (UNSPEC_SQRSHRUN "TARGET_SVE2p1_OR_SME2") + (UNSPEC_UQRSHR "TARGET_STREAMING_SME2") + (UNSPEC_UQRSHRN "TARGET_SVE2p1_OR_SME2")]) (define_int_iterator SVE2_INT_SHIFT_INSERT [UNSPEC_SLI UNSPEC_SRI]) @@ -3488,9 +3489,12 @@ (define_int_iterator SVE2_PMULL [UNSPEC_PMULLB UNSPEC_PMULLT]) (define_int_iterator SVE2_PMULL_PAIR [UNSPEC_PMULLB_PAIR UNSPEC_PMULLT_PAIR]) -(define_int_iterator SVE_QCVTxN [UNSPEC_SQCVT UNSPEC_SQCVTN - UNSPEC_SQCVTU UNSPEC_SQCVTUN - UNSPEC_UQCVT UNSPEC_UQCVTN]) +(define_int_iterator SVE_QCVTxN [(UNSPEC_SQCVT "TARGET_STREAMING_SME2") + (UNSPEC_SQCVTN "TARGET_SVE2p1_OR_SME2") + (UNSPEC_SQCVTU "TARGET_STREAMING_SME2") + (UNSPEC_SQCVTUN "TARGET_SVE2p1_OR_SME2") + (UNSPEC_UQCVT "TARGET_STREAMING_SME2") + (UNSPEC_UQCVTN "TARGET_SVE2p1_OR_SME2")]) (define_int_iterator SVE2_SFx24_UNARY [UNSPEC_FRINTA UNSPEC_FRINTM UNSPEC_FRINTN UNSPEC_FRINTP]) From patchwork Wed Nov 6 18:24:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 2007672 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XkDD373QTz1xyM for ; Thu, 7 
Nov 2024 05:25:55 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 223DE3858403 for ; Wed, 6 Nov 2024 18:25:54 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id F2C053858D37; Wed, 6 Nov 2024 18:25:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F2C053858D37 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org F2C053858D37 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1730917518; cv=none; b=UAKQfUViVu0RpdzDh+mzjXmNFRArMd3WDj6YCokqIcVZIHDw5Mug7ZvgceiHk7xzZwvjDMFGOoqgT8AiwgmTRde3Wd70AI7F9PtRQbNYDm1/tveEhcHEQGGsubpTEEGklEk2T5Zc5n59OyhtpLbFcP1FHR3AfzGTsqTECuPEZHY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1730917518; c=relaxed/simple; bh=cbbgsGXyT8YLAGM7cX3+8XLosDzAyu4KiECjqMaDnso=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=ih9RohfO7HB5bEEkAVqRMjY1tbK+5i5id/w8J2qn0M+SmUXMwLzGGzPwC9RaTujQwUdhH5Iqr2HXOxbmHYhBetU9coeaie9QqtYpeHeZTDZT8/OL9+ljtXwR7CI4g58gf3XniBvfHDyuR9Bb84w8yNKhvayInoI51CoG1h5rd2Y= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 75AA5497; Wed, 6 Nov 2024 10:25:30 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6FD963F66E; Wed, 6 Nov 2024 10:24:59 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org,richard.earnshaw@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org Subject: [PATCH 14/15] aarch64: Add remaining SVE2p1 support In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000") References: Date: Wed, 06 Nov 2024 18:24:57 +0000 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the instructions that are new to FEAT_SVE2p1. It mostly contains simple additions, so it didn't seem worth splitting up further. It's likely that we'll find more autovec uses for some of these instructions, but for now this patch just deals with one obvious case: using the new hybrid-VLA permutations to handle "stepped" versions of some Advanced SIMD permutations. See aarch64_evpc_hvla for details. The patch also continues the existing practice of lowering ACLE permutation intrinsics to VEC_PERM_EXPR. That's admittedly a bit inconsistent with the approach I've been advocating for when it comes to arithmetic, but I think the difference is that (a) these are pure data movement, and so there's limited scope for things like gimple canonicalisations to mess with the instruction selection or operation mix; and (b) there are no added UB rules to worry about. Another new thing in the patch is the concept of "memory-only" SVE vector modes. 
These are used to represent the memory operands of the new LD1[DW] (to .Q), LD[234]Q, ST1[DW] (from .Q), and ST[234]Q instructions. We continue to use .B, .H, .S, and .D modes for the registers, since there's no predicated contiguous LD1Q instruction, and since there's no arithmetic that can be done on TI. (The new instructions are instead intended for hybrid VLA, i.e. for vectors of vectors.) For now, all of the new instructions are non-streaming-only. Some of them are streaming-compatible with SME2p1, but that's a later patch. gcc/ * config/aarch64/aarch64-modes.def (VNx1SI, VNx1DI): New modes. * config/aarch64/aarch64-sve-builtins-base.cc (svdup_lane_impl::expand): Update generation of TBL instruction. (svtbl_impl): Delete. (svtbl): Use unspec_based_uncond_function instead. * config/aarch64/aarch64-sve-builtins-functions.h (permute::fold_permute): Handle trailing immediate arguments. * config/aarch64/aarch64-sve-builtins-shapes.h (extq): Declare. (load_gather64_sv_index, load_gather64_sv_offset): Likewise. (load_gather64_vs_index, load_gather64_vs_offset): Likewise. (pmov_from_vector, pmov_from_vector_lane, pmov_to_vector_lane) (reduction_neonq, store_scatter64_index, store_scatter64_offset) (unary_lane): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (load_gather64_sv_base, store_scatter64_base): New classes. (extq_def, ext): New shape. (load_gather64_sv_index_def, load_gather64_sv_index): Likewise. (load_gather64_sv_offset_def, load_gather64_sv_offset): Likewise. (load_gather64_vs_index_def, load_gather64_vs_index): Likewise. (load_gather64_vs_offset_def, load_gather64_vs_offset): Likewise. (pmov_from_vector_def, pmov_from_vector): Likewise. (pmov_from_vector_lane_def, pmov_from_vector_lane): Likewise. (pmov_to_vector_lane_def, pmov_to_vector_lane): Likewise. (reduction_neonq_def, reduction_neonq): Likewise. (store_scatter64_index_def, store_scatter64_index): Likewise. (store_scatter64_offset_def, store_scatter64_offset): Likewise. (unary_lane_def, unary_lane): Likewise. * config/aarch64/aarch64-sve-builtins-sve2.h (svaddqv, svandqv) (svdup_laneq, sveorqv, svextq, svld1q_gather, svld1udq, svld1uwq) (svld2q, svld3q, svld4q, svmaxnmqv, svmaxqv, svminnmqv, svminqv) (svorqv, svpmov, svpmov_lane, svst1qd, svst1q_scatter, svst1wq) (svst2q, svst3q, svst4q, svtblq, svtbx, svtbxq, svuzpq1, svuzpq2) (svzipq1, svzipq2): Declare. * config/aarch64/aarch64-sve-builtins-sve2.cc (ld1uxq_st1xq_base) (ld234q_st234q_base, svdup_laneq_impl, svextq_impl): New classes. (svld1q_gather_impl, svld1uxq_impl, svld234q_impl): Likewise. (svpmov_impl, svpmov_lane_impl, svst1q_scatter_impl): Likewise. (svst1xq_impl, svst234q_impl, svuzpq_impl, svzipq_impl): Likewise. (svaddqv, svandqv, svdup_laneq, sveorqv, svextq, svld1q_gather) (svld1udq, svld1uwq, svld2q, svld3q, svld4q, svmaxnmqv, svmaxqv) (svminnmqv, svminqv, svorqv, svpmov, svpmov_lane, svst1qd) (svst1q_scatter, svst1wq, svst2q, svst3q, svst4q, svtblq, svtbx) (svtbxq, svuzpq1, svuzpq2, svzipq1, svzipq2): New function entries. * config/aarch64/aarch64-sve-builtins-sve2.def (svaddqv, svandqv) (svdup_laneq, sveorqv, svextq, svld2q, svld3q, svld4q, svmaxnmqv) (svmaxqv, svminnmqv, svminqv, svorqv, svpmov, svpmov_lanes, vst2q) (svst3q, svst4q, svtblq, svtbxq, svuzpq1, svuzpq2, svzipq1, svzipq2) (svld1q_gather, svld1udq, svld1uwq, svst1dq, svst1q_scatter) (svst1wq): New function definitions. * config/aarch64/aarch64-sve-builtins.cc (TYPES_hsd_data) (hsd_data, s_data): New type lists. 
(function_resolver::infer_pointer_type): Give a specific error about passing a pointer to 8-bit elements to an _index function. (function_resolver::resolve_sv_displacement): Check whether the function allows 32-bit bases. * config/aarch64/iterators.md (UNSPEC_TBLQ, UNSPEC_TBXQ): New unspecs. (UNSPEC_ADDQV, UNSPEC_ANDQV, UNSPEC_DUPQ, UNSPEC_EORQV, UNSPEC_EXTQ) (UNSPEC_FADDQV, UNSPEC_FMAXQV, UNSPEC_FMAXNMQV, UNSPEC_FMINQV) (UNSPEC_FMINNMQV, UNSPEC_LD1_EXTENDQ, UNSPEC_LD1Q_GATHER): Likewise. (UNSPEC_LDNQ, UNSPEC_ORQV, UNSPEC_PMOV_PACK, UNSPEC_PMOV_PACK_LANE) (UNSPEC_PMOV_UNPACK, UNSPEC_PMOV_UNPACK_LANE, UNSPEC_SMAXQV): Likewise. (UNSPEC_SMINQV, UNSPEC_ST1_TRUNCQ, UNSPEC_ST1Q_SCATTER, UNSPEC_STNQ) (UNSPEC_UMAXQV, UNSPEC_UMINQV, UNSPEC_UZPQ1, UNSPEC_UZPQ2): Likewise. (UNSPEC_ZIPQ1, UNSPEC_ZIPQ2): Likewise. (Vtype): Handle single-vector SVE modes. (Vendreg): Handle SVE structure modes. (VNxTI, LD1_EXTENDQ_MEM): New mode attributes. (SVE_PERMUTE, SVE_TBL, SVE_TBX): New int iterators. (SVE_INT_REDUCTION_128, SVE_FP_REDUCTION_128): Likewise. (optab): Handle the new SVE2.1 reductions. (perm_insn): Handle the new SVE2.1 permutations. * config/aarch64/aarch64-sve.md (@aarch64_sve_tbl): Generalize to... (@aarch64_sve_): ...this. (@aarch64_sve_): Generalize to... (@aarch64_sve_): ...this. * config/aarch64/aarch64-sve2.md (@aarch64_pmov_to_) (@aarch64_pmov_lane_to_, @aarch64_pmov_from_) (@aarch64_pmov_lane_from_, @aarch64_sve_ld1_extendq) (@aarch64_sve_ldnq, aarch64_gather_ld1q): New patterns. (@aarch64_sve_st1_truncq, @aarch64_sve_stnq): Likewise. (aarch64_scatter_st1q, @aarch64_pred_reduc__): Likewise. (@aarch64_sve_dupq, @aarch64_sve_extq): Likewise. (@aarch64_sve2_tbx): Generalize to... (@aarch64_sve_): ...this. * config/aarch64/aarch64.cc (aarch64_classify_vector_memory_mode): New function. (aarch64_regmode_natural_size): Use it. (aarch64_classify_index): Likewise. (aarch64_classify_address): Likewise. (aarch64_print_address_internal): Likewise. (aarch64_evpc_hvla): New function. (aarch64_expand_vec_perm_const_1): Use it. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_index_1.c, * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_1.c, * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_2.c, * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_3.c, * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_4.c, * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_5.c: Adjust the "did you mean" suggestion. * gcc.target/aarch64/sve/acle/general-c/ld1sh_gather_1.c: Removed. * gcc.target/aarch64/sve/acle/general-c/extq_1.c: New test. * gcc.target/aarch64/sve/acle/general-c/load_gather64_sv_index_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/load_gather64_sv_offset_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/pmov_from_vector_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/pmov_from_vector_lane_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/pmov_to_vector_lane_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/pmov_to_vector_lane_2.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/store_scatter64_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/store_scatter64_index_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/store_scatter64_offset_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/unary_lane_1.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_f32.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/addqv_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/addqv_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/andqv_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/dup_laneq_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/eorqv_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/extq_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u32.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1udq_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1udq_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1udq_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1uwq_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1uwq_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld1uwq_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld2q_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld3q_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ld4q_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxnmqv_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxnmqv_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxnmqv_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/maxqv_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minnmqv_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minnmqv_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minnmqv_f64.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/minqv_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/minqv_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/orqv_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmov_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1dq_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1dq_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1dq_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1wq_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1wq_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st1wq_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st2q_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_f32.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/st3q_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st3q_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/st4q_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tblq_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/tbxq_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq1_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_s16.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/uzpq2_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/uzpq2_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq1_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_bf16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_f16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/zipq2_u8.c: Likewise. * gcc.target/aarch64/sve2/dupq_1.c: Likewise. * gcc.target/aarch64/sve2/extq_1.c: Likewise. * gcc.target/aarch64/sve2/uzpq_1.c: Likewise. * gcc.target/aarch64/sve2/zipq_1.c: Likewise. 
--- gcc/config/aarch64/aarch64-modes.def | 17 +- .../aarch64/aarch64-sve-builtins-base.cc | 16 +- .../aarch64/aarch64-sve-builtins-functions.h | 2 +- .../aarch64/aarch64-sve-builtins-shapes.cc | 287 ++++++++++++++- .../aarch64/aarch64-sve-builtins-shapes.h | 12 + .../aarch64/aarch64-sve-builtins-sve2.cc | 317 ++++++++++++++++- .../aarch64/aarch64-sve-builtins-sve2.def | 42 +++ .../aarch64/aarch64-sve-builtins-sve2.h | 30 ++ gcc/config/aarch64/aarch64-sve-builtins.cc | 19 +- gcc/config/aarch64/aarch64-sve.md | 13 +- gcc/config/aarch64/aarch64-sve2.md | 314 +++++++++++++++- gcc/config/aarch64/aarch64.cc | 147 +++++++- gcc/config/aarch64/iterators.md | 121 ++++++- .../aarch64/sve/acle/general-c/extq_1.c | 77 ++++ .../sve/acle/general-c/ld1sh_gather_1.c | 35 -- .../acle/general-c/load_ext_gather_index_1.c | 2 +- .../acle/general-c/load_ext_gather_offset_1.c | 2 +- .../acle/general-c/load_ext_gather_offset_2.c | 2 +- .../acle/general-c/load_ext_gather_offset_3.c | 2 +- .../acle/general-c/load_ext_gather_offset_4.c | 2 +- .../acle/general-c/load_ext_gather_offset_5.c | 2 +- .../acle/general-c/load_gather64_sv_index_1.c | 57 +++ .../general-c/load_gather64_sv_offset_1.c | 54 +++ .../sve/acle/general-c/pmov_from_vector_1.c | 26 ++ .../acle/general-c/pmov_from_vector_lane_1.c | 41 +++ .../acle/general-c/pmov_to_vector_lane_1.c | 45 +++ .../acle/general-c/pmov_to_vector_lane_2.c | 19 + .../sve/acle/general-c/store_scatter64_1.c | 32 ++ .../acle/general-c/store_scatter64_index_1.c | 59 +++ .../acle/general-c/store_scatter64_offset_1.c | 58 +++ .../aarch64/sve/acle/general-c/unary_lane_1.c | 42 +++ .../aarch64/sve2/acle/asm/addqv_f16.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_f32.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_f64.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_s16.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_s32.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_s64.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_s8.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_u16.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_u32.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_u64.c | 26 ++ .../aarch64/sve2/acle/asm/addqv_u8.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_s16.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_s32.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_s64.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_s8.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_u16.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_u32.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_u64.c | 26 ++ .../aarch64/sve2/acle/asm/andqv_u8.c | 26 ++ .../aarch64/sve2/acle/asm/dup_laneq_bf16.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_f16.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_f32.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_f64.c | 35 ++ .../aarch64/sve2/acle/asm/dup_laneq_s16.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_s32.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_s64.c | 35 ++ .../aarch64/sve2/acle/asm/dup_laneq_s8.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_u16.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_u32.c | 53 +++ .../aarch64/sve2/acle/asm/dup_laneq_u64.c | 35 ++ .../aarch64/sve2/acle/asm/dup_laneq_u8.c | 53 +++ .../aarch64/sve2/acle/asm/eorqv_s16.c | 26 ++ .../aarch64/sve2/acle/asm/eorqv_s32.c | 26 ++ .../aarch64/sve2/acle/asm/eorqv_s64.c | 26 ++ .../aarch64/sve2/acle/asm/eorqv_s8.c | 26 ++ .../aarch64/sve2/acle/asm/eorqv_u16.c | 26 ++ .../aarch64/sve2/acle/asm/eorqv_u32.c | 26 ++ .../aarch64/sve2/acle/asm/eorqv_u64.c | 26 ++ .../aarch64/sve2/acle/asm/eorqv_u8.c | 26 ++ .../aarch64/sve2/acle/asm/extq_bf16.c | 77 ++++ .../aarch64/sve2/acle/asm/extq_f16.c | 77 ++++ 
.../aarch64/sve2/acle/asm/extq_f32.c | 67 ++++ .../aarch64/sve2/acle/asm/extq_f64.c | 47 +++ .../aarch64/sve2/acle/asm/extq_s16.c | 77 ++++ .../aarch64/sve2/acle/asm/extq_s32.c | 67 ++++ .../aarch64/sve2/acle/asm/extq_s64.c | 47 +++ .../aarch64/sve2/acle/asm/extq_s8.c | 77 ++++ .../aarch64/sve2/acle/asm/extq_u16.c | 77 ++++ .../aarch64/sve2/acle/asm/extq_u32.c | 67 ++++ .../aarch64/sve2/acle/asm/extq_u64.c | 47 +++ .../aarch64/sve2/acle/asm/extq_u8.c | 77 ++++ .../aarch64/sve2/acle/asm/ld1q_gather_bf16.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_f16.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_f32.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_f64.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_s16.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_s32.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_s64.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_s8.c | 109 ++++++ .../aarch64/sve2/acle/asm/ld1q_gather_u16.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_u32.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_u64.c | 179 ++++++++++ .../aarch64/sve2/acle/asm/ld1q_gather_u8.c | 109 ++++++ .../aarch64/sve2/acle/asm/ld1udq_f64.c | 163 +++++++++ .../aarch64/sve2/acle/asm/ld1udq_s64.c | 163 +++++++++ .../aarch64/sve2/acle/asm/ld1udq_u64.c | 163 +++++++++ .../aarch64/sve2/acle/asm/ld1uwq_f32.c | 163 +++++++++ .../aarch64/sve2/acle/asm/ld1uwq_s32.c | 163 +++++++++ .../aarch64/sve2/acle/asm/ld1uwq_u32.c | 163 +++++++++ .../aarch64/sve2/acle/asm/ld2q_bf16.c | 234 ++++++++++++ .../aarch64/sve2/acle/asm/ld2q_f16.c | 234 ++++++++++++ .../aarch64/sve2/acle/asm/ld2q_f32.c | 224 ++++++++++++ .../aarch64/sve2/acle/asm/ld2q_f64.c | 214 +++++++++++ .../aarch64/sve2/acle/asm/ld2q_s16.c | 234 ++++++++++++ .../aarch64/sve2/acle/asm/ld2q_s32.c | 224 ++++++++++++ .../aarch64/sve2/acle/asm/ld2q_s64.c | 214 +++++++++++ .../aarch64/sve2/acle/asm/ld2q_s8.c | 244 +++++++++++++ .../aarch64/sve2/acle/asm/ld2q_u16.c | 234 ++++++++++++ .../aarch64/sve2/acle/asm/ld2q_u32.c | 224 ++++++++++++ .../aarch64/sve2/acle/asm/ld2q_u64.c | 214 +++++++++++ .../aarch64/sve2/acle/asm/ld2q_u8.c | 244 +++++++++++++ .../aarch64/sve2/acle/asm/ld3q_bf16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_f16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_f32.c | 271 ++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_f64.c | 261 ++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_s16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_s32.c | 271 ++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_s64.c | 261 ++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_s8.c | 291 +++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_u16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_u32.c | 271 ++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_u64.c | 261 ++++++++++++++ .../aarch64/sve2/acle/asm/ld3q_u8.c | 291 +++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_bf16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_f16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_f32.c | 315 ++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_f64.c | 305 ++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_s16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_s32.c | 315 ++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_s64.c | 305 ++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_s8.c | 335 ++++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_u16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_u32.c | 315 ++++++++++++++++ .../aarch64/sve2/acle/asm/ld4q_u64.c | 305 ++++++++++++++++ 
.../aarch64/sve2/acle/asm/ld4q_u8.c | 335 ++++++++++++++++++ .../aarch64/sve2/acle/asm/maxnmqv_f16.c | 26 ++ .../aarch64/sve2/acle/asm/maxnmqv_f32.c | 26 ++ .../aarch64/sve2/acle/asm/maxnmqv_f64.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_f16.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_f32.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_f64.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_s16.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_s32.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_s64.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_s8.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_u16.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_u32.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_u64.c | 26 ++ .../aarch64/sve2/acle/asm/maxqv_u8.c | 26 ++ .../aarch64/sve2/acle/asm/minnmqv_f16.c | 26 ++ .../aarch64/sve2/acle/asm/minnmqv_f32.c | 26 ++ .../aarch64/sve2/acle/asm/minnmqv_f64.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_f16.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_f32.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_f64.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_s16.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_s32.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_s64.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_s8.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_u16.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_u32.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_u64.c | 26 ++ .../aarch64/sve2/acle/asm/minqv_u8.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_s16.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_s32.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_s64.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_s8.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_u16.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_u32.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_u64.c | 26 ++ .../aarch64/sve2/acle/asm/orqv_u8.c | 26 ++ .../aarch64/sve2/acle/asm/pmov_s16.c | 68 ++++ .../aarch64/sve2/acle/asm/pmov_s32.c | 104 ++++++ .../aarch64/sve2/acle/asm/pmov_s64.c | 104 ++++++ .../aarch64/sve2/acle/asm/pmov_s8.c | 35 ++ .../aarch64/sve2/acle/asm/pmov_u16.c | 68 ++++ .../aarch64/sve2/acle/asm/pmov_u32.c | 104 ++++++ .../aarch64/sve2/acle/asm/pmov_u64.c | 104 ++++++ .../aarch64/sve2/acle/asm/pmov_u8.c | 35 ++ .../aarch64/sve2/acle/asm/st1dq_f64.c | 163 +++++++++ .../aarch64/sve2/acle/asm/st1dq_s64.c | 163 +++++++++ .../aarch64/sve2/acle/asm/st1dq_u64.c | 163 +++++++++ .../aarch64/sve2/acle/asm/st1q_scatter_bf16.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_f16.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_f32.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_f64.c | 152 ++++++++ .../aarch64/sve2/acle/asm/st1q_scatter_s16.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_s32.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_s64.c | 152 ++++++++ .../aarch64/sve2/acle/asm/st1q_scatter_s8.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_u16.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_u32.c | 93 +++++ .../aarch64/sve2/acle/asm/st1q_scatter_u64.c | 152 ++++++++ .../aarch64/sve2/acle/asm/st1q_scatter_u8.c | 93 +++++ .../aarch64/sve2/acle/asm/st1wq_f32.c | 163 +++++++++ .../aarch64/sve2/acle/asm/st1wq_s32.c | 163 +++++++++ .../aarch64/sve2/acle/asm/st1wq_u32.c | 163 +++++++++ .../aarch64/sve2/acle/asm/st2q_bf16.c | 239 +++++++++++++ .../aarch64/sve2/acle/asm/st2q_f16.c | 239 +++++++++++++ .../aarch64/sve2/acle/asm/st2q_f32.c | 229 ++++++++++++ .../aarch64/sve2/acle/asm/st2q_f64.c | 219 ++++++++++++ .../aarch64/sve2/acle/asm/st2q_s16.c | 239 +++++++++++++ .../aarch64/sve2/acle/asm/st2q_s32.c | 229 ++++++++++++ .../aarch64/sve2/acle/asm/st2q_s64.c | 219 ++++++++++++ .../aarch64/sve2/acle/asm/st2q_s8.c | 249 
+++++++++++++ .../aarch64/sve2/acle/asm/st2q_u16.c | 239 +++++++++++++ .../aarch64/sve2/acle/asm/st2q_u32.c | 229 ++++++++++++ .../aarch64/sve2/acle/asm/st2q_u64.c | 219 ++++++++++++ .../aarch64/sve2/acle/asm/st2q_u8.c | 249 +++++++++++++ .../aarch64/sve2/acle/asm/st3q_bf16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/st3q_f16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/st3q_f32.c | 271 ++++++++++++++ .../aarch64/sve2/acle/asm/st3q_f64.c | 261 ++++++++++++++ .../aarch64/sve2/acle/asm/st3q_s16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/st3q_s32.c | 271 ++++++++++++++ .../aarch64/sve2/acle/asm/st3q_s64.c | 261 ++++++++++++++ .../aarch64/sve2/acle/asm/st3q_s8.c | 291 +++++++++++++++ .../aarch64/sve2/acle/asm/st3q_u16.c | 281 +++++++++++++++ .../aarch64/sve2/acle/asm/st3q_u32.c | 271 ++++++++++++++ .../aarch64/sve2/acle/asm/st3q_u64.c | 261 ++++++++++++++ .../aarch64/sve2/acle/asm/st3q_u8.c | 291 +++++++++++++++ .../aarch64/sve2/acle/asm/st4q_bf16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_f16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_f32.c | 315 ++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_f64.c | 305 ++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_s16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_s32.c | 315 ++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_s64.c | 305 ++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_s8.c | 335 ++++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_u16.c | 325 +++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_u32.c | 315 ++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_u64.c | 305 ++++++++++++++++ .../aarch64/sve2/acle/asm/st4q_u8.c | 335 ++++++++++++++++++ .../aarch64/sve2/acle/asm/tblq_bf16.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_f16.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_f32.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_f64.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_s16.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_s32.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_s64.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_s8.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_u16.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_u32.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_u64.c | 35 ++ .../aarch64/sve2/acle/asm/tblq_u8.c | 35 ++ .../aarch64/sve2/acle/asm/tbxq_bf16.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_f16.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_f32.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_f64.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_s16.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_s32.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_s64.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_s8.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_u16.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_u32.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_u64.c | 42 +++ .../aarch64/sve2/acle/asm/tbxq_u8.c | 42 +++ .../aarch64/sve2/acle/asm/uzpq1_bf16.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_f16.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_f32.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_f64.c | 47 +++ .../aarch64/sve2/acle/asm/uzpq1_s16.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_s32.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_s64.c | 47 +++ .../aarch64/sve2/acle/asm/uzpq1_s8.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_u16.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_u32.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq1_u64.c | 47 +++ .../aarch64/sve2/acle/asm/uzpq1_u8.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_bf16.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_f16.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_f32.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_f64.c | 47 +++ .../aarch64/sve2/acle/asm/uzpq2_s16.c | 35 ++ 
.../aarch64/sve2/acle/asm/uzpq2_s32.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_s64.c | 47 +++ .../aarch64/sve2/acle/asm/uzpq2_s8.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_u16.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_u32.c | 35 ++ .../aarch64/sve2/acle/asm/uzpq2_u64.c | 47 +++ .../aarch64/sve2/acle/asm/uzpq2_u8.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_bf16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_f16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_f32.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_f64.c | 47 +++ .../aarch64/sve2/acle/asm/zipq1_s16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_s32.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_s64.c | 47 +++ .../aarch64/sve2/acle/asm/zipq1_s8.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_u16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_u32.c | 35 ++ .../aarch64/sve2/acle/asm/zipq1_u64.c | 47 +++ .../aarch64/sve2/acle/asm/zipq1_u8.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_bf16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_f16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_f32.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_f64.c | 47 +++ .../aarch64/sve2/acle/asm/zipq2_s16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_s32.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_s64.c | 47 +++ .../aarch64/sve2/acle/asm/zipq2_s8.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_u16.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_u32.c | 35 ++ .../aarch64/sve2/acle/asm/zipq2_u64.c | 47 +++ .../aarch64/sve2/acle/asm/zipq2_u8.c | 35 ++ .../gcc.target/aarch64/sve2/dupq_1.c | 162 +++++++++ .../gcc.target/aarch64/sve2/extq_1.c | 128 +++++++ .../gcc.target/aarch64/sve2/uzpq_1.c | 111 ++++++ .../gcc.target/aarch64/sve2/zipq_1.c | 111 ++++++ 310 files changed, 33776 insertions(+), 81 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/extq_1.c delete mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ld1sh_gather_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/load_gather64_sv_index_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/load_gather64_sv_offset_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/pmov_from_vector_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/pmov_from_vector_lane_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/pmov_to_vector_lane_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/pmov_to_vector_lane_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_scatter64_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_scatter64_index_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_scatter64_offset_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_lane_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_u32.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/addqv_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/andqv_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dup_laneq_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/eorqv_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/extq_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_f32.c create mode 
100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1q_gather_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1udq_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1udq_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1udq_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1uwq_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1uwq_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1uwq_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld2q_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld3q_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_s32.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld4q_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxnmqv_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxnmqv_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxnmqv_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxqv_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minnmqv_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minnmqv_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minnmqv_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minqv_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/orqv_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_s8.c 
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmov_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1dq_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1dq_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1dq_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1q_scatter_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1wq_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1wq_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st1wq_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st2q_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_u32.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st3q_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/st4q_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tblq_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbxq_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_s8.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq1_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/uzpq2_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq1_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_bf16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_s16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_s32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_s64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_s8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_u16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_u32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_u64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/zipq2_u8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/dupq_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/extq_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/uzpq_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/zipq_1.c diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def index 813421e1e39..c401d670633 100644 --- a/gcc/config/aarch64/aarch64-modes.def +++ 
b/gcc/config/aarch64/aarch64-modes.def @@ -194,7 +194,7 @@ ADV_SIMD_Q_REG_STRUCT_MODES (4, V4x16, V4x8, V4x4, V4x2) stored in each 128-bit unit. The actual size of the mode depends on command-line flags. - VNx1TI isn't really a native SVE mode, but it can be useful in some + VNx1* aren't really native SVE modes, but they can be useful in some limited situations. */ VECTOR_MODE_WITH_PREFIX (VNx, INT, TI, 1, 1); SVE_MODES (1, VNx16, VNx8, VNx4, VNx2, VNx1) @@ -204,9 +204,10 @@ SVE_MODES (4, VNx64, VNx32, VNx16, VNx8, VNx4) /* Partial SVE vectors: - VNx2QI VNx4QI VNx8QI - VNx2HI VNx4HI - VNx2SI + VNx2QI VNx4QI VNx8QI + VNx2HI VNx4HI + VNx1SI VNx2SI + VNx1DI In memory they occupy contiguous locations, in the same way as fixed-length vectors. E.g. VNx8QImode is half the size of VNx16QImode. @@ -214,12 +215,17 @@ SVE_MODES (4, VNx64, VNx32, VNx16, VNx8, VNx4) Passing 2 as the final argument ensures that the modes come after all other single-vector modes in the GET_MODE_WIDER chain, so that we never pick them in preference to a full vector mode. */ +VECTOR_MODE_WITH_PREFIX (VNx, INT, SI, 1, 2); +VECTOR_MODE_WITH_PREFIX (VNx, INT, DI, 1, 2); VECTOR_MODES_WITH_PREFIX (VNx, INT, 2, 2); VECTOR_MODES_WITH_PREFIX (VNx, INT, 4, 2); VECTOR_MODES_WITH_PREFIX (VNx, INT, 8, 2); VECTOR_MODES_WITH_PREFIX (VNx, FLOAT, 4, 2); VECTOR_MODES_WITH_PREFIX (VNx, FLOAT, 8, 2); +ADJUST_NUNITS (VNx1SI, exact_div (aarch64_sve_vg, 2)); +ADJUST_NUNITS (VNx1DI, exact_div (aarch64_sve_vg, 2)); + ADJUST_NUNITS (VNx2QI, aarch64_sve_vg); ADJUST_NUNITS (VNx2HI, aarch64_sve_vg); ADJUST_NUNITS (VNx2SI, aarch64_sve_vg); @@ -245,9 +251,12 @@ ADJUST_ALIGNMENT (VNx2BF, 2); ADJUST_ALIGNMENT (VNx4HF, 2); ADJUST_ALIGNMENT (VNx4BF, 2); +ADJUST_ALIGNMENT (VNx1SI, 4); ADJUST_ALIGNMENT (VNx2SI, 4); ADJUST_ALIGNMENT (VNx2SF, 4); +ADJUST_ALIGNMENT (VNx1DI, 8); + /* Quad float: 128-bit floating mode for long doubles. */ FLOAT_MODE (TF, 16, ieee_quad_format); diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 1c9f515a52c..2117eceb606 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -956,7 +956,8 @@ public: return e.use_exact_insn (code_for_aarch64_sve_dup_lane (mode)); /* Treat svdup_lane as if it were svtbl_n. */ - return e.use_exact_insn (code_for_aarch64_sve_tbl (e.vector_mode (0))); + return e.use_exact_insn (code_for_aarch64_sve (UNSPEC_TBL, + e.vector_mode (0))); } }; @@ -2897,16 +2898,6 @@ public: } }; -class svtbl_impl : public permute -{ -public: - rtx - expand (function_expander &e) const override - { - return e.use_exact_insn (code_for_aarch64_sve_tbl (e.vector_mode (0))); - } -}; - /* Implements svtrn1 and svtrn2. 
*/ class svtrn_impl : public binary_permute { @@ -3432,7 +3423,8 @@ FUNCTION (svsub, svsub_impl,) FUNCTION (svsubr, rtx_code_function_rotated, (MINUS, MINUS, UNSPEC_COND_FSUB)) FUNCTION (svsudot, svusdot_impl, (true)) FUNCTION (svsudot_lane, svdotprod_lane_impl, (UNSPEC_SUDOT, -1, -1)) -FUNCTION (svtbl, svtbl_impl,) +FUNCTION (svtbl, quiet, (UNSPEC_TBL, UNSPEC_TBL, + UNSPEC_TBL)) FUNCTION (svtmad, CODE_FOR_MODE0 (aarch64_sve_tmad),) FUNCTION (svtrn1, svtrn_impl, (0)) FUNCTION (svtrn1q, unspec_based_function, (UNSPEC_TRN1Q, UNSPEC_TRN1Q, diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h index 7d06a57ff83..08443ebd5bb 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h @@ -600,7 +600,7 @@ public: tree perm_type = build_vector_type (ssizetype, nelts); return gimple_build_assign (f.lhs, VEC_PERM_EXPR, gimple_call_arg (f.call, 0), - gimple_call_arg (f.call, nargs - 1), + gimple_call_arg (f.call, nargs == 1 ? 0 : 1), vec_perm_indices_to_tree (perm_type, indices)); } }; diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 62277afaeff..1088fbaa676 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -735,7 +735,7 @@ struct binary_za_slice_opt_single_base : public overloaded_base<1> } }; -/* Base class for ext. */ +/* Base class for ext and extq. */ struct ext_base : public overloaded_base<0> { void @@ -850,6 +850,22 @@ struct load_gather_sv_base : public overloaded_base<0> } }; +/* Base class for load_gather64_sv_index and load_gather64_sv_offset. */ +struct load_gather64_sv_base : public load_gather_sv_base +{ + type_suffix_index + vector_base_type (type_suffix_index) const override + { + return TYPE_SUFFIX_u64; + } + + function_resolver::target_type_restrictions + get_target_type_restrictions (const function_instance &) const override + { + return function_resolver::TARGET_ANY; + } +}; + /* Base class for load_ext_gather_index and load_ext_gather_offset, which differ only in the units of the displacement. */ struct load_ext_gather_base : public overloaded_base<1> @@ -1033,6 +1049,22 @@ struct store_scatter_base : public overloaded_base<0> } }; +/* Base class for store_scatter64_index and store_scatter64_offset. */ +struct store_scatter64_base : public store_scatter_base +{ + type_suffix_index + vector_base_type (type_suffix_index) const override + { + return TYPE_SUFFIX_u64; + } + + type_suffix_index + infer_vector_type (function_resolver &r, unsigned int argno) const override + { + return r.infer_vector_type (argno); + } +}; + /* Base class for ternary operations in which the final argument is an immediate shift amount. The derived class should check the range. */ struct ternary_shift_imm_base : public overloaded_base<0> @@ -2441,6 +2473,21 @@ struct ext_def : public ext_base }; SHAPE (ext) +/* sv_t svfoo[_t0](sv_t, sv_t, uint64_t) + + where the final argument is an integer constant expression that when + multiplied by the number of bytes in t0 is in the range [0, 15]. */ +struct extq_def : public ext_base +{ + bool + check (function_checker &c) const override + { + unsigned int bytes = c.type_suffix (0).element_bytes; + return c.require_immediate_range (2, 0, 16 / bytes - 1); + } +}; +SHAPE (extq) + /* svboolx_t svfoo_t0_g(sv_t, sv_t, uint32_t). 
*/ struct extract_pred_def : public nonoverloaded_base { @@ -2992,6 +3039,75 @@ struct load_gather_vs_def : public overloaded_base<1> }; SHAPE (load_gather_vs) +/* sv_t svfoo_[s64]index[_t0](const _t *, svint64_t) + sv_t svfoo_[u64]index[_t0](const _t *, svuint64_t). */ +struct load_gather64_sv_index_def : public load_gather64_sv_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_index); + build_all (b, "t0,al,d", group, MODE_s64index); + build_all (b, "t0,al,d", group, MODE_u64index); + } +}; +SHAPE (load_gather64_sv_index) + +/* sv_t svfoo_[s64]offset[_t0](const _t *, svint64_t) + sv_t svfoo_[u64]offset[_t0](const _t *, svuint64_t). */ +struct load_gather64_sv_offset_def : public load_gather64_sv_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_offset); + build_all (b, "t0,al,d", group, MODE_s64offset); + build_all (b, "t0,al,d", group, MODE_u64offset); + } +}; +SHAPE (load_gather64_sv_offset) + +/* sv_t svfoo[_u64base]_index_t0(svuint64_t, int64_t). */ +struct load_gather64_vs_index_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "t0,b,ss64", group, MODE_u64base_index, true); + } + + tree + resolve (function_resolver &) const override + { + /* The short name just makes the base vector mode implicit; + no resolution is needed. */ + gcc_unreachable (); + } +}; +SHAPE (load_gather64_vs_index) + +/* sv_t svfoo[_u64base]_t0(svuint64_t) + + sv_t svfoo[_u64base]_offset_t0(svuint64_t, int64_t). */ +struct load_gather64_vs_offset_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "t0,b", group, MODE_u64base, true); + build_all (b, "t0,b,ss64", group, MODE_u64base_offset, true); + } + + tree + resolve (function_resolver &) const override + { + /* The short name just makes the base vector mode implicit; + no resolution is needed. */ + gcc_unreachable (); + } +}; +SHAPE (load_gather64_vs_offset) + /* sv_t svfoo[_t0](const _t *) The only difference from "load" is that this shape has no vnum form. */ @@ -3044,6 +3160,92 @@ struct pattern_pred_def : public nonoverloaded_base }; SHAPE (pattern_pred) +/* svbool_t svfoo[_t0](sv_t). */ +struct pmov_from_vector_def : public overloaded_base<0> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "vp,v0", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + return r.resolve_uniform (1); + } +}; +SHAPE (pmov_from_vector) + +/* svbool_t svfoo[_t0](sv_t, uint64_t) + + where the final argument is an integer constant expression in the + range [0, sizeof (_t) - 1]. 
*/ +struct pmov_from_vector_lane_def : public overloaded_base<0> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "vp,v0,su64", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + return r.resolve_uniform (1, 1); + } + + bool + check (function_checker &c) const override + { + unsigned int bytes = c.type_suffix (0).element_bytes; + return c.require_immediate_range (1, 0, bytes - 1); + } +}; +SHAPE (pmov_from_vector_lane) + +/* sv_t svfoo_t0(uint64_t) + + where the final argument is an integer constant expression in the + range [1, sizeof (_t) - 1]. */ +struct pmov_to_vector_lane_def : public overloaded_base<0> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "v0,su64", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + gcc_assert (r.pred == PRED_m); + if (!r.check_num_arguments (3) + || (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES + || !r.require_vector_type (1, VECTOR_TYPE_svbool_t) + || !r.require_integer_immediate (2)) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, type); + } + + bool + check (function_checker &c) const override + { + unsigned int bytes = c.type_suffix (0).element_bytes; + /* 1 to account for the vector argument. + + ??? This should probably be folded into function_checker::m_base_arg, + but it doesn't currently have the necessary information. */ + return c.require_immediate_range (1, 1, bytes - 1); + } +}; +SHAPE (pmov_to_vector_lane) + /* void svfoo(const void *, svprfop) void svfoo_vnum(const void *, int64_t, svprfop). */ struct prefetch_def : public nonoverloaded_base @@ -3215,6 +3417,24 @@ struct reduction_def : public overloaded_base<0> }; SHAPE (reduction) +/* xN_t svfoo[_t0](sv_t). */ +struct reduction_neonq_def : public overloaded_base<0> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "Q0,v0", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + return r.resolve_uniform (1); + } +}; +SHAPE (reduction_neonq) + /* int64_t svfoo[_t0](sv_t) (for signed t0) uint64_t svfoo[_t0](sv_t) (for unsigned t0) _t svfoo[_t0](sv_t) (for floating-point t0) @@ -3612,6 +3832,44 @@ struct store_scatter_offset_restricted_def : public store_scatter_base }; SHAPE (store_scatter_offset_restricted) +/* void svfoo_[s64]index[_t0](_t *, svint64_t, sv_t) + void svfoo_[u64]index[_t0](_t *, svuint64_t, sv_t) + + void svfoo[_u64base]_index[_t0](svuint64_t, int64_t, sv_t). */ +struct store_scatter64_index_def : public store_scatter64_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_index); + build_all (b, "_,as,d,t0", group, MODE_s64index); + build_all (b, "_,as,d,t0", group, MODE_u64index); + build_all (b, "_,b,ss64,t0", group, MODE_u64base_index); + } +}; +SHAPE (store_scatter64_index) + +/* void svfoo_[s64]offset[_t0](_t *, svint64_t, sv_t) + void svfoo_[u64]offset[_t0](_t *, svuint64_t, sv_t) + + void svfoo[_u64base_t0](svuint64_t, sv_t) + + void svfoo[_u64base]_offset[_t0](svuint64_t, int64_t, sv_t). 
*/ +struct store_scatter64_offset_def : public store_scatter64_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + b.add_overloaded_functions (group, MODE_offset); + build_all (b, "_,as,d,t0", group, MODE_s64offset); + build_all (b, "_,as,d,t0", group, MODE_u64offset); + build_all (b, "_,b,t0", group, MODE_u64base); + build_all (b, "_,b,ss64,t0", group, MODE_u64base_offset); + } +}; +SHAPE (store_scatter64_offset) + /* void svfoo_t0(uint64_t, uint32_t, svbool_t, void *) void svfoo_vnum_t0(uint64_t, uint32_t, svbool_t, void *, int64_t) @@ -4365,6 +4623,33 @@ struct unary_convertxn_def : public unary_convert_def }; SHAPE (unary_convertxn) +/* sv_t svfoo_(sv_t, uint64_t) + + where the final argument is an integer constant expression in the + range [0, 16 / sizeof (_t) - 1]. */ +struct unary_lane_def : public overloaded_base<0> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "v0,v0,su64", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + return r.resolve_uniform (1, 1); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_lane_index (1, 0); + } +}; +SHAPE (unary_lane) + /* sv_t svfoo[_t0](sv_t). */ struct unary_long_def : public overloaded_base<0> { diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h index ea87240518d..12ef2c99238 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h @@ -128,6 +128,7 @@ namespace aarch64_sve extern const function_shape *const dupq; extern const function_shape *const dup_neonq; extern const function_shape *const ext; + extern const function_shape *const extq; extern const function_shape *const extract_pred; extern const function_shape *const fold_left; extern const function_shape *const get; @@ -152,12 +153,19 @@ namespace aarch64_sve extern const function_shape *const load_gather_sv; extern const function_shape *const load_gather_sv_restricted; extern const function_shape *const load_gather_vs; + extern const function_shape *const load_gather64_sv_index; + extern const function_shape *const load_gather64_sv_offset; + extern const function_shape *const load_gather64_vs_index; + extern const function_shape *const load_gather64_vs_offset; extern const function_shape *const load_replicate; extern const function_shape *const load_za; extern const function_shape *const luti2_lane_zt; extern const function_shape *const luti4_lane_zt; extern const function_shape *const mmla; extern const function_shape *const pattern_pred; + extern const function_shape *const pmov_from_vector; + extern const function_shape *const pmov_from_vector_lane; + extern const function_shape *const pmov_to_vector_lane; extern const function_shape *const prefetch; extern const function_shape *const prefetch_gather_index; extern const function_shape *const prefetch_gather_offset; @@ -167,6 +175,7 @@ namespace aarch64_sve extern const function_shape *const read_za_m; extern const function_shape *const read_za_slice; extern const function_shape *const reduction; + extern const function_shape *const reduction_neonq; extern const function_shape *const reduction_wide; extern const function_shape *const reinterpret; extern const function_shape *const select_pred; @@ -186,6 +195,8 @@ namespace aarch64_sve 
extern const function_shape *const store_scatter_index_restricted; extern const function_shape *const store_scatter_offset; extern const function_shape *const store_scatter_offset_restricted; + extern const function_shape *const store_scatter64_index; + extern const function_shape *const store_scatter64_offset; extern const function_shape *const store_za; extern const function_shape *const storexn; extern const function_shape *const str_za; @@ -218,6 +229,7 @@ namespace aarch64_sve extern const function_shape *const unary_convert; extern const function_shape *const unary_convert_narrowt; extern const function_shape *const unary_convertxn; + extern const function_shape *const unary_lane; extern const function_shape *const unary_long; extern const function_shape *const unary_n; extern const function_shape *const unary_narrowb; diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc index 24e95afd6eb..fd0c98c6b68 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc @@ -78,6 +78,44 @@ unspec_sqrdcmlah (int rot) } } +class ld1uxq_st1xq_base : public function_base +{ +public: + CONSTEXPR ld1uxq_st1xq_base (machine_mode memory_mode) + : m_memory_mode (memory_mode) {} + + tree + memory_scalar_type (const function_instance &fi) const override + { + return fi.scalar_type (0); + } + + machine_mode + memory_vector_mode (const function_instance &) const override + { + return m_memory_mode; + } + +protected: + machine_mode m_memory_mode; +}; + +class ld234q_st234q_base : public full_width_access +{ +public: + CONSTEXPR ld234q_st234q_base (unsigned int vector_count, machine_mode mode) + : full_width_access (vector_count), m_mode (mode) + {} + + machine_mode + memory_vector_mode (const function_instance &) const override + { + return m_mode; + } + + machine_mode m_mode; +}; + class svaba_impl : public function_base { public: @@ -183,6 +221,100 @@ public: } }; +class svdup_laneq_impl : public function_base +{ +public: + rtx + expand (function_expander &e) const override + { + return e.use_exact_insn (code_for_aarch64_sve_dupq (e.result_mode ())); + } +}; + +class svextq_impl : public permute +{ +public: + gimple * + fold (gimple_folder &f) const override + { + unsigned int index = tree_to_uhwi (gimple_call_arg (f.call, 2)); + machine_mode mode = f.vector_mode (0); + unsigned int subelts = 128U / GET_MODE_UNIT_BITSIZE (mode); + poly_uint64 nelts = GET_MODE_NUNITS (mode); + vec_perm_builder builder (nelts, subelts, 3); + for (unsigned int i = 0; i < 3; ++i) + for (unsigned int j = 0; j < subelts; ++j) + { + if (index + j < subelts) + builder.quick_push (i * subelts + index + j); + else + builder.quick_push (i * subelts + index + j - subelts + nelts); + } + return fold_permute (f, builder); + } + + rtx + expand (function_expander &e) const override + { + return e.use_exact_insn (code_for_aarch64_sve_extq (e.vector_mode (0))); + } +}; + +class svld1q_gather_impl : public full_width_access +{ +public: + unsigned int + call_properties (const function_instance &) const override + { + return CP_READ_MEMORY; + } + + rtx + expand (function_expander &e) const override + { + e.prepare_gather_address_operands (1, false); + return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q); + } +}; + +class svld1uxq_impl : public ld1uxq_st1xq_base +{ +public: + using ld1uxq_st1xq_base::ld1uxq_st1xq_base; + + unsigned int + call_properties (const function_instance &) const override + { + return CP_READ_MEMORY; + } + + 
rtx + expand (function_expander &e) const override + { + insn_code icode = code_for_aarch64_sve_ld1_extendq (e.vector_mode (0)); + return e.use_contiguous_load_insn (icode); + } +}; + +class svld234q_impl : public ld234q_st234q_base +{ +public: + using ld234q_st234q_base::ld234q_st234q_base; + + unsigned int + call_properties (const function_instance &) const override + { + return CP_READ_MEMORY; + } + + rtx + expand (function_expander &e) const override + { + insn_code icode = code_for_aarch64_sve_ldnq (e.result_mode ()); + return e.use_contiguous_load_insn (icode); + } +}; + class svldnt1_gather_impl : public full_width_access { public: @@ -268,6 +400,38 @@ public: } }; +class svpmov_impl : public function_base +{ +public: + rtx + expand (function_expander &e) const override + { + insn_code icode; + if (e.pred == PRED_z) + icode = code_for_aarch64_pmov_to (e.vector_mode (0)); + else + icode = code_for_aarch64_pmov_from (e.vector_mode (0)); + return e.use_exact_insn (icode); + } +}; + +class svpmov_lane_impl : public function_base +{ +public: + rtx + expand (function_expander &e) const override + { + insn_code icode; + if (e.pred == PRED_m) + icode = code_for_aarch64_pmov_lane_to (e.vector_mode (0)); + else if (e.args[1] == const0_rtx) + icode = code_for_aarch64_pmov_from (e.vector_mode (0)); + else + icode = code_for_aarch64_pmov_lane_from (e.vector_mode (0)); + return e.use_exact_insn (icode); + } +}; + class svpsel_lane_impl : public function_base { public: @@ -479,7 +643,7 @@ public: gimple_call_set_arg (call, 2, imm3_prec); return call; } -public: + rtx expand (function_expander &e) const override { @@ -489,6 +653,64 @@ public: } }; +class svst1q_scatter_impl : public full_width_access +{ +public: + unsigned int + call_properties (const function_instance &) const override + { + return CP_WRITE_MEMORY; + } + + rtx + expand (function_expander &e) const override + { + rtx data = e.args.last (); + e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data)); + e.prepare_gather_address_operands (1, false); + return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q); + } +}; + +class svst1xq_impl : public ld1uxq_st1xq_base +{ +public: + using ld1uxq_st1xq_base::ld1uxq_st1xq_base; + + unsigned int + call_properties (const function_instance &) const override + { + return CP_WRITE_MEMORY; + } + + rtx + expand (function_expander &e) const override + { + insn_code icode = code_for_aarch64_sve_st1_truncq (e.vector_mode (0)); + return e.use_contiguous_store_insn (icode); + } +}; + +class svst234q_impl : public ld234q_st234q_base +{ +public: + using ld234q_st234q_base::ld234q_st234q_base; + + unsigned int + call_properties (const function_instance &) const override + { + return CP_WRITE_MEMORY; + } + + rtx + expand (function_expander &e) const override + { + machine_mode tuple_mode = GET_MODE (e.args.last ()); + insn_code icode = code_for_aarch64_sve_stnq (tuple_mode); + return e.use_contiguous_store_insn (icode); + } +}; + class svstnt1_scatter_impl : public full_width_access { public: @@ -562,6 +784,34 @@ public: } }; +/* Implements svuzpq1 and svuzpq2. */ +class svuzpq_impl : public binary_permute +{ +public: + CONSTEXPR svuzpq_impl (unsigned int base) + : binary_permute (base ? 
UNSPEC_UZPQ2 : UNSPEC_UZPQ1), m_base (base) {} + + gimple * + fold (gimple_folder &f) const override + { + machine_mode mode = f.vector_mode (0); + unsigned int subelts = 128U / GET_MODE_UNIT_BITSIZE (mode); + poly_uint64 nelts = GET_MODE_NUNITS (mode); + vec_perm_builder builder (nelts, subelts, 3); + for (unsigned int i = 0; i < 3; ++i) + { + for (unsigned int j = 0; j < subelts / 2; ++j) + builder.quick_push (m_base + j * 2 + i * subelts); + for (unsigned int j = 0; j < subelts / 2; ++j) + builder.quick_push (m_base + j * 2 + i * subelts + nelts); + } + return fold_permute (f, builder); + } + + /* 0 for svuzpq1, 1 for svuzpq2. */ + unsigned int m_base; +}; + /* Implements both svwhilerw and svwhilewr; the unspec parameter decides between them. */ class svwhilerw_svwhilewr_impl : public full_width_access @@ -580,6 +830,34 @@ public: int m_unspec; }; +/* Implements svzipq1 and svzipq2. */ +class svzipq_impl : public binary_permute +{ +public: + CONSTEXPR svzipq_impl (unsigned int base) + : binary_permute (base ? UNSPEC_ZIPQ2 : UNSPEC_ZIPQ1), m_base (base) {} + + gimple * + fold (gimple_folder &f) const override + { + machine_mode mode = f.vector_mode (0); + unsigned int pairs = 64U / GET_MODE_UNIT_BITSIZE (mode); + poly_uint64 nelts = GET_MODE_NUNITS (mode); + auto base = m_base * pairs; + vec_perm_builder builder (nelts, pairs * 2, 3); + for (unsigned int i = 0; i < 3; ++i) + for (unsigned int j = 0; j < pairs; ++j) + { + builder.quick_push (base + j + i * pairs * 2); + builder.quick_push (base + j + i * pairs * 2 + nelts); + } + return fold_permute (f, builder); + } + + /* 0 for svzipq1, 1 for svzipq2. */ + unsigned int m_base; +}; + } /* end anonymous namespace */ namespace aarch64_sve { @@ -601,6 +879,7 @@ FUNCTION (svaddlbt, unspec_based_function, (UNSPEC_SADDLBT, -1, -1)) FUNCTION (svaddlt, unspec_based_function, (UNSPEC_SADDLT, UNSPEC_UADDLT, -1)) FUNCTION (svaddp, unspec_based_pred_function, (UNSPEC_ADDP, UNSPEC_ADDP, UNSPEC_FADDP)) +FUNCTION (svaddqv, reduction, (UNSPEC_ADDQV, UNSPEC_ADDQV, UNSPEC_FADDQV)) FUNCTION (svaddwb, unspec_based_function, (UNSPEC_SADDWB, UNSPEC_UADDWB, -1)) FUNCTION (svaddwt, unspec_based_function, (UNSPEC_SADDWT, UNSPEC_UADDWT, -1)) FUNCTION (svaesd, fixed_insn_function, (CODE_FOR_aarch64_sve2_aesd)) @@ -611,6 +890,7 @@ FUNCTION (svamax, cond_or_uncond_unspec_function, (UNSPEC_COND_FAMAX, UNSPEC_FAMAX)) FUNCTION (svamin, cond_or_uncond_unspec_function, (UNSPEC_COND_FAMIN, UNSPEC_FAMIN)) +FUNCTION (svandqv, reduction, (UNSPEC_ANDQV, UNSPEC_ANDQV, -1)) FUNCTION (svbcax, CODE_FOR_MODE0 (aarch64_sve2_bcax),) FUNCTION (svbdep, unspec_based_function, (UNSPEC_BDEP, UNSPEC_BDEP, -1)) FUNCTION (svbext, unspec_based_function, (UNSPEC_BEXT, UNSPEC_BEXT, -1)) @@ -631,15 +911,24 @@ FUNCTION (svcvtlt, unspec_based_function, (-1, -1, UNSPEC_COND_FCVTLT)) FUNCTION (svcvtn, svcvtn_impl,) FUNCTION (svcvtx, unspec_based_function, (-1, -1, UNSPEC_COND_FCVTX)) FUNCTION (svcvtxnt, CODE_FOR_MODE1 (aarch64_sve2_cvtxnt),) +FUNCTION (svdup_laneq, svdup_laneq_impl,) FUNCTION (sveor3, CODE_FOR_MODE0 (aarch64_sve2_eor3),) FUNCTION (sveorbt, unspec_based_function, (UNSPEC_EORBT, UNSPEC_EORBT, -1)) +FUNCTION (sveorqv, reduction, (UNSPEC_EORQV, UNSPEC_EORQV, -1)) FUNCTION (sveortb, unspec_based_function, (UNSPEC_EORTB, UNSPEC_EORTB, -1)) +FUNCTION (svextq, svextq_impl,) FUNCTION (svhadd, unspec_based_function, (UNSPEC_SHADD, UNSPEC_UHADD, -1)) FUNCTION (svhsub, unspec_based_function, (UNSPEC_SHSUB, UNSPEC_UHSUB, -1)) FUNCTION (svhistcnt, CODE_FOR_MODE0 (aarch64_sve2_histcnt),) 
FUNCTION (svhistseg, CODE_FOR_MODE0 (aarch64_sve2_histseg),) FUNCTION (svhsubr, unspec_based_function_rotated, (UNSPEC_SHSUB, UNSPEC_UHSUB, -1)) +FUNCTION (svld1q_gather, svld1q_gather_impl,) +FUNCTION (svld1udq, svld1uxq_impl, (VNx1DImode)) +FUNCTION (svld1uwq, svld1uxq_impl, (VNx1SImode)) +FUNCTION (svld2q, svld234q_impl, (2, VNx2TImode)) +FUNCTION (svld3q, svld234q_impl, (3, VNx3TImode)) +FUNCTION (svld4q, svld234q_impl, (4, VNx4TImode)) FUNCTION (svldnt1_gather, svldnt1_gather_impl,) FUNCTION (svldnt1sb_gather, svldnt1_gather_extend_impl, (TYPE_SUFFIX_s8)) FUNCTION (svldnt1sh_gather, svldnt1_gather_extend_impl, (TYPE_SUFFIX_s16)) @@ -650,11 +939,15 @@ FUNCTION (svldnt1uw_gather, svldnt1_gather_extend_impl, (TYPE_SUFFIX_u32)) FUNCTION (svlogb, unspec_based_function, (-1, -1, UNSPEC_COND_FLOGB)) FUNCTION (svmatch, svmatch_svnmatch_impl, (UNSPEC_MATCH)) FUNCTION (svmaxnmp, unspec_based_pred_function, (-1, -1, UNSPEC_FMAXNMP)) +FUNCTION (svmaxnmqv, reduction, (-1, -1, UNSPEC_FMAXNMQV)) FUNCTION (svmaxp, unspec_based_pred_function, (UNSPEC_SMAXP, UNSPEC_UMAXP, UNSPEC_FMAXP)) +FUNCTION (svmaxqv, reduction, (UNSPEC_SMAXQV, UNSPEC_UMAXQV, UNSPEC_FMAXQV)) FUNCTION (svminnmp, unspec_based_pred_function, (-1, -1, UNSPEC_FMINNMP)) +FUNCTION (svminnmqv, reduction, (-1, -1, UNSPEC_FMINNMQV)) FUNCTION (svminp, unspec_based_pred_function, (UNSPEC_SMINP, UNSPEC_UMINP, UNSPEC_FMINP)) +FUNCTION (svminqv, reduction, (UNSPEC_SMINQV, UNSPEC_UMINQV, UNSPEC_FMINQV)) FUNCTION (svmlalb, unspec_based_mla_function, (UNSPEC_SMULLB, UNSPEC_UMULLB, UNSPEC_FMLALB)) FUNCTION (svmlalb_lane, unspec_based_mla_lane_function, (UNSPEC_SMULLB, @@ -685,7 +978,10 @@ FUNCTION (svmullt_lane, unspec_based_lane_function, (UNSPEC_SMULLT, UNSPEC_UMULLT, -1)) FUNCTION (svnbsl, CODE_FOR_MODE0 (aarch64_sve2_nbsl),) FUNCTION (svnmatch, svmatch_svnmatch_impl, (UNSPEC_NMATCH)) +FUNCTION (svorqv, reduction, (UNSPEC_ORQV, UNSPEC_ORQV, -1)) FUNCTION (svpext_lane, svpext_lane_impl,) +FUNCTION (svpmov, svpmov_impl,) +FUNCTION (svpmov_lane, svpmov_lane_impl,) FUNCTION (svpmul, CODE_FOR_MODE0 (aarch64_sve2_pmul),) FUNCTION (svpmullb, unspec_based_function, (-1, UNSPEC_PMULLB, -1)) FUNCTION (svpmullb_pair, unspec_based_function, (-1, UNSPEC_PMULLB_PAIR, -1)) @@ -787,6 +1083,12 @@ FUNCTION (svsm4ekey, fixed_insn_function, (CODE_FOR_aarch64_sve2_sm4ekey)) FUNCTION (svsqadd, svsqadd_impl,) FUNCTION (svsra, svsra_impl,) FUNCTION (svsri, unspec_based_function, (UNSPEC_SRI, UNSPEC_SRI, -1)) +FUNCTION (svst1dq, svst1xq_impl, (VNx1DImode)) +FUNCTION (svst1q_scatter, svst1q_scatter_impl,) +FUNCTION (svst1wq, svst1xq_impl, (VNx1SImode)) +FUNCTION (svst2q, svst234q_impl, (2, VNx2TImode)) +FUNCTION (svst3q, svst234q_impl, (3, VNx3TImode)) +FUNCTION (svst4q, svst234q_impl, (4, VNx4TImode)) FUNCTION (svstnt1_scatter, svstnt1_scatter_impl,) FUNCTION (svstnt1b_scatter, svstnt1_scatter_truncate_impl, (QImode)) FUNCTION (svstnt1h_scatter, svstnt1_scatter_truncate_impl, (HImode)) @@ -800,11 +1102,20 @@ FUNCTION (svsubltb, unspec_based_function, (UNSPEC_SSUBLTB, -1, -1)) FUNCTION (svsubwb, unspec_based_function, (UNSPEC_SSUBWB, UNSPEC_USUBWB, -1)) FUNCTION (svsubwt, unspec_based_function, (UNSPEC_SSUBWT, UNSPEC_USUBWT, -1)) FUNCTION (svtbl2, svtbl2_impl,) -FUNCTION (svtbx, CODE_FOR_MODE0 (aarch64_sve2_tbx),) +FUNCTION (svtblq, quiet, (UNSPEC_TBLQ, + UNSPEC_TBLQ, + UNSPEC_TBLQ)) +FUNCTION (svtbx, quiet, (UNSPEC_TBX, UNSPEC_TBX, + UNSPEC_TBX)) +FUNCTION (svtbxq, quiet, (UNSPEC_TBXQ, + UNSPEC_TBXQ, + UNSPEC_TBXQ)) FUNCTION (svunpk, svunpk_impl,) FUNCTION (svuqadd, 
svuqadd_impl,) FUNCTION (svuzp, multireg_permute, (UNSPEC_UZP)) FUNCTION (svuzpq, multireg_permute, (UNSPEC_UZPQ)) +FUNCTION (svuzpq1, svuzpq_impl, (0)) +FUNCTION (svuzpq2, svuzpq_impl, (1)) FUNCTION (svwhilege, while_comparison, (UNSPEC_WHILEGE, UNSPEC_WHILEHS)) FUNCTION (svwhilegt, while_comparison, (UNSPEC_WHILEGT, UNSPEC_WHILEHI)) FUNCTION (svwhilerw, svwhilerw_svwhilewr_impl, (UNSPEC_WHILERW)) @@ -812,5 +1123,7 @@ FUNCTION (svwhilewr, svwhilerw_svwhilewr_impl, (UNSPEC_WHILEWR)) FUNCTION (svxar, svxar_impl,) FUNCTION (svzip, multireg_permute, (UNSPEC_ZIP)) FUNCTION (svzipq, multireg_permute, (UNSPEC_ZIPQ)) +FUNCTION (svzipq1, svzipq_impl, (0)) +FUNCTION (svzipq2, svzipq_impl, (1)) } /* end namespace aarch64_sve */ diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index 9e8aad957d5..c641ed510ff 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -220,6 +220,35 @@ DEF_SVE_FUNCTION (svsm4e, binary, s_unsigned, none) DEF_SVE_FUNCTION (svsm4ekey, binary, s_unsigned, none) #undef REQUIRED_EXTENSIONS +#define REQUIRED_EXTENSIONS nonstreaming_sve (AARCH64_FL_SVE2p1) +DEF_SVE_FUNCTION (svaddqv, reduction_neonq, all_arith, implicit) +DEF_SVE_FUNCTION (svandqv, reduction_neonq, all_integer, implicit) +DEF_SVE_FUNCTION (svdup_laneq, unary_lane, all_data, none) +DEF_SVE_FUNCTION (sveorqv, reduction_neonq, all_integer, implicit) +DEF_SVE_FUNCTION (svextq, extq, all_data, none) +DEF_SVE_FUNCTION (svld2q, load, all_data, implicit) +DEF_SVE_FUNCTION (svld3q, load, all_data, implicit) +DEF_SVE_FUNCTION (svld4q, load, all_data, implicit) +DEF_SVE_FUNCTION (svmaxnmqv, reduction_neonq, all_float, implicit) +DEF_SVE_FUNCTION (svmaxqv, reduction_neonq, all_arith, implicit) +DEF_SVE_FUNCTION (svminnmqv, reduction_neonq, all_float, implicit) +DEF_SVE_FUNCTION (svminqv, reduction_neonq, all_arith, implicit) +DEF_SVE_FUNCTION (svpmov, pmov_from_vector, all_integer, none) +DEF_SVE_FUNCTION (svpmov, inherent, all_integer, z) +DEF_SVE_FUNCTION (svpmov_lane, pmov_from_vector_lane, all_integer, none) +DEF_SVE_FUNCTION (svpmov_lane, pmov_to_vector_lane, hsd_integer, m) +DEF_SVE_FUNCTION (svorqv, reduction_neonq, all_integer, implicit) +DEF_SVE_FUNCTION (svst2q, store, all_data, implicit) +DEF_SVE_FUNCTION (svst3q, store, all_data, implicit) +DEF_SVE_FUNCTION (svst4q, store, all_data, implicit) +DEF_SVE_FUNCTION (svtblq, binary_uint, all_data, none) +DEF_SVE_FUNCTION (svtbxq, ternary_uint, all_data, none) +DEF_SVE_FUNCTION (svuzpq1, binary, all_data, none) +DEF_SVE_FUNCTION (svuzpq2, binary, all_data, none) +DEF_SVE_FUNCTION (svzipq1, binary, all_data, none) +DEF_SVE_FUNCTION (svzipq2, binary, all_data, none) +#undef REQUIRED_EXTENSIONS + #define REQUIRED_EXTENSIONS sve_and_sme (AARCH64_FL_SVE2p1, 0) DEF_SVE_FUNCTION (svclamp, clamp, all_integer, none) DEF_SVE_FUNCTION (svpsel_lane, select_pred, all_pred_count, none) @@ -254,6 +283,19 @@ DEF_SVE_FUNCTION_GS (svwhilelt, compare_scalar, while_x, x2, none) DEF_SVE_FUNCTION (svwhilelt, compare_scalar_count, while_x_c, none) #undef REQUIRED_EXTENSIONS +#define REQUIRED_EXTENSIONS nonstreaming_sve (AARCH64_FL_SVE2p1) +DEF_SVE_FUNCTION (svld1q_gather, load_gather64_sv_offset, all_data, implicit) +DEF_SVE_FUNCTION (svld1q_gather, load_gather64_sv_index, hsd_data, implicit) +DEF_SVE_FUNCTION (svld1q_gather, load_gather64_vs_offset, all_data, implicit) +DEF_SVE_FUNCTION (svld1q_gather, load_gather64_vs_index, hsd_data, implicit) +DEF_SVE_FUNCTION 
(svld1udq, load, d_data, implicit) +DEF_SVE_FUNCTION (svld1uwq, load, s_data, implicit) +DEF_SVE_FUNCTION (svst1dq, store, d_data, implicit) +DEF_SVE_FUNCTION (svst1q_scatter, store_scatter64_offset, all_data, implicit) +DEF_SVE_FUNCTION (svst1q_scatter, store_scatter64_index, hsd_data, implicit) +DEF_SVE_FUNCTION (svst1wq, store, s_data, implicit) +#undef REQUIRED_EXTENSIONS + #define REQUIRED_EXTENSIONS streaming_only (AARCH64_FL_SME2) DEF_SVE_FUNCTION_GS (svadd, binary_single, all_integer, x24, none) DEF_SVE_FUNCTION_GS (svclamp, clamp, all_arith, x24, none) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h index d58190280a8..bb610cb792b 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h @@ -38,12 +38,14 @@ namespace aarch64_sve extern const function_base *const svaddlbt; extern const function_base *const svaddlt; extern const function_base *const svaddp; + extern const function_base *const svaddqv; extern const function_base *const svaddwb; extern const function_base *const svaddwt; extern const function_base *const svaesd; extern const function_base *const svaese; extern const function_base *const svaesimc; extern const function_base *const svaesmc; + extern const function_base *const svandqv; extern const function_base *const svbcax; extern const function_base *const svbdep; extern const function_base *const svbext; @@ -63,14 +65,23 @@ namespace aarch64_sve extern const function_base *const svcvtn; extern const function_base *const svcvtx; extern const function_base *const svcvtxnt; + extern const function_base *const svdup_laneq; extern const function_base *const sveor3; extern const function_base *const sveorbt; + extern const function_base *const sveorqv; extern const function_base *const sveortb; + extern const function_base *const svextq; extern const function_base *const svhadd; extern const function_base *const svhistcnt; extern const function_base *const svhistseg; extern const function_base *const svhsub; extern const function_base *const svhsubr; + extern const function_base *const svld1q_gather; + extern const function_base *const svld1udq; + extern const function_base *const svld1uwq; + extern const function_base *const svld2q; + extern const function_base *const svld3q; + extern const function_base *const svld4q; extern const function_base *const svldnt1_gather; extern const function_base *const svldnt1sb_gather; extern const function_base *const svldnt1sh_gather; @@ -81,9 +92,13 @@ namespace aarch64_sve extern const function_base *const svlogb; extern const function_base *const svmatch; extern const function_base *const svmaxnmp; + extern const function_base *const svmaxnmqv; extern const function_base *const svmaxp; + extern const function_base *const svmaxqv; extern const function_base *const svminnmp; + extern const function_base *const svminnmqv; extern const function_base *const svminp; + extern const function_base *const svminqv; extern const function_base *const svmlalb; extern const function_base *const svmlalb_lane; extern const function_base *const svmlalt; @@ -100,7 +115,10 @@ namespace aarch64_sve extern const function_base *const svmullt_lane; extern const function_base *const svnbsl; extern const function_base *const svnmatch; + extern const function_base *const svorqv; extern const function_base *const svpext_lane; + extern const function_base *const svpmov; + extern const function_base *const svpmov_lane; extern const function_base 
*const svpmul; extern const function_base *const svpmullb; extern const function_base *const svpmullb_pair; @@ -180,6 +198,12 @@ namespace aarch64_sve extern const function_base *const svsqadd; extern const function_base *const svsra; extern const function_base *const svsri; + extern const function_base *const svst1dq; + extern const function_base *const svst1q_scatter; + extern const function_base *const svst1wq; + extern const function_base *const svst2q; + extern const function_base *const svst3q; + extern const function_base *const svst4q; extern const function_base *const svstnt1_scatter; extern const function_base *const svstnt1b_scatter; extern const function_base *const svstnt1h_scatter; @@ -193,11 +217,15 @@ namespace aarch64_sve extern const function_base *const svsubwb; extern const function_base *const svsubwt; extern const function_base *const svtbl2; + extern const function_base *const svtblq; extern const function_base *const svtbx; + extern const function_base *const svtbxq; extern const function_base *const svunpk; extern const function_base *const svuqadd; extern const function_base *const svuzp; extern const function_base *const svuzpq; + extern const function_base *const svuzpq1; + extern const function_base *const svuzpq2; extern const function_base *const svwhilege; extern const function_base *const svwhilegt; extern const function_base *const svwhilerw; @@ -205,6 +233,8 @@ namespace aarch64_sve extern const function_base *const svxar; extern const function_base *const svzip; extern const function_base *const svzipq; + extern const function_base *const svzipq1; + extern const function_base *const svzipq2; } } diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 259e7b7975c..be6ababde50 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -334,6 +334,11 @@ CONSTEXPR const group_suffix_info group_suffixes[] = { #define TYPES_hsd_integer(S, D) \ TYPES_hsd_signed (S, D), S (u16), S (u32), S (u64) +#define TYPES_hsd_data(S, D) \ + TYPES_h_data (S, D), \ + TYPES_s_data (S, D), \ + TYPES_d_data (S, D) + /* _f32. 
*/ #define TYPES_s_float(S, D) \ S (f32) @@ -742,12 +747,14 @@ DEF_SVE_TYPES_ARRAY (hs_data); DEF_SVE_TYPES_ARRAY (hd_unsigned); DEF_SVE_TYPES_ARRAY (hsd_signed); DEF_SVE_TYPES_ARRAY (hsd_integer); +DEF_SVE_TYPES_ARRAY (hsd_data); DEF_SVE_TYPES_ARRAY (s_float); DEF_SVE_TYPES_ARRAY (s_float_hsd_integer); DEF_SVE_TYPES_ARRAY (s_float_sd_integer); DEF_SVE_TYPES_ARRAY (s_signed); DEF_SVE_TYPES_ARRAY (s_unsigned); DEF_SVE_TYPES_ARRAY (s_integer); +DEF_SVE_TYPES_ARRAY (s_data); DEF_SVE_TYPES_ARRAY (sd_signed); DEF_SVE_TYPES_ARRAY (sd_unsigned); DEF_SVE_TYPES_ARRAY (sd_integer); @@ -2036,6 +2043,15 @@ function_resolver::infer_pointer_type (unsigned int argno, actual, argno + 1, fndecl); return NUM_TYPE_SUFFIXES; } + if (displacement_units () == UNITS_elements && bits == 8) + { + error_at (location, "passing %qT to argument %d of %qE, which" + " expects the data to be 16 bits or wider", + actual, argno + 1, fndecl); + inform (location, "use the % rather than % form" + " for 8-bit data"); + return NUM_TYPE_SUFFIXES; + } return type; } @@ -2827,7 +2843,8 @@ function_resolver::resolve_sv_displacement (unsigned int argno, } } - if (type_suffix_ids[0] == NUM_TYPE_SUFFIXES) + if (type_suffix_ids[0] == NUM_TYPE_SUFFIXES + && shape->vector_base_type (TYPE_SUFFIX_u32) == TYPE_SUFFIX_u32) { /* TYPE has been inferred rather than specified by the user, so mention it in the error messages. */ diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index f89036c35f7..5f0ecf40706 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -9018,6 +9018,7 @@ (define_insn "mask_fold_left_plus_" ;; ------------------------------------------------------------------------- ;; Includes: ;; - TBL +;; - TBLQ (SVE2p1) ;; ------------------------------------------------------------------------- (define_expand "vec_perm" @@ -9033,14 +9034,14 @@ (define_expand "vec_perm" } ) -(define_insn "@aarch64_sve_tbl" +(define_insn "@aarch64_sve_" [(set (match_operand:SVE_FULL 0 "register_operand" "=w") (unspec:SVE_FULL [(match_operand:SVE_FULL 1 "register_operand" "w") (match_operand: 2 "register_operand" "w")] - UNSPEC_TBL))] + SVE_TBL))] "TARGET_SVE" - "tbl\t%0., {%1.}, %2." + "\t%0., {%1.}, %2." ) ;; ------------------------------------------------------------------------- @@ -9129,9 +9130,13 @@ (define_insn "@aarch64_sve_rev" ;; - TRN1 ;; - TRN2 ;; - UZP1 +;; - UZPQ1 (SVE2p1) ;; - UZP2 +;; - UZPQ2 (SVE2p1) ;; - ZIP1 +;; - ZIPQ1 (SVE2p1) ;; - ZIP2 +;; - ZIPQ2 (SVE2p1) ;; ------------------------------------------------------------------------- ;; Like EXT, but start at the first active element. @@ -9156,7 +9161,7 @@ (define_insn "@aarch64_sve_" (unspec:SVE_ALL [(match_operand:SVE_ALL 1 "register_operand" "w") (match_operand:SVE_ALL 2 "register_operand" "w")] - PERMUTE))] + SVE_PERMUTE))] "TARGET_SVE" "\t%0., %1., %2." 
) diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 61bae64955f..9383c777d80 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -21,12 +21,22 @@ ;; The file is organised into the following sections (search for the full ;; line): ;; +;; == Moves +;; ---- Predicate to vector moves +;; ---- Vector to predicate moves +;; ;; == Loads +;; ---- 128-bit extending loads +;; ---- 128-bit structure loads ;; ---- Multi-register loads predicated by a counter +;; ---- 128-bit gather loads ;; ---- Non-temporal gather loads ;; ;; == Stores +;; ---- 128-bit truncating stores +;; ---- 128-bit structure stores ;; ---- Multi-register stores predicated by a counter +;; ---- 128-bit scatter stores ;; ---- Non-temporal scatter stores ;; ;; == Predicate manipulation @@ -99,8 +109,13 @@ ;; ---- [INT,FP] Select based on predicates as counters ;; ---- [INT] While tests ;; +;; == Reductions +;; ---- [INT] Reduction to 128-bit vector +;; ---- [FP] Reduction to 128-bit vector +;; ;; == Permutation ;; ---- [INT,FP] Reversal +;; ---- [INT,FP] HVLA permutes ;; ---- [INT,FP] General permutes ;; ---- [INT,FP] Multi-register permutes ;; ---- [INT] Optional bit-permute extensions @@ -115,10 +130,121 @@ ;; ---- Optional SHA-3 extensions ;; ---- Optional SM4 extensions +;; ========================================================================= +;; == Moves +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Predicate to vector moves +;; ------------------------------------------------------------------------- +;; Includes: +;; - PMOV (to vector) +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_pmov_to_" + [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w") + (unspec:SVE_FULL_I + [(match_operand: 1 "register_operand" "Upa")] + UNSPEC_PMOV_UNPACK))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "pmov\t%0, %1." +) + +(define_insn "@aarch64_pmov_lane_to_" + [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w") + (unspec:SVE_FULL_I + [(match_operand:SVE_FULL_I 1 "register_operand" "0") + (match_operand: 2 "register_operand" "Upa") + (match_operand:DI 3 "immediate_operand")] + UNSPEC_PMOV_UNPACK_LANE))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "pmov\t%0[%3], %2." 
+) + +;; ------------------------------------------------------------------------- +;; ---- Vector to predicate moves +;; ------------------------------------------------------------------------- +;; Includes: +;; - PMOV (from vector) +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_pmov_from_" + [(set (match_operand:VNx16BI 0 "register_operand" "=Upa") + (unspec:VNx16BI + [(match_operand:SVE_FULL_I 1 "register_operand" "w")] + UNSPEC_PMOV_PACK))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "pmov\t%0., %1" +) + +(define_insn "@aarch64_pmov_lane_from_" + [(set (match_operand:VNx16BI 0 "register_operand" "=Upa") + (unspec:VNx16BI + [(match_operand:SVE_FULL_I 1 "register_operand" "w") + (match_operand:DI 2 "immediate_operand")] + UNSPEC_PMOV_PACK_LANE))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "pmov\t%0., %1[%2]" +) + ;; ========================================================================= ;; == Loads ;; ========================================================================= +;; ------------------------------------------------------------------------- +;; ---- 128-bit extending loads +;; ------------------------------------------------------------------------- +;; Includes: +;; - LD1W (to .Q) +;; - LD1D (to .Q) +;; ------------------------------------------------------------------------- + +;; There isn't really a natural way of representing these instructions +;; with the modes that we normally use: +;; +;; (1) It doesn't really make sense to use VNx1TI (or similar) for the +;; result, since there's nothing that can be done with such a mode +;; other than to cast it to another mode. It also isn't how the +;; ACLE represents it (for similar reasons). +;; +;; (2) Only the lowest bit of each 16 in the predicate is significant, +;; but it doesn't really make sense to use VNx1BI to represent it, +;; since there is no "PTRUE Pn.Q, ..." instruction. +;; +;; (3) We do however need to use VNx1DI and VNx1SI to represent the +;; source memories, since none of the normal register modes would +;; give the right extent and alignment information (with the alignment +;; mattering only for -mstrict-align). +(define_insn "@aarch64_sve_ld1_extendq" + [(set (match_operand:SVE_FULL_SD 0 "register_operand" "=w") + (unspec:SVE_FULL_SD + [(match_operand: 2 "register_operand" "Upl") + (match_operand: 1 "memory_operand" "m")] + UNSPEC_LD1_EXTENDQ))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "ld1\t{%0.q}, %2/z, %1" +) + +;; ------------------------------------------------------------------------- +;; ---- 128-bit structure loads +;; ------------------------------------------------------------------------- +;; Includes: +;; - LD2Q +;; - LD3Q +;; - LD4Q +;; ------------------------------------------------------------------------- + +;; Predicated LD[234]Q. 
+(define_insn "@aarch64_sve_ldnq" + [(set (match_operand:SVE_STRUCT 0 "register_operand" "=w") + (unspec:SVE_STRUCT + [(match_operand: 2 "register_operand" "Upl") + (match_operand: 1 "memory_operand" "m")] + UNSPEC_LDNQ))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "ldq\t{%S0.q - %0.q}, %2/z, %1" +) + ;; ------------------------------------------------------------------------- ;; ---- Multi-register loads predicated by a counter ;; ------------------------------------------------------------------------- @@ -195,6 +321,33 @@ (define_insn "@aarch64__strided4" [(set_attr "stride_type" "ld1_strided")] ) +;; ------------------------------------------------------------------------- +;; ---- 128-bit gather loads +;; ------------------------------------------------------------------------- +;; Includes gather forms of: +;; - LD1Q +;; ------------------------------------------------------------------------- + +;; Model this as operating on the largest valid element size, which is DI. +;; This avoids having to define move patterns & more for VNx1TI, which would +;; be difficult without a non-gather form of LD1Q. +(define_insn "aarch64_gather_ld1q" + [(set (match_operand:VNx2DI 0 "register_operand") + (unspec:VNx2DI + [(match_operand:VNx2BI 1 "register_operand") + (match_operand:DI 2 "aarch64_reg_or_zero") + (match_operand:VNx2DI 3 "register_operand") + (mem:BLK (scratch))] + UNSPEC_LD1_GATHER))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + {@ [cons: =0, 1, 2, 3] + [&w, Upl, Z, w] ld1q\t{%0.q}, %1/z, [%3.d] + [?w, Upl, Z, 0] ^ + [&w, Upl, r, w] ld1q\t{%0.q}, %1/z, [%3.d, %2] + [?w, Upl, r, 0] ^ + } +) + ;; ------------------------------------------------------------------------- ;; ---- Non-temporal gather loads ;; ------------------------------------------------------------------------- @@ -255,6 +408,48 @@ (define_insn_and_rewrite "@aarch64_gather_ldnt_" + [(set (match_operand: 0 "memory_operand" "+m") + (unspec: + [(match_operand: 2 "register_operand" "Upl") + (match_operand:SVE_FULL_SD 1 "register_operand" "w") + (match_dup 0)] + UNSPEC_ST1_TRUNCQ))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "st1\t{%1.q}, %2, %0" +) + +;; ------------------------------------------------------------------------- +;; ---- 128-bit structure stores +;; ------------------------------------------------------------------------- +;; Includes: +;; - ST2Q +;; - ST3Q +;; - ST4Q +;; ------------------------------------------------------------------------- + +;; Predicated ST[234]. 
+(define_insn "@aarch64_sve_stnq" + [(set (match_operand: 0 "memory_operand" "+m") + (unspec: + [(match_operand: 2 "register_operand" "Upl") + (match_operand:SVE_STRUCT 1 "register_operand" "w") + (match_dup 0)] + UNSPEC_STNQ))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "stq\t{%S1.q - %1.q}, %2, %0" +) + ;; ------------------------------------------------------------------------- ;; ---- Multi-register stores predicated by a counter ;; ------------------------------------------------------------------------- @@ -311,6 +506,28 @@ (define_insn "@aarch64__strided4" [(set_attr "stride_type" "st1_strided")] ) +;; ------------------------------------------------------------------------- +;; ---- 128-bit scatter stores +;; ------------------------------------------------------------------------- +;; Includes scatter form of: +;; - ST1Q +;; ------------------------------------------------------------------------- + +(define_insn "aarch64_scatter_st1q" + [(set (mem:BLK (scratch)) + (unspec:BLK + [(match_operand:VNx2BI 0 "register_operand") + (match_operand:DI 1 "aarch64_reg_or_zero") + (match_operand:VNx2DI 2 "register_operand") + (match_operand:VNx2DI 3 "register_operand")] + UNSPEC_ST1Q_SCATTER))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + {@ [ cons: 0 , 1 , 2 , 3 ] + [ Upl , Z , w , w ] st1q\t{%3.q}, %0, [%2.d] + [ Upl , r , w , w ] st1q\t{%3.q}, %0, [%2.d, %1] + } +) + ;; ------------------------------------------------------------------------- ;; ---- Non-temporal scatter stores ;; ------------------------------------------------------------------------- @@ -3171,6 +3388,55 @@ (define_insn "@aarch64_sve_while_c" "while\t%K0., %x1, %x2, vlx%3" ) +;; ========================================================================= +;; == Reductions +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- [INT] Reduction to 128-bit vector +;; ------------------------------------------------------------------------- +;; Includes: +;; - ADDQV +;; - ANDQV +;; - EORQV +;; - ORQV +;; - SMAXQV +;; - SMINQV +;; - UMAXQV +;; - UMINQV +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_pred_reduc__" + [(set (match_operand: 0 "register_operand" "=w") + (unspec: + [(match_operand: 1 "register_operand" "Upl") + (match_operand:SVE_FULL_I 2 "register_operand" "w")] + SVE_INT_REDUCTION_128))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "\t%0., %1, %2." +) + +;; ------------------------------------------------------------------------- +;; ---- [FP] Reduction to 128-bit vector +;; ------------------------------------------------------------------------- +;; Includes: +;; - FADDQV +;; - FMAXNMQV +;; - FMAXQV +;; - FMINNMQV +;; - FMINQV +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_pred_reduc__" + [(set (match_operand: 0 "register_operand" "=w") + (unspec: + [(match_operand: 1 "register_operand" "Upl") + (match_operand:SVE_FULL_F 2 "register_operand" "w")] + SVE_FP_REDUCTION_128))] + "TARGET_SVE2p1 && TARGET_NON_STREAMING" + "\t%0., %1, %2." 
+) + ;; ========================================================================= ;; == Permutation ;; ========================================================================= @@ -3213,12 +3479,52 @@ (define_insn "@cond_" } ) +;; ------------------------------------------------------------------------- +;; ---- [INT,FP] HVLA permutes +;; ------------------------------------------------------------------------- +;; Includes: +;; - DUPQ +;; - EXTQ +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sve_dupq" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w") + (unspec:SVE_FULL + [(match_operand:SVE_FULL 1 "register_operand" "w") + (match_operand:SI 2 "const_int_operand")] + UNSPEC_DUPQ))] + "TARGET_SVE2p1 + && TARGET_NON_STREAMING + && IN_RANGE (INTVAL (operands[2]) * ( / 8), 0, 15)" + "dupq\t%0., %1.[%2]" +) + +(define_insn "@aarch64_sve_extq" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w, ?&w") + (unspec:SVE_FULL + [(match_operand:SVE_FULL 1 "register_operand" "0, w") + (match_operand:SVE_FULL 2 "register_operand" "w, w") + (match_operand:SI 3 "const_int_operand")] + UNSPEC_EXTQ))] + "TARGET_SVE2p1 + && TARGET_NON_STREAMING + && IN_RANGE (INTVAL (operands[3]) * ( / 8), 0, 15)" + { + operands[3] = GEN_INT (INTVAL (operands[3]) * ( / 8)); + return (which_alternative == 0 + ? "extq\\t%0.b, %0.b, %2.b, #%3" + : "movprfx\t%0, %1\;extq\\t%0.b, %0.b, %2.b, #%3"); + } + [(set_attr "movprfx" "*,yes")] +) + ;; ------------------------------------------------------------------------- ;; ---- [INT,FP] General permutes ;; ------------------------------------------------------------------------- ;; Includes: ;; - TBL (vector pair form) ;; - TBX +;; - TBXQ (SVE2p1) ;; ------------------------------------------------------------------------- ;; TBL on a pair of data vectors. @@ -3232,16 +3538,16 @@ (define_insn "@aarch64_sve2_tbl2" "tbl\t%0., %1, %2." ) -;; TBX. These instructions do not take MOVPRFX. -(define_insn "@aarch64_sve2_tbx" +;; TBX(Q). These instructions do not take MOVPRFX. +(define_insn "@aarch64_sve_" [(set (match_operand:SVE_FULL 0 "register_operand" "=w") (unspec:SVE_FULL [(match_operand:SVE_FULL 1 "register_operand" "0") (match_operand:SVE_FULL 2 "register_operand" "w") (match_operand: 3 "register_operand" "w")] - UNSPEC_TBX))] + SVE_TBX))] "TARGET_SVE2" - "tbx\t%0., %2., %3." + "\t%0., %2., %3." ) ;; ------------------------------------------------------------------------- diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index e306f86f514..2efcc7ecc57 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -1692,6 +1692,32 @@ aarch64_classify_vector_mode (machine_mode mode, bool any_target_p = false) } } +/* Like aarch64_classify_vector_mode, but also include modes that are used + for memory operands but not register operands. Such modes do not count + as real vector modes; they are just an internal construct to make things + easier to describe. */ +static unsigned int +aarch64_classify_vector_memory_mode (machine_mode mode) +{ + switch (mode) + { + case VNx1SImode: + case VNx1DImode: + return TARGET_SVE ? VEC_SVE_DATA | VEC_PARTIAL : 0; + + case VNx1TImode: + return TARGET_SVE ? VEC_SVE_DATA : 0; + + case VNx2TImode: + case VNx3TImode: + case VNx4TImode: + return TARGET_SVE ? VEC_SVE_DATA | VEC_STRUCT : 0; + + default: + return aarch64_classify_vector_mode (mode); + } +} + /* Return true if MODE is any of the Advanced SIMD structure modes. 
*/ bool aarch64_advsimd_struct_mode_p (machine_mode mode) @@ -2578,7 +2604,9 @@ aarch64_regmode_natural_size (machine_mode mode) code for Advanced SIMD. */ if (!aarch64_sve_vg.is_constant ()) { - unsigned int vec_flags = aarch64_classify_vector_mode (mode); + /* REGMODE_NATURAL_SIZE influences general subreg validity rules, + so we need to handle memory-only modes as well. */ + unsigned int vec_flags = aarch64_classify_vector_memory_mode (mode); if (vec_flags & VEC_SVE_PRED) return BYTES_PER_SVE_PRED; if (vec_flags & VEC_SVE_DATA) @@ -10484,7 +10512,8 @@ aarch64_classify_index (struct aarch64_address_info *info, rtx x, && contains_reg_of_mode[GENERAL_REGS][GET_MODE (SUBREG_REG (index))]) index = SUBREG_REG (index); - if (aarch64_sve_data_mode_p (mode) || mode == VNx1TImode) + auto vec_flags = aarch64_classify_vector_memory_mode (mode); + if (vec_flags & VEC_SVE_DATA) { if (type != ADDRESS_REG_REG || (1 << shift) != GET_MODE_UNIT_SIZE (mode)) @@ -10555,7 +10584,7 @@ aarch64_classify_address (struct aarch64_address_info *info, Partial vectors like VNx8QImode allow the same indexed addressing mode and MUL VL addressing mode as full vectors like VNx16QImode; in both cases, MUL VL counts multiples of GET_MODE_SIZE. */ - unsigned int vec_flags = aarch64_classify_vector_mode (mode); + unsigned int vec_flags = aarch64_classify_vector_memory_mode (mode); vec_flags &= ~VEC_PARTIAL; /* On BE, we use load/store pair for all large int mode load/stores. @@ -10591,8 +10620,7 @@ aarch64_classify_address (struct aarch64_address_info *info, && ((vec_flags == 0 && known_lt (GET_MODE_SIZE (mode), 16)) || vec_flags == VEC_ADVSIMD - || vec_flags & VEC_SVE_DATA - || mode == VNx1TImode)); + || vec_flags & VEC_SVE_DATA)); /* For SVE, only accept [Rn], [Rn, #offset, MUL VL] and [Rn, Rm, LSL #shift]. The latter is not valid for SVE predicates, and that's rejected through @@ -10711,7 +10739,7 @@ aarch64_classify_address (struct aarch64_address_info *info, /* Make "m" use the LD1 offset range for SVE data modes, so that pre-RTL optimizers like ivopts will work to that instead of the wider LDR/STR range. */ - if (vec_flags == VEC_SVE_DATA || mode == VNx1TImode) + if (vec_flags == VEC_SVE_DATA) return (type == ADDR_QUERY_M ? offset_4bit_signed_scaled_p (mode, offset) : offset_9bit_signed_scaled_p (mode, offset)); @@ -12029,7 +12057,7 @@ sizetochar (int size) case 64: return 'd'; case 32: return 's'; case 16: return 'h'; - case 8 : return 'b'; + case 8: return 'b'; default: gcc_unreachable (); } } @@ -12611,7 +12639,7 @@ aarch64_print_address_internal (FILE *f, machine_mode mode, rtx x, return true; } - vec_flags = aarch64_classify_vector_mode (mode); + vec_flags = aarch64_classify_vector_memory_mode (mode); if ((vec_flags & VEC_ANY_SVE) && !load_store_pair_p) { HOST_WIDE_INT vnum @@ -26238,6 +26266,107 @@ aarch64_evpc_dup (struct expand_vec_perm_d *d) return true; } +/* Recognize things that can be done using the SVE2p1 Hybrid-VLA + permutations, which apply Advanced-SIMD-style permutations to each + individual 128-bit block. */ + +static bool +aarch64_evpc_hvla (struct expand_vec_perm_d *d) +{ + machine_mode vmode = d->vmode; + if (!TARGET_SVE2p1 + || !TARGET_NON_STREAMING + || BYTES_BIG_ENDIAN + || d->vec_flags != VEC_SVE_DATA + || GET_MODE_UNIT_BITSIZE (vmode) > 64) + return false; + + /* Set SUBELTS to the number of elements in an Advanced SIMD vector + and make sure that adding SUBELTS to each block of SUBELTS indices + gives the next block of SUBELTS indices. 
That is, it must be possible + to interpret the index vector as SUBELTS interleaved linear series in + which each series has step SUBELTS. */ + unsigned int subelts = 128U / GET_MODE_UNIT_BITSIZE (vmode); + unsigned int pairs = subelts / 2; + for (unsigned int i = 0; i < subelts; ++i) + if (!d->perm.series_p (i, subelts, d->perm[i], subelts)) + return false; + + /* Used once we have verified that we can use UNSPEC to do the operation. */ + auto use_binary = [&](int unspec) -> bool + { + if (!d->testing_p) + { + rtvec vec = gen_rtvec (2, d->op0, d->op1); + emit_set_insn (d->target, gen_rtx_UNSPEC (vmode, vec, unspec)); + } + return true; + }; + + /* Now check whether the first SUBELTS elements match a supported + Advanced-SIMD-style operation. */ + poly_int64 first = d->perm[0]; + poly_int64 nelt = d->perm.length (); + auto try_zip = [&]() -> bool + { + if (maybe_ne (first, 0) && maybe_ne (first, pairs)) + return false; + for (unsigned int i = 0; i < pairs; ++i) + if (maybe_ne (d->perm[i * 2], first + i) + || maybe_ne (d->perm[i * 2 + 1], first + nelt + i)) + return false; + return use_binary (maybe_ne (first, 0) ? UNSPEC_ZIPQ2 : UNSPEC_ZIPQ1); + }; + auto try_uzp = [&]() -> bool + { + if (maybe_ne (first, 0) && maybe_ne (first, 1)) + return false; + for (unsigned int i = 0; i < pairs; ++i) + if (maybe_ne (d->perm[i], first + i * 2) + || maybe_ne (d->perm[i + pairs], first + nelt + i * 2)) + return false; + return use_binary (maybe_ne (first, 0) ? UNSPEC_UZPQ2 : UNSPEC_UZPQ1); + }; + auto try_extq = [&]() -> bool + { + HOST_WIDE_INT start; + if (!first.is_constant (&start) || !IN_RANGE (start, 0, subelts - 1)) + return false; + for (unsigned int i = 0; i < subelts; ++i) + { + poly_int64 next = (start + i >= subelts + ? start + i - subelts + nelt + : start + i); + if (maybe_ne (d->perm[i], next)) + return false; + } + if (!d->testing_p) + { + rtx op2 = gen_int_mode (start, SImode); + emit_insn (gen_aarch64_sve_extq (vmode, d->target, + d->op0, d->op1, op2)); + } + return true; + }; + auto try_dupq = [&]() -> bool + { + HOST_WIDE_INT start; + if (!first.is_constant (&start) || !IN_RANGE (start, 0, subelts - 1)) + return false; + for (unsigned int i = 0; i < subelts; ++i) + if (maybe_ne (d->perm[i], start)) + return false; + if (!d->testing_p) + { + rtx op1 = gen_int_mode (start, SImode); + emit_insn (gen_aarch64_sve_dupq (vmode, d->target, d->op0, op1)); + } + return true; + }; + + return try_zip () || try_uzp () || try_extq () || try_dupq (); +} + static bool aarch64_evpc_tbl (struct expand_vec_perm_d *d) { @@ -26514,6 +26643,8 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d) return true; else if (aarch64_evpc_ins (d)) return true; + else if (aarch64_evpc_hvla (d)) + return true; else if (aarch64_evpc_reencode (d)) return true; diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 73d674816f1..8e3b5731939 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -734,7 +734,9 @@ (define_c_enum "unspec" UNSPEC_USHLL ; Used in aarch64-simd.md. UNSPEC_ADDP ; Used in aarch64-simd.md. UNSPEC_TBL ; Used in vector permute patterns. + UNSPEC_TBLQ ; Used in vector permute patterns. UNSPEC_TBX ; Used in vector permute patterns. + UNSPEC_TBXQ ; Used in vector permute patterns. UNSPEC_CONCAT ; Used in vector permute patterns. ;; The following permute unspecs are generated directly by @@ -1071,14 +1073,43 @@ (define_c_enum "unspec" UNSPEC_FAMIN ; Used in aarch64-simd.md. 
;; All used in aarch64-sve2.md + UNSPEC_ADDQV + UNSPEC_ANDQV + UNSPEC_DUPQ + UNSPEC_EORQV + UNSPEC_EXTQ + UNSPEC_FADDQV + UNSPEC_FMAXQV + UNSPEC_FMAXNMQV + UNSPEC_FMINQV + UNSPEC_FMINNMQV UNSPEC_FCVTN UNSPEC_FDOT + UNSPEC_LD1_EXTENDQ + UNSPEC_LD1Q_GATHER + UNSPEC_LDNQ + UNSPEC_ORQV + UNSPEC_PMOV_PACK + UNSPEC_PMOV_PACK_LANE + UNSPEC_PMOV_UNPACK + UNSPEC_PMOV_UNPACK_LANE + UNSPEC_SMAXQV + UNSPEC_SMINQV UNSPEC_SQCVT UNSPEC_SQCVTN UNSPEC_SQCVTU UNSPEC_SQCVTUN + UNSPEC_ST1_TRUNCQ + UNSPEC_ST1Q_SCATTER + UNSPEC_STNQ + UNSPEC_UMAXQV + UNSPEC_UMINQV UNSPEC_UQCVT UNSPEC_UQCVTN + UNSPEC_UZPQ1 + UNSPEC_UZPQ2 + UNSPEC_ZIPQ1 + UNSPEC_ZIPQ2 ;; All used in aarch64-sme.md UNSPEC_SME_ADD @@ -1326,7 +1357,11 @@ (define_mode_attr Vtype [(V8QI "8b") (V16QI "16b") (V4x16QI "16b") (V4x8HI "8h") (V4x4SI "4s") (V4x2DI "2d") (V4x8HF "8h") (V4x4SF "4s") - (V4x2DF "2d") (V4x8BF "8h")]) + (V4x2DF "2d") (V4x8BF "8h") + (VNx16QI "16b") (VNx8HI "8h") + (VNx4SI "4s") (VNx2DI "2d") + (VNx8HF "8h") (VNx4SF "4s") + (VNx2DF "2d") (VNx8BF "8h")]) ;; Map mode to type used in widening multiplies. (define_mode_attr Vcondtype [(V4HI "4h") (V8HI "4h") (V2SI "2s") (V4SI "2s")]) @@ -1994,7 +2029,22 @@ (define_mode_attr Vendreg [(OI "T") (CI "U") (XI "V") (V4x4HF "V") (V4x8HF "V") (V4x2SF "V") (V4x4SF "V") (V4x1DF "V") (V4x2DF "V") - (V4x4BF "V") (V4x8BF "V")]) + (V4x4BF "V") (V4x8BF "V") + + (VNx32QI "T") (VNx16HI "T") + (VNx8SI "T") (VNx4DI "T") + (VNx16BF "T") (VNx16HF "T") + (VNx8SF "T") (VNx4DF "T") + + (VNx48QI "U") (VNx24HI "U") + (VNx12SI "U") (VNx6DI "U") + (VNx24BF "U") (VNx24HF "U") + (VNx12SF "U") (VNx6DF "U") + + (VNx64QI "V") (VNx32HI "V") + (VNx16SI "V") (VNx8DI "V") + (VNx32BF "V") (VNx32HF "V") + (VNx16SF "V") (VNx8DF "V")]) ;; This is both the number of Q-Registers needed to hold the corresponding ;; opaque large integer mode, and the number of elements touched by the @@ -2338,6 +2388,21 @@ (define_mode_attr VDOUBLE [(VNx16QI "VNx32QI") (VNx4SI "VNx8SI") (VNx4SF "VNx8SF") (VNx2DI "VNx4DI") (VNx2DF "VNx4DF")]) +(define_mode_attr VNxTI [(VNx32QI "VNx2TI") (VNx16HI "VNx2TI") + (VNx8SI "VNx2TI") (VNx4DI "VNx2TI") + (VNx16BF "VNx2TI") (VNx16HF "VNx2TI") + (VNx8SF "VNx2TI") (VNx4DF "VNx2TI") + + (VNx48QI "VNx3TI") (VNx24HI "VNx3TI") + (VNx12SI "VNx3TI") (VNx6DI "VNx3TI") + (VNx24BF "VNx3TI") (VNx24HF "VNx3TI") + (VNx12SF "VNx3TI") (VNx6DF "VNx3TI") + + (VNx64QI "VNx4TI") (VNx32HI "VNx4TI") + (VNx16SI "VNx4TI") (VNx8DI "VNx4TI") + (VNx32BF "VNx4TI") (VNx32HF "VNx4TI") + (VNx16SF "VNx4TI") (VNx8DF "VNx4TI")]) + ;; The Advanced SIMD modes of popcount corresponding to scalar modes. 
(define_mode_attr VEC_POP_MODE [(QI "V8QI") (HI "V4HI") (SI "V2SI") (DI "V1DI")]) @@ -2448,6 +2513,9 @@ (define_mode_attr aligned_fpr [(VNx16QI "w") (VNx8HI "w") (VNx64QI "Uw4") (VNx32HI "Uw4") (VNx32BF "Uw4") (VNx32HF "Uw4")]) +(define_mode_attr LD1_EXTENDQ_MEM [(VNx4SI "VNx1SI") (VNx4SF "VNx1SI") + (VNx2DI "VNx1DI") (VNx2DF "VNx1DI")]) + ;; ------------------------------------------------------------------- ;; Code Iterators ;; ------------------------------------------------------------------- @@ -2973,6 +3041,21 @@ (define_int_iterator PERMUTE [UNSPEC_ZIP1 UNSPEC_ZIP2 UNSPEC_TRN1 UNSPEC_TRN2 UNSPEC_UZP1 UNSPEC_UZP2]) +(define_int_iterator SVE_PERMUTE + [PERMUTE + (UNSPEC_UZPQ1 "TARGET_SVE2p1 && TARGET_NON_STREAMING") + (UNSPEC_UZPQ2 "TARGET_SVE2p1 && TARGET_NON_STREAMING") + (UNSPEC_ZIPQ1 "TARGET_SVE2p1 && TARGET_NON_STREAMING") + (UNSPEC_ZIPQ2 "TARGET_SVE2p1 && TARGET_NON_STREAMING")]) + +(define_int_iterator SVE_TBL + [UNSPEC_TBL + (UNSPEC_TBLQ "TARGET_SVE2p1 && TARGET_NON_STREAMING")]) + +(define_int_iterator SVE_TBX + [UNSPEC_TBX + (UNSPEC_TBXQ "TARGET_SVE2p1 && TARGET_NON_STREAMING")]) + (define_int_iterator PERMUTEQ [UNSPEC_ZIP1Q UNSPEC_ZIP2Q UNSPEC_TRN1Q UNSPEC_TRN2Q UNSPEC_UZP1Q UNSPEC_UZP2Q]) @@ -3072,12 +3155,27 @@ (define_int_iterator SVE_INT_REDUCTION [UNSPEC_ANDV UNSPEC_UMINV UNSPEC_XORV]) +(define_int_iterator SVE_INT_REDUCTION_128 [UNSPEC_ADDQV + UNSPEC_ANDQV + UNSPEC_EORQV + UNSPEC_ORQV + UNSPEC_SMAXQV + UNSPEC_SMINQV + UNSPEC_UMAXQV + UNSPEC_UMINQV]) + (define_int_iterator SVE_FP_REDUCTION [UNSPEC_FADDV UNSPEC_FMAXV UNSPEC_FMAXNMV UNSPEC_FMINV UNSPEC_FMINNMV]) +(define_int_iterator SVE_FP_REDUCTION_128 [UNSPEC_FADDQV + UNSPEC_FMAXQV + UNSPEC_FMAXNMQV + UNSPEC_FMINQV + UNSPEC_FMINNMQV]) + (define_int_iterator SVE_COND_FP_UNARY [UNSPEC_COND_FABS UNSPEC_COND_FNEG UNSPEC_COND_FRECPX @@ -3629,6 +3727,8 @@ (define_int_attr optab [(UNSPEC_ANDF "and") (UNSPEC_UMINV "umin") (UNSPEC_SMAXV "smax") (UNSPEC_SMINV "smin") + (UNSPEC_ADDQV "addqv") + (UNSPEC_ANDQV "andqv") (UNSPEC_CADD90 "cadd90") (UNSPEC_CADD270 "cadd270") (UNSPEC_CDOT "cdot") @@ -3639,9 +3739,15 @@ (define_int_attr optab [(UNSPEC_ANDF "and") (UNSPEC_CMLA90 "cmla90") (UNSPEC_CMLA180 "cmla180") (UNSPEC_CMLA270 "cmla270") + (UNSPEC_EORQV "eorqv") (UNSPEC_FADDV "plus") + (UNSPEC_FADDQV "faddqv") + (UNSPEC_FMAXQV "fmaxqv") + (UNSPEC_FMAXNMQV "fmaxnmqv") (UNSPEC_FMAXNMV "smax") (UNSPEC_FMAXV "smax_nan") + (UNSPEC_FMINQV "fminqv") + (UNSPEC_FMINNMQV "fminnmqv") (UNSPEC_FMINNMV "smin") (UNSPEC_FMINV "smin_nan") (UNSPEC_SMUL_HIGHPART "smulh") @@ -3657,11 +3763,16 @@ (define_int_attr optab [(UNSPEC_ANDF "and") (UNSPEC_FTSSEL "ftssel") (UNSPEC_LD1_COUNT "ld1") (UNSPEC_LDNT1_COUNT "ldnt1") + (UNSPEC_ORQV "orqv") (UNSPEC_PMULLB "pmullb") (UNSPEC_PMULLB_PAIR "pmullb_pair") (UNSPEC_PMULLT "pmullt") (UNSPEC_PMULLT_PAIR "pmullt_pair") (UNSPEC_SMATMUL "smatmul") + (UNSPEC_SMAXQV "smaxqv") + (UNSPEC_SMINQV "sminqv") + (UNSPEC_UMAXQV "umaxqv") + (UNSPEC_UMINQV "uminqv") (UNSPEC_UZP "uzp") (UNSPEC_UZPQ "uzpq") (UNSPEC_ZIP "zip") @@ -3955,12 +4066,16 @@ (define_int_attr pauth_hint_num [(UNSPEC_PACIASP "25") (define_int_attr perm_insn [(UNSPEC_ZIP1 "zip1") (UNSPEC_ZIP2 "zip2") (UNSPEC_ZIP1Q "zip1") (UNSPEC_ZIP2Q "zip2") + (UNSPEC_ZIPQ1 "zipq1") (UNSPEC_ZIPQ2 "zipq2") (UNSPEC_TRN1 "trn1") (UNSPEC_TRN2 "trn2") (UNSPEC_TRN1Q "trn1") (UNSPEC_TRN2Q "trn2") (UNSPEC_UZP1 "uzp1") (UNSPEC_UZP2 "uzp2") (UNSPEC_UZP1Q "uzp1") (UNSPEC_UZP2Q "uzp2") + (UNSPEC_UZPQ1 "uzpq1") (UNSPEC_UZPQ2 "uzpq2") (UNSPEC_UZP "uzp") (UNSPEC_UZPQ "uzp") - (UNSPEC_ZIP 
"zip") (UNSPEC_ZIPQ "zip")]) + (UNSPEC_ZIP "zip") (UNSPEC_ZIPQ "zip") + (UNSPEC_TBL "tbl") (UNSPEC_TBLQ "tblq") + (UNSPEC_TBX "tbx") (UNSPEC_TBXQ "tbxq")]) ; op code for REV instructions (size within which elements are reversed). (define_int_attr rev_op [(UNSPEC_REV64 "64") (UNSPEC_REV32 "32") diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/dupq_1.c b/gcc/testsuite/gcc.target/aarch64/sve2/dupq_1.c new file mode 100644 index 00000000000..5472e30f812 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/dupq_1.c @@ -0,0 +1,162 @@ +/* { dg-options "-O2 -msve-vector-bits=256" } */ +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */ + +#include + +#pragma GCC target "+sve2p1" + +typedef svint8_t fixed_uint8_t __attribute__((arm_sve_vector_bits(256))); +typedef svuint16_t fixed_uint16_t __attribute__((arm_sve_vector_bits(256))); +typedef svint32_t fixed_int32_t __attribute__((arm_sve_vector_bits(256))); +typedef svuint64_t fixed_uint64_t __attribute__((arm_sve_vector_bits(256))); + +/* +** f1: +** trn1 z0\.d, z0\.d, z0\.d +** ret +*/ +fixed_uint64_t +f1 (fixed_uint64_t z0) +{ + return __builtin_shufflevector (z0, z0, 0, 0, 2, 2); +} + +/* +** f2: +** trn2 z0\.d, z0\.d, z0\.d +** ret +*/ +fixed_uint64_t +f2 (fixed_uint64_t z0) +{ + return __builtin_shufflevector (z0, z0, 1, 1, 3, 3); +} + +/* +** f3: +** dupq z0\.s, z0\.s\[0\] +** ret +*/ +fixed_int32_t +f3 (fixed_int32_t z0) +{ + return __builtin_shufflevector (z0, z0, 0, 0, 0, 0, 4, 4, 4, 4); +} + +/* +** f4: +** dupq z0\.s, z0\.s\[1\] +** ret +*/ +fixed_int32_t +f4 (fixed_int32_t z0) +{ + return __builtin_shufflevector (z0, z0, 1, 1, 1, 1, 5, 5, 5, 5); +} + +/* +** f5: +** dupq z0\.s, z0\.s\[2\] +** ret +*/ +fixed_int32_t +f5 (fixed_int32_t z0) +{ + return __builtin_shufflevector (z0, z0, 2, 2, 2, 2, 6, 6, 6, 6); +} + +/* +** f6: +** dupq z0\.s, z0\.s\[3\] +** ret +*/ +fixed_int32_t +f6 (fixed_int32_t z0) +{ + return __builtin_shufflevector (z0, z0, 3, 3, 3, 3, 7, 7, 7, 7); +} + +/* +** f7: +** dupq z0\.h, z0\.h\[0\] +** ret +*/ +fixed_uint16_t +f7 (fixed_uint16_t z0) +{ + return __builtin_shufflevector (z0, z0, + 0, 0, 0, 0, 0, 0, 0, 0, + 8, 8, 8, 8, 8, 8, 8, 8); +} + + +/* +** f8: +** dupq z0\.h, z0\.h\[5\] +** ret +*/ +fixed_uint16_t +f8 (fixed_uint16_t z0) +{ + return __builtin_shufflevector (z0, z0, + 5, 5, 5, 5, 5, 5, 5, 5, + 13, 13, 13, 13, 13, 13, 13, 13); +} + +/* +** f9: +** dupq z0\.h, z0\.h\[7\] +** ret +*/ +fixed_uint16_t +f9 (fixed_uint16_t z0) +{ + return __builtin_shufflevector (z0, z0, + 7, 7, 7, 7, 7, 7, 7, 7, + 15, 15, 15, 15, 15, 15, 15, 15); +} + +/* +** f10: +** dupq z0\.b, z0\.b\[0\] +** ret +*/ +fixed_uint8_t +f10 (fixed_uint8_t z0) +{ + return __builtin_shufflevector (z0, z0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 16, 16, 16, 16, 16, 16, 16, 16, + 16, 16, 16, 16, 16, 16, 16, 16); +} + +/* +** f11: +** dupq z0\.b, z0\.b\[13\] +** ret +*/ +fixed_uint8_t +f11 (fixed_uint8_t z0) +{ + return __builtin_shufflevector (z0, z0, + 13, 13, 13, 13, 13, 13, 13, 13, + 13, 13, 13, 13, 13, 13, 13, 13, + 29, 29, 29, 29, 29, 29, 29, 29, + 29, 29, 29, 29, 29, 29, 29, 29); +} + +/* +** f12: +** dupq z0\.b, z0\.b\[15\] +** ret +*/ +fixed_uint8_t +f12 (fixed_uint8_t z0) +{ + return __builtin_shufflevector (z0, z0, + 15, 15, 15, 15, 15, 15, 15, 15, + 15, 15, 15, 15, 15, 15, 15, 15, + 31, 31, 31, 31, 31, 31, 31, 31, + 31, 31, 31, 31, 31, 31, 31, 31); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/extq_1.c b/gcc/testsuite/gcc.target/aarch64/sve2/extq_1.c new file mode 100644 index 
00000000000..03c5fb143f7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/extq_1.c @@ -0,0 +1,128 @@ +/* { dg-options "-O2 -msve-vector-bits=256" } */ +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */ + +#include + +#pragma GCC target "+sve2p1" + +typedef svint8_t fixed_int8_t __attribute__((arm_sve_vector_bits(256))); +typedef svfloat16_t fixed_float16_t __attribute__((arm_sve_vector_bits(256))); +typedef svuint32_t fixed_uint32_t __attribute__((arm_sve_vector_bits(256))); +typedef svfloat64_t fixed_float64_t __attribute__((arm_sve_vector_bits(256))); + +/* +** f1: +** extq z0\.b, z0\.b, z1\.b, #8 +** ret +*/ +fixed_float64_t +f1 (fixed_float64_t z0, fixed_float64_t z1) +{ + return __builtin_shufflevector (z0, z1, 1, 4, 3, 6); +} + +/* +** f2: +** extq z0\.b, z0\.b, z1\.b, #4 +** ret +*/ +fixed_uint32_t +f2 (fixed_uint32_t z0, fixed_uint32_t z1) +{ + return __builtin_shufflevector (z0, z1, 1, 2, 3, 8, 5, 6, 7, 12); +} + +/* +** f3: +** extq z0\.b, z0\.b, z1\.b, #12 +** ret +*/ +fixed_uint32_t +f3 (fixed_uint32_t z0, fixed_uint32_t z1) +{ + return __builtin_shufflevector (z0, z1, 3, 8, 9, 10, 7, 12, 13, 14); +} + +/* +** f4: +** extq z0\.b, z0\.b, z1\.b, #2 +** ret +*/ +fixed_float16_t +f4 (fixed_float16_t z0, fixed_float16_t z1) +{ + return __builtin_shufflevector (z0, z1, + 1, 2, 3, 4, 5, 6, 7, 16, + 9, 10, 11, 12, 13, 14, 15, 24); +} + +/* +** f5: +** extq z0\.b, z0\.b, z1\.b, #10 +** ret +*/ +fixed_float16_t +f5 (fixed_float16_t z0, fixed_float16_t z1) +{ + return __builtin_shufflevector (z0, z1, + 5, 6, 7, 16, 17, 18, 19, 20, + 13, 14, 15, 24, 25, 26, 27, 28); +} + +/* +** f6: +** extq z0\.b, z0\.b, z1\.b, #14 +** ret +*/ +fixed_float16_t +f6 (fixed_float16_t z0, fixed_float16_t z1) +{ + return __builtin_shufflevector (z0, z1, + 7, 16, 17, 18, 19, 20, 21, 22, + 15, 24, 25, 26, 27, 28, 29, 30); +} + +/* +** f7: +** extq z0\.b, z0\.b, z1\.b, #1 +** ret +*/ +fixed_int8_t +f7 (fixed_int8_t z0, fixed_int8_t z1) +{ + return __builtin_shufflevector (z0, z1, + 1, 2, 3, 4, 5, 6, 7, 8, + 9, 10, 11, 12, 13, 14, 15, 32, + 17, 18, 19, 20, 21, 22, 23, 24, + 25, 26, 27, 28, 29, 30, 31, 48); +} + +/* +** f8: +** extq z0\.b, z0\.b, z1\.b, #11 +** ret +*/ +fixed_int8_t +f8 (fixed_int8_t z0, fixed_int8_t z1) +{ + return __builtin_shufflevector (z0, z1, + 11, 12, 13, 14, 15, 32, 33, 34, + 35, 36, 37, 38, 39, 40, 41, 42, + 27, 28, 29, 30, 31, 48, 49, 50, + 51, 52, 53, 54, 55, 56, 57, 58); +} + +/* +** f9: +** extq z0\.b, z0\.b, z1\.b, #15 +** ret +*/ +fixed_int8_t +f9 (fixed_int8_t z0, fixed_int8_t z1) +{ + return __builtin_shufflevector (z0, z1, + 15, 32, 33, 34, 35, 36, 37, 38, + 39, 40, 41, 42, 43, 44, 45, 46, + 31, 48, 49, 50, 51, 52, 53, 54, + 55, 56, 57, 58, 59, 60, 61, 62); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/uzpq_1.c b/gcc/testsuite/gcc.target/aarch64/sve2/uzpq_1.c new file mode 100644 index 00000000000..f923e9447ec --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/uzpq_1.c @@ -0,0 +1,111 @@ +/* { dg-options "-O2 -msve-vector-bits=256" } */ +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */ + +#include + +#pragma GCC target "+sve2p1" + +typedef svuint8_t fixed_uint8_t __attribute__((arm_sve_vector_bits(256))); +typedef svbfloat16_t fixed_bfloat16_t __attribute__((arm_sve_vector_bits(256))); +typedef svfloat32_t fixed_float32_t __attribute__((arm_sve_vector_bits(256))); +typedef svint64_t fixed_int64_t __attribute__((arm_sve_vector_bits(256))); + +/* +** f1: +** trn1 z0\.d, z0\.d, z1\.d +** ret +*/ +fixed_int64_t 
+f1 (fixed_int64_t z0, fixed_int64_t z1) +{ + return __builtin_shufflevector (z0, z1, 0, 4, 2, 6); +} + +/* +** f2: +** trn2 z0\.d, z0\.d, z1\.d +** ret +*/ +fixed_int64_t +f2 (fixed_int64_t z0, fixed_int64_t z1) +{ + return __builtin_shufflevector (z0, z1, 1, 5, 3, 7); +} + +/* +** f3: +** uzpq1 z0\.s, z0\.s, z1\.s +** ret +*/ +fixed_float32_t +f3 (fixed_float32_t z0, fixed_float32_t z1) +{ + return __builtin_shufflevector (z0, z1, 0, 2, 8, 10, 4, 6, 12, 14); +} + +/* +** f4: +** uzpq2 z0\.s, z0\.s, z1\.s +** ret +*/ +fixed_float32_t +f4 (fixed_float32_t z0, fixed_float32_t z1) +{ + return __builtin_shufflevector (z0, z1, 1, 3, 9, 11, 5, 7, 13, 15); +} + +/* +** f5: +** uzpq1 z0\.h, z0\.h, z1\.h +** ret +*/ +fixed_bfloat16_t +f5 (fixed_bfloat16_t z0, fixed_bfloat16_t z1) +{ + return __builtin_shufflevector (z0, z1, + 0, 2, 4, 6, 16, 18, 20, 22, + 8, 10, 12, 14, 24, 26, 28, 30); +} + +/* +** f6: +** uzpq2 z0\.h, z0\.h, z1\.h +** ret +*/ +fixed_bfloat16_t +f6 (fixed_bfloat16_t z0, fixed_bfloat16_t z1) +{ + return __builtin_shufflevector (z0, z1, + 1, 3, 5, 7, 17, 19, 21, 23, + 9, 11, 13, 15, 25, 27, 29, 31); +} + +/* +** f7: +** uzpq1 z0\.b, z0\.b, z1\.b +** ret +*/ +fixed_uint8_t +f7 (fixed_uint8_t z0, fixed_uint8_t z1) +{ + return __builtin_shufflevector (z0, z1, + 0, 2, 4, 6, 8, 10, 12, 14, + 32, 34, 36, 38, 40, 42, 44, 46, + 16, 18, 20, 22, 24, 26, 28, 30, + 48, 50, 52, 54, 56, 58, 60, 62); +} + +/* +** f8: +** uzpq2 z0\.b, z0\.b, z1\.b +** ret +*/ +fixed_uint8_t +f8 (fixed_uint8_t z0, fixed_uint8_t z1) +{ + return __builtin_shufflevector (z0, z1, + 1, 3, 5, 7, 9, 11, 13, 15, + 33, 35, 37, 39, 41, 43, 45, 47, + 17, 19, 21, 23, 25, 27, 29, 31, + 49, 51, 53, 55, 57, 59, 61, 63); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/zipq_1.c b/gcc/testsuite/gcc.target/aarch64/sve2/zipq_1.c new file mode 100644 index 00000000000..fa420a959c7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/zipq_1.c @@ -0,0 +1,111 @@ +/* { dg-options "-O2 -msve-vector-bits=256" } */ +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */ + +#include + +#pragma GCC target "+sve2p1" + +typedef svuint8_t fixed_uint8_t __attribute__((arm_sve_vector_bits(256))); +typedef svbfloat16_t fixed_bfloat16_t __attribute__((arm_sve_vector_bits(256))); +typedef svfloat32_t fixed_float32_t __attribute__((arm_sve_vector_bits(256))); +typedef svint64_t fixed_int64_t __attribute__((arm_sve_vector_bits(256))); + +/* +** f1: +** trn1 z0\.d, z0\.d, z1\.d +** ret +*/ +fixed_int64_t +f1 (fixed_int64_t z0, fixed_int64_t z1) +{ + return __builtin_shufflevector (z0, z1, 0, 4, 2, 6); +} + +/* +** f2: +** trn2 z0\.d, z0\.d, z1\.d +** ret +*/ +fixed_int64_t +f2 (fixed_int64_t z0, fixed_int64_t z1) +{ + return __builtin_shufflevector (z0, z1, 1, 5, 3, 7); +} + +/* +** f3: +** zipq1 z0\.s, z0\.s, z1\.s +** ret +*/ +fixed_float32_t +f3 (fixed_float32_t z0, fixed_float32_t z1) +{ + return __builtin_shufflevector (z0, z1, 0, 8, 1, 9, 4, 12, 5, 13); +} + +/* +** f4: +** zipq2 z0\.s, z0\.s, z1\.s +** ret +*/ +fixed_float32_t +f4 (fixed_float32_t z0, fixed_float32_t z1) +{ + return __builtin_shufflevector (z0, z1, 2, 10, 3, 11, 6, 14, 7, 15); +} + +/* +** f5: +** zipq1 z0\.h, z0\.h, z1\.h +** ret +*/ +fixed_bfloat16_t +f5 (fixed_bfloat16_t z0, fixed_bfloat16_t z1) +{ + return __builtin_shufflevector (z0, z1, + 0, 16, 1, 17, 2, 18, 3, 19, + 8, 24, 9, 25, 10, 26, 11, 27); +} + +/* +** f6: +** zipq2 z0\.h, z0\.h, z1\.h +** ret +*/ +fixed_bfloat16_t +f6 (fixed_bfloat16_t z0, fixed_bfloat16_t z1) +{ + return 
__builtin_shufflevector (z0, z1, + 4, 20, 5, 21, 6, 22, 7, 23, + 12, 28, 13, 29, 14, 30, 15, 31); +} + +/* +** f7: +** zipq1 z0\.b, z0\.b, z1\.b +** ret +*/ +fixed_uint8_t +f7 (fixed_uint8_t z0, fixed_uint8_t z1) +{ + return __builtin_shufflevector (z0, z1, + 0, 32, 1, 33, 2, 34, 3, 35, + 4, 36, 5, 37, 6, 38, 7, 39, + 16, 48, 17, 49, 18, 50, 19, 51, + 20, 52, 21, 53, 22, 54, 23, 55); +} + +/* +** f8: +** zipq2 z0\.b, z0\.b, z1\.b +** ret +*/ +fixed_uint8_t +f8 (fixed_uint8_t z0, fixed_uint8_t z1) +{ + return __builtin_shufflevector (z0, z1, + 8, 40, 9, 41, 10, 42, 11, 43, + 12, 44, 13, 45, 14, 46, 15, 47, + 24, 56, 25, 57, 26, 58, 27, 59, + 28, 60, 29, 61, 30, 62, 31, 63); +} From patchwork Wed Nov 6 18:25:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 2007671 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XkDCy2qHpz1xyM for ; Thu, 7 Nov 2024 05:25:50 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8F3383858C39 for ; Wed, 6 Nov 2024 18:25:48 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id EA61C3858CDB; Wed, 6 Nov 2024 18:25:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EA61C3858CDB Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EA61C3858CDB Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1730917526; cv=none; b=XykXsN4Lphxavy+DQBjgTgoIMLGi1RBnyNzu8mFDMGbiFjEZRxoZDkuEugzjfKF6LSob7mM7xAViifRwMp2Nx2+lNSpcUGCBq9Wguy7yY2QXbdTpIEuUDX4/O9ksu9QdVT0H4F9zTwL0k2cTtmJGZ1GFQROZk1gsDrW5WJ+Yyeg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1730917526; c=relaxed/simple; bh=wHgbTRay0ozpEgE87s6dkyFeAf+ZWwRXsoTuhoofH1w=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=Sm0Jl8lq2W6hmZY+NjhAlDwEcvoz4fCxDCygMGUuo1FD4b+sqUH+T3bdaDUYFlnqWuMp1MLGXFvqLaasaYr5klt1iasNByCrQ/5LM+lZApmOQxn03lsRDMlvNt4eaCkglrEvPWDLAxUvKoo2aszij1oMNv6VGHYOafVo+jpb1NY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 951D8497; Wed, 6 Nov 2024 10:25:52 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1C0C43F66E; Wed, 6 Nov 2024 10:25:21 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org,richard.earnshaw@arm.com, 
ktkachov@gcc.gnu.org, richard.sandiford@arm.com Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org Subject: [PATCH 15/15] aarch64: Conditionally define __ARM_FEATURE_SVE2p1 In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000") References: Date: Wed, 06 Nov 2024 18:25:20 +0000 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Previous patches are supposed to add full support for SVE2.1, so this patch advertises that through __ARM_FEATURE_SVE2p1. pragma_cpp_predefs_3.c had one fewer pop than push. The final test is triple-nested: - armv8-a (to start with a clean slate, untainted by command-line flags) - the maximal SVE set - general-regs-only gcc/ * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Handle __ARM_FEATURE_SVE2p1. gcc/testsuite/ * gcc.target/aarch64/pragma_cpp_predefs_3.c: Add SVE2p1 tests. --- gcc/config/aarch64/aarch64-c.cc | 1 + .../gcc.target/aarch64/pragma_cpp_predefs_3.c | 84 +++++++++++++++++++ 2 files changed, 85 insertions(+) diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc index f9b9e379375..d1ae80c0bb3 100644 --- a/gcc/config/aarch64/aarch64-c.cc +++ b/gcc/config/aarch64/aarch64-c.cc @@ -214,6 +214,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile) "__ARM_FEATURE_SVE2_BITPERM", pfile); aarch64_def_or_undef (TARGET_SVE2_SHA3, "__ARM_FEATURE_SVE2_SHA3", pfile); aarch64_def_or_undef (TARGET_SVE2_SM4, "__ARM_FEATURE_SVE2_SM4", pfile); + aarch64_def_or_undef (TARGET_SVE2p1, "__ARM_FEATURE_SVE2p1", pfile); aarch64_def_or_undef (TARGET_LSE, "__ARM_FEATURE_ATOMICS", pfile); aarch64_def_or_undef (TARGET_AES, "__ARM_FEATURE_AES", pfile); diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_3.c b/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_3.c index 39128528600..f1f70ed7b5c 100644 --- a/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_3.c +++ b/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_3.c @@ -28,6 +28,10 @@ #error "__ARM_FEATURE_SVE2_SM4 is defined but should not be!" #endif +#ifdef __ARM_FEATURE_SVE2p1 +#error "__ARM_FEATURE_SVE2p1 is defined but should not be!" +#endif + #pragma GCC push_options #pragma GCC target ("arch=armv8.2-a+sve") @@ -55,6 +59,10 @@ #error "__ARM_FEATURE_SVE2_SM4 is defined but should not be!" #endif +#ifdef __ARM_FEATURE_SVE2p1 +#error "__ARM_FEATURE_SVE2p1 is defined but should not be!" +#endif + #pragma GCC pop_options #pragma GCC push_options @@ -84,6 +92,10 @@ #error "__ARM_FEATURE_SVE2_SM4 is defined but should not be!" #endif +#ifdef __ARM_FEATURE_SVE2p1 +#error "__ARM_FEATURE_SVE2p1 is defined but should not be!" +#endif + #pragma GCC pop_options #pragma GCC push_options @@ -242,6 +254,72 @@ #error "__ARM_FEATURE_SVE2_SM4 is not defined but should be!" #endif +#pragma GCC pop_options + +#pragma GCC push_options +#pragma GCC target ("arch=armv9-a+sve2p1") + +#ifndef __ARM_FEATURE_SVE +#error "__ARM_FEATURE_SVE is not defined but should be!" +#endif + +#ifndef __ARM_FEATURE_SVE2 +#error "__ARM_FEATURE_SVE2 is not defined but should be!" +#endif + +#ifdef __ARM_FEATURE_SVE2_AES +#error "__ARM_FEATURE_SVE2_AES is defined but should not be!" +#endif + +#ifdef __ARM_FEATURE_SVE2_BITPERM +#error "__ARM_FEATURE_SVE2_BITPERM is defined but should not be!" 
+#endif + +#ifdef __ARM_FEATURE_SVE2_SHA3 +#error "__ARM_FEATURE_SVE2_SHA3 is defined but should not be!" +#endif + +#ifdef __ARM_FEATURE_SVE2_SM4 +#error "__ARM_FEATURE_SVE2_SM4 is defined but should not be!" +#endif + +#ifndef __ARM_FEATURE_SVE2p1 +#error "__ARM_FEATURE_SVE2p1 is not defined but should be!" +#endif + +#pragma GCC pop_options + +#pragma GCC push_options +#pragma GCC target ("arch=armv9-a+sve2-aes+sve2-bitperm+sve2-sha3+sve2-sm4+sve2p1") + +#ifndef __ARM_FEATURE_SVE +#error "__ARM_FEATURE_SVE is not defined but should be!" +#endif + +#ifndef __ARM_FEATURE_SVE2 +#error "__ARM_FEATURE_SVE2 is not defined but should be!" +#endif + +#ifndef __ARM_FEATURE_SVE2_AES +#error "__ARM_FEATURE_SVE2_AES is not defined but should be!" +#endif + +#ifndef __ARM_FEATURE_SVE2_BITPERM +#error "__ARM_FEATURE_SVE2_BITPERM is not defined but should be!" +#endif + +#ifndef __ARM_FEATURE_SVE2_SHA3 +#error "__ARM_FEATURE_SVE2_SHA3 is not defined but should be!" +#endif + +#ifndef __ARM_FEATURE_SVE2_SM4 +#error "__ARM_FEATURE_SVE2_SM4 is not defined but should be!" +#endif + +#ifndef __ARM_FEATURE_SVE2p1 +#error "__ARM_FEATURE_SVE2p1 is not defined but should be!" +#endif + #pragma GCC push_options #pragma GCC target ("general-regs-only") @@ -269,6 +347,12 @@ #error "__ARM_FEATURE_SVE2_SM4 is defined but should not be!" #endif +#ifdef __ARM_FEATURE_SVE2p1 +#error "__ARM_FEATURE_SVE2p1 is defined but should not be!" +#endif + +#pragma GCC pop_options + #pragma GCC pop_options #pragma GCC pop_options