From patchwork Tue Dec 21 12:31:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1571621 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=SPtYUQP1; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JJG8q62cYz9s1l for ; Tue, 21 Dec 2021 23:33:46 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 81AFA3858421 for ; Tue, 21 Dec 2021 12:33:39 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 81AFA3858421 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1640090019; bh=3SAoCer6r19QCtLyqd4Y7v2y9fed7vgtX4yosdMMiFA=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=SPtYUQP1echWyE2gNuj63/NmhDA19cWe/rI6KD2ZOHb9httHzvXwn5jYJyNDVwWbE YihthQkzlQKSVYkqKUgUB0zpjhEAOT3IA71OThUS0+Km8oU0+mtzFS4aV/LDcp53Q0 ptRvqFO4tQmQSnZv7ORa4S+S3ueYGWouSdxyPv3s= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-eopbgr60042.outbound.protection.outlook.com [40.107.6.42]) by sourceware.org (Postfix) with ESMTPS id DDBA43858400 for ; Tue, 21 Dec 2021 12:32:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org DDBA43858400 Received: from DB6PR0801CA0065.eurprd08.prod.outlook.com (2603:10a6:4:2b::33) by DB8PR08MB5340.eurprd08.prod.outlook.com (2603:10a6:10:11c::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4801.17; Tue, 21 Dec 2021 12:32:03 +0000 Received: from DB5EUR03FT014.eop-EUR03.prod.protection.outlook.com (2603:10a6:4:2b:cafe::9) by DB6PR0801CA0065.outlook.office365.com (2603:10a6:4:2b::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4801.17 via Frontend Transport; Tue, 21 Dec 2021 12:32:03 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5EUR03FT014.mail.protection.outlook.com (10.152.20.102) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4801.14 via Frontend Transport; Tue, 21 Dec 2021 12:32:02 +0000 Received: ("Tessian outbound dbb52aec1fa6:v110"); Tue, 21 Dec 2021 12:32:02 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 1e518425402fc1a9 X-CR-MTA-TID: 64aa7808 Received: from dc9c2b3b58c7.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id B4693F71-8A56-4F1B-8126-D08C3C7ED614.1; Tue, 21 Dec 2021 12:31:56 +0000 Received: from EUR02-HE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id dc9c2b3b58c7.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Tue, 21 Dec 2021 12:31:56 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=nilD7ro7q4eDG/j7J+Gl4i1lgg5vKluKZK4r+x4N5/pkP/GWK3yhIDYhSG4x8oamyzHeKiAbHt9tTDQt69ugL+bSspSLA8M6DxUlii4vMe6Zt6evsKylRMUnHe8cYIow8KOqNjR8KEj2G6dr8S4g8qhuTwlQBHcCOT0VFK29kri3jAY1I5IS0DIiKchUH8/7jE8j7zbJu7DyNAh0wMHv/gPzViep2V0jnOqyrDkJ5V38aSNrxb2NV1Be+Wd2gvQ6IL5JgAvAQESidJrvaVLqFBLqaBK7dPJfii5z5Gop7Y3mrRvt6t3PaPjkgjxmUSNPwzxAuy3P1gyimpe68mhJEw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3SAoCer6r19QCtLyqd4Y7v2y9fed7vgtX4yosdMMiFA=; b=WSfdl2QWpsOm2sXhnvyZbUgcx8Ss08mkFvsBUb3ykgcxVoRq6lfju6vWs/avGUqH3q66k4eVSCHy+zMaoYzCot2l6mQhjzpywT4h2vcdDU30B/rkm7Bm6xgMy5BsLIT1PHqDGqZUZdrxBjnxpmwSLV1Cj3Rj7igp421+KsSNThnth/dux0/opH6MDpMFwmjytZpkzMOljDu4RVv8av05m+ZvZoRqBMkqXHecPOKAj0QIgg36RTOvrraAoAyvTpx/OP7oTowVpMh46WkzzsR2efqOGKRAnOEXJdgUiQ0VY5haUT4nbkYDjYuwRqs7nowTJPXZ+2TletzjlESL10fKKg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by VI1PR08MB4253.eurprd08.prod.outlook.com (2603:10a6:803:f6::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4801.18; Tue, 21 Dec 2021 12:31:53 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::12a:3d2c:81ff:8fcc]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::12a:3d2c:81ff:8fcc%8]) with mapi id 15.20.4801.022; Tue, 21 Dec 2021 12:31:53 +0000 Date: Tue, 21 Dec 2021 12:31:45 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH][AArch32]: correct usdot-product RTL patterns. Message-ID: Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28) X-ClientProxiedBy: SA0PR12CA0006.namprd12.prod.outlook.com (2603:10b6:806:6f::11) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-Office365-Filtering-Correlation-Id: 2dc2b364-b6fe-42b0-8a5d-08d9c47de444 X-MS-TrafficTypeDiagnostic: VI1PR08MB4253:EE_|DB5EUR03FT014:EE_|DB8PR08MB5340:EE_ X-LD-Processed: f34e5979-57d9-4aaa-ad4d-b122a662184d,ExtAddr X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true NoDisclaimer: true X-MS-Oob-TLC-OOBClassifiers: OLM:7219;OLM:7219; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: pBMZnrwfc9o9QplP8T6mVj8kUlFQB8OFTpFr5vFlQt0HJQR9F6IRJtFbgbTeAsfoSKY77SAtCPNNcFx3y5dA0jMVML0IQHHzRHc2jt4aPJwI1ad/rmGefXwbUGUccGLAFk4o3lRXXaUjN0eoCxscEd+fxdhmnwaTAEySSAsKdfqZT8pxJ/719R+2CeISxyH2LPXVwEqZqJKUaM9Rd8Nii2iTTN7EhzltwqxkEV2g99K1KIbeqVVQw4NH9ePU7R6dGOZ/cAUJ6kA4FumAK5NSyrJ4t+8qZlT1bUbNuSOy7cYH6LCi7Kz/7be3q7TjvNzock8KNXx9JS3Eir7ylds5xX5f6xbrDnhXefZx7iuhkmNIZHuSpGFuvwbqq+q27tz0Vga7xiUc5fRbpKMey1kw0eYAApfM44puhsCEE3GmlNvS5cOs0mMolL3x+hZ1qS9t7MlG4F5AAqRtDutNgERFaDfNJnSplu7i2UxySP14trXiGzkg24Eant6RE8FPYoI06Tl6x1SDGNtj0badOefdLKJEwMONcjA8yh7AkvW40Lyf2+htE3kLH83w2XCHiO4IhJ5sHMy/wiUCSEzYQ95B4loxMqEejsH7E90laBOolcL5o2wUK+Zk9JOMkNXAvyRivCePiDj3dHeIEjumodgDIElM57HNNhjI/b6rtdhyApuGnwGvSQcktzVxHLyF3IYA1W2JHr66zMsg8vKRIv3MzTGybrNpVUvi+G8s3nUcnTOKzc8skXixfbtzlJJMhMmz7jppxHiHcGIa5N0HTsZ5u6bqcwSjJX4INznIKeb/lVRyYeVuy0PRI9Hp3Y8HxUy345PIbAk+wCRZmi9cOIm7oYarvU54uaX/bwezHj+HvLBB6FkzaFSjdvNVLnnNYtig X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(366004)(6512007)(66946007)(66556008)(186003)(6506007)(2906002)(508600001)(38350700002)(36756003)(6486002)(4326008)(38100700002)(66476007)(8676002)(4743002)(26005)(966005)(86362001)(5660300002)(316002)(8936002)(6916009)(235185007)(2616005)(6666004)(84970400001)(44144004)(33964004)(52116002)(44832011)(4216001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR08MB4253 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT014.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 18cd3b6a-c937-4fa8-27e9-08d9c47dde77 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: gnevud2Jj4fA8U3gYK6JGApD2SD916RQGBnRQuoZAAKs52ogDY5Go8FCG4VYS/7aO/Y7EZ/xAnLnx+gkkcKUZ2n6mcNuuJ4owYsFpis3E1AfMwr4VmIezLite8DyHDvvu4CRu5ZDEZdoR7tX+PrcXL52HIx5oSB1gOSJTajMY5fbt9c8R32RadnkrKza/5DZg/nXYaSKYOSN0Ohp9PvGnDD3C1Ru7xjbm6TZFBkAjiy+L7aiFPR7pfwwoFBRzjA9eT7uUMZg6B2uwXwQuHLswG/dsxibeaLNGN1IM0hX9lV4cYTSIwi2VWIdBljZ4PGNbv2rG0VTC3aFasUPEdOscVyg5f+elYHkspeLDJ0/KGqnEz9miJTmnYCWPe5ZpYDffseO48evPdPeDzD3ibsyV5v2ULbJiS96wfKRcTIEZ/NUl92EI5SUG2xgPG8fovOnq+BNWcUZx4JLJj5Wc9ALvHc6FagX625Tz6fN5qpYmxy+SS9Y/8cdeleZUEG0iAEqUZbCYs/kJbFUTj5xiWdC4lkDAgQWk4hZqO7mCf5HdqznX/hfmLwQRumxYgwsZl/Dfjv2rTjBI3w2n3vTBEKHGW25F5GIwOZgOHe/wm1vSoCWRaLB7A+63BtKyc3EYYl9QYY60SfKh/UDdI0APFDlbZ4szZ6o3tFaZTJfPjpWBMP16j9X9QXvDfja9flVH5GUawgniajdcHFPDvlYWYjKf7CXSZK6/qoctE9buhMJ3CMJexGAT6nxZ3LHMiKBCjDVpFHeSZBgb/clDT4C44Q+2NbHDCX9NTF2klFht2ieyvNtbHUuJQt23nmQXVpcdJls3LNy3/Wrx35MEAcEKOWXt1BERmmAJM6bJeJw+Xnl+eLRSfr48oxqRW9vNgFSvT/I X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(36840700001)(46966006)(4743002)(316002)(26005)(6512007)(47076005)(186003)(81166007)(8676002)(6666004)(84970400001)(82310400004)(336012)(6506007)(86362001)(508600001)(8936002)(44144004)(6486002)(36860700001)(33964004)(5660300002)(4326008)(36756003)(235185007)(70206006)(2906002)(2616005)(966005)(356005)(6916009)(44832011)(70586007)(4216001)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Dec 2021 12:32:02.9132 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 2dc2b364-b6fe-42b0-8a5d-08d9c47de444 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5EUR03FT014.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR08MB5340 X-Spam-Status: No, score=-13.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: Richard.Earnshaw@arm.com, nd@arm.com, Ramana.Radhakrishnan@arm.com Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Hi All, There was a bug in the ACLE specication for dot product which has now been fixed[1]. This means some intrinsics were missing and are added by this patch. Bootstrapped and regtested on arm-none-linux-gnueabihf and no issues. Ok for master? [1] https://github.com/ARM-software/acle/releases/tag/r2021Q3 Thanks, Tamar gcc/ChangeLog: * config/arm/arm_neon.h (vusdotq_s32, vusdot_laneq_s32, vusdotq_laneq_s32, vsudot_laneq_s32, vsudotq_laneq_s32): New * config/arm/arm_neon_builtins.def (usdot): Add V16QI. (usdot_laneq, sudot_laneq): New. * config/arm/neon.md (neon_dot_laneq): New. (neon_dot_lane): Remote unneeded code. gcc/testsuite/ChangeLog: * gcc.target/arm/simd/vdot-2-1.c: Add new tests. * gcc.target/arm/simd/vdot-2-2.c: Likewise and fix output. --- inline copy of patch -- diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h index af6ac63dc3b47830d92f199d93153ff510f658e9..2255d600549a2a1e5dbcebc03f7d6a63bab9f5aa 100644 diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h index af6ac63dc3b47830d92f199d93153ff510f658e9..2255d600549a2a1e5dbcebc03f7d6a63bab9f5aa 100644 --- a/gcc/config/arm/arm_neon.h +++ b/gcc/config/arm/arm_neon.h @@ -18930,6 +18930,13 @@ vusdot_s32 (int32x2_t __r, uint8x8_t __a, int8x8_t __b) return __builtin_neon_usdotv8qi_ssus (__r, __a, __b); } +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vusdotq_s32 (int32x4_t __r, uint8x16_t __a, int8x16_t __b) +{ + return __builtin_neon_usdotv16qi_ssus (__r, __a, __b); +} + __extension__ extern __inline int32x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vusdot_lane_s32 (int32x2_t __r, uint8x8_t __a, @@ -18962,6 +18969,38 @@ vsudotq_lane_s32 (int32x4_t __r, int8x16_t __a, return __builtin_neon_sudot_lanev16qi_sssus (__r, __a, __b, __index); } +__extension__ extern __inline int32x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vusdot_laneq_s32 (int32x2_t __r, uint8x8_t __a, + int8x16_t __b, const int __index) +{ + return __builtin_neon_usdot_laneqv8qi_ssuss (__r, __a, __b, __index); +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vusdotq_laneq_s32 (int32x4_t __r, uint8x16_t __a, + int8x16_t __b, const int __index) +{ + return __builtin_neon_usdot_laneqv16qi_ssuss (__r, __a, __b, __index); +} + +__extension__ extern __inline int32x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vsudot_laneq_s32 (int32x2_t __r, int8x8_t __a, + uint8x16_t __b, const int __index) +{ + return __builtin_neon_sudot_laneqv8qi_sssus (__r, __a, __b, __index); +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vsudotq_laneq_s32 (int32x4_t __r, int8x16_t __a, + uint8x16_t __b, const int __index) +{ + return __builtin_neon_sudot_laneqv16qi_sssus (__r, __a, __b, __index); +} + #pragma GCC pop_options #pragma GCC pop_options diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def index f83dd4327c16c0af68f72eb6d9ca8cf21e2e56b5..1c150ed3b650a003b44901b4d160a7d6f595f057 100644 --- a/gcc/config/arm/arm_neon_builtins.def +++ b/gcc/config/arm/arm_neon_builtins.def @@ -345,9 +345,11 @@ VAR2 (UMAC_LANE, udot_lane, v8qi, v16qi) VAR2 (MAC_LANE, sdot_laneq, v8qi, v16qi) VAR2 (UMAC_LANE, udot_laneq, v8qi, v16qi) -VAR1 (USTERNOP, usdot, v8qi) +VAR2 (USTERNOP, usdot, v8qi, v16qi) VAR2 (USMAC_LANE_QUADTUP, usdot_lane, v8qi, v16qi) VAR2 (SUMAC_LANE_QUADTUP, sudot_lane, v8qi, v16qi) +VAR2 (USMAC_LANE_QUADTUP, usdot_laneq, v8qi, v16qi) +VAR2 (SUMAC_LANE_QUADTUP, sudot_laneq, v8qi, v16qi) VAR4 (BINOP, vcadd90, v4hf, v2sf, v8hf, v4sf) VAR4 (BINOP, vcadd270, v4hf, v2sf, v8hf, v4sf) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 848166311b5f82c5facb66e97c2260a5aba5d302..1707d8e625079b83497a3db44db5e33405bb5fa1 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -2977,9 +2977,33 @@ (define_insn "neon_dot_lane" DOTPROD_I8MM) (match_operand:VCVTI 1 "register_operand" "0")))] "TARGET_I8MM" + "vdot.\\t%0, %2, %P3[%c4]" + [(set_attr "type" "neon_dot")] +) + +;; These instructions map to the __builtins for the Dot Product +;; indexed operations in the v8.6 I8MM extension. +(define_insn "neon_dot_laneq" + [(set (match_operand:VCVTI 0 "register_operand" "=w") + (plus:VCVTI + (unspec:VCVTI [(match_operand: 2 "register_operand" "w") + (match_operand:V16QI 3 "register_operand" "t") + (match_operand:SI 4 "immediate_operand" "i")] + DOTPROD_I8MM) + (match_operand:VCVTI 1 "register_operand" "0")))] + "TARGET_I8MM" { - operands[4] = GEN_INT (INTVAL (operands[4])); - return "vdot.\\t%0, %2, %P3[%c4]"; + int lane = INTVAL (operands[4]); + if (lane > GET_MODE_NUNITS (V2SImode) - 1) + { + operands[4] = GEN_INT (lane - GET_MODE_NUNITS (V2SImode)); + return "vdot.\\t%0, %2, %f3[%c4]"; + } + else + { + operands[4] = GEN_INT (lane); + return "vdot.\\t%0, %2, %e3[%c4]"; + } } [(set_attr "type" "neon_dot")] ) diff --git a/gcc/testsuite/gcc.target/arm/simd/vdot-2-1.c b/gcc/testsuite/gcc.target/arm/simd/vdot-2-1.c index 88b80cff2329d9c502f40a31bbef70d26251c909..35d713f6a60d3d5880ddc8b43e238b7403b4f135 100644 --- a/gcc/testsuite/gcc.target/arm/simd/vdot-2-1.c +++ b/gcc/testsuite/gcc.target/arm/simd/vdot-2-1.c @@ -2,7 +2,7 @@ /* { dg-require-effective-target arm_hard_ok } */ /* { dg-require-effective-target arm_v8_2a_i8mm_ok } */ /* { dg-add-options arm_v8_2a_i8mm } */ -/* { dg-additional-options "-O -save-temps -mfloat-abi=hard" } */ +/* { dg-additional-options "-O -save-temps -mfloat-abi=hard -mfpu=auto" } */ /* { dg-final { check-function-bodies "**" "" } } */ #include @@ -20,6 +20,17 @@ int32x2_t usfoo (int32x2_t r, uint8x8_t x, int8x8_t y) return vusdot_s32 (r, x, y); } +/* +**usfooq: +** ... +** vusdot\.s8 q0, q1, q2 +** bx lr +*/ +int32x4_t usfooq (int32x4_t r, uint8x16_t x, int8x16_t y) +{ + return vusdotq_s32 (r, x, y); +} + /* **usfoo_lane: ** ... @@ -66,6 +77,52 @@ int32x4_t sfooq_lane (int32x4_t r, int8x16_t x, uint8x8_t y) return vsudotq_lane_s32 (r, x, y, 1); } +/* +**usfoo_laneq: +** ... +** vusdot\.s8 d0, d1, d3\[0\] +** bx lr +*/ +int32x2_t usfoo_laneq (int32x2_t r, uint8x8_t x, int8x16_t y) +{ + return vusdot_laneq_s32 (r, x, y, 2); +} + +/* +**usfooq_laneq: +** ... +** vusdot\.s8 q0, q1, d5\[1\] +** bx lr +*/ +int32x4_t usfooq_laneq (int32x4_t r, uint8x16_t x, int8x16_t y) +{ + return vusdotq_laneq_s32 (r, x, y, 3); +} + +/* Signed-Unsigned Dot Product instructions. */ + +/* +**sfoo_laneq: +** ... +** vsudot\.u8 d0, d1, d3\[0\] +** bx lr +*/ +int32x2_t sfoo_laneq (int32x2_t r, int8x8_t x, uint8x16_t y) +{ + return vsudot_laneq_s32 (r, x, y, 2); +} + +/* +**sfooq_laneq: +** ... +** vsudot\.u8 q0, q1, d5\[1\] +** bx lr +*/ +int32x4_t sfooq_laneq (int32x4_t r, int8x16_t x, uint8x16_t y) +{ + return vsudotq_laneq_s32 (r, x, y, 3); +} + /* **usfoo_untied: ** ... diff --git a/gcc/testsuite/gcc.target/arm/simd/vdot-2-2.c b/gcc/testsuite/gcc.target/arm/simd/vdot-2-2.c index 1c74718ca5644be05b4d4839c3a7ea40bff11e40..c57dd423dbc45b2f9f7890ada0f081f80381b05c 100644 --- a/gcc/testsuite/gcc.target/arm/simd/vdot-2-2.c +++ b/gcc/testsuite/gcc.target/arm/simd/vdot-2-2.c @@ -2,7 +2,7 @@ /* { dg-require-effective-target arm_hard_ok } */ /* { dg-require-effective-target arm_v8_2a_i8mm_ok } */ /* { dg-add-options arm_v8_2a_i8mm } */ -/* { dg-additional-options "-O -save-temps -mbig-endian -mfloat-abi=hard" } */ +/* { dg-additional-options "-O -save-temps -mfloat-abi=hard -mbig-endian -mfpu=auto" } */ /* { dg-final { check-function-bodies "**" "" } } */ #include @@ -20,6 +20,17 @@ int32x2_t usfoo (int32x2_t r, uint8x8_t x, int8x8_t y) return vusdot_s32 (r, x, y); } +/* +**usfooq: +** ... +** vusdot\.s8 q0, q1, q2 +** bx lr +*/ +int32x4_t usfooq (int32x4_t r, uint8x16_t x, int8x16_t y) +{ + return vusdotq_s32 (r, x, y); +} + /* **usfoo_lane: ** ... @@ -66,6 +77,52 @@ int32x4_t sfooq_lane (int32x4_t r, int8x16_t x, uint8x8_t y) return vsudotq_lane_s32 (r, x, y, 1); } +/* +**usfoo_laneq: +** ... +** vusdot\.s8 d0, d1, d3\[0\] +** bx lr +*/ +int32x2_t usfoo_laneq (int32x2_t r, uint8x8_t x, int8x16_t y) +{ + return vusdot_laneq_s32 (r, x, y, 2); +} + +/* +**usfooq_laneq: +** ... +** vusdot\.s8 q0, q1, d5\[1\] +** bx lr +*/ +int32x4_t usfooq_laneq (int32x4_t r, uint8x16_t x, int8x16_t y) +{ + return vusdotq_laneq_s32 (r, x, y, 3); +} + +/* Signed-Unsigned Dot Product instructions. */ + +/* +**sfoo_laneq: +** ... +** vsudot\.u8 d0, d1, d3\[0\] +** bx lr +*/ +int32x2_t sfoo_laneq (int32x2_t r, int8x8_t x, uint8x16_t y) +{ + return vsudot_laneq_s32 (r, x, y, 2); +} + +/* +**sfooq_laneq: +** ... +** vsudot\.u8 q0, q1, d5\[1\] +** bx lr +*/ +int32x4_t sfooq_laneq (int32x4_t r, int8x16_t x, uint8x16_t y) +{ + return vsudotq_laneq_s32 (r, x, y, 3); +} + /* **usfoo_untied: ** ... @@ -89,3 +146,4 @@ int32x2_t usfoo_lane_untied (int32x2_t unused, int32x2_t r, uint8x8_t x, int8x8_ { return vusdot_lane_s32 (r, x, y, 0); } +