From patchwork Thu Jan 21 18:54:06 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1430024
Date: Thu, 21 Jan 2021 18:54:06 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH]Arm: Add NEON and MVE complex mul, mla and mls patterns.
User-Agent: Mutt/1.9.4 (2018-02-28)
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Tamar Christina via Gcc-patches
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, Ramana.Radhakrishnan@arm.com

Hi All,

This adds implementations of the optabs for complex operations.  With this,
the following C code:

  void g (float complex a[restrict N], float complex b[restrict N],
          float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] * b[i];
  }

generates

NEON:

g:
        vmov.f32        q11, #0.0  @ v4sf
        add     r3, r2, #1600
.L2:
        vmov    q8, q11  @ v4sf
        vld1.32 {q10}, [r1]!
        vld1.32 {q9}, [r0]!
        vcmla.f32       q8, q9, q10, #0
        vcmla.f32       q8, q9, q10, #90
        vst1.32 {q8}, [r2]!
        cmp     r3, r2
        bne     .L2
        bx      lr

MVE:

g:
        push    {lr}
        mov     lr, #100
        dls     lr, lr
.L2:
        vldrw.32        q1, [r1], #16
        vldrw.32        q2, [r0], #16
        vcmul.f32       q3, q2, q1, #0
        vcmla.f32       q3, q2, q1, #90
        vstrw.32        q3, [r2], #16
        le      lr, .L2
        ldr     pc, [sp], #4

instead of

g:
        add     r3, r2, #1600
.L2:
        vld2.32 {d20-d23}, [r0]!
        vld2.32 {d16-d19}, [r1]!
        vmul.f32        q14, q11, q9
        vmul.f32        q15, q11, q8
        vneg.f32        q14, q14
        vfma.f32        q15, q10, q9
        vfma.f32        q14, q10, q8
        vmov    q13, q15  @ v4sf
        vmov    q12, q14  @ v4sf
        vst2.32 {d24-d27}, [r2]!
        cmp     r3, r2
        bne     .L2
        bx      lr

and

g:
        add     r3, r2, #1600
.L2:
        vld2.32 {d20-d23}, [r0]!
        vld2.32 {d16-d19}, [r1]!
        vmul.f32        q15, q10, q8
        vmul.f32        q14, q10, q9
        vmls.f32        q15, q11, q9
        vmla.f32        q14, q11, q8
        vmov    q12, q15  @ v4sf
        vmov    q13, q14  @ v4sf
        vst2.32 {d24-d27}, [r2]!
        cmp     r3, r2
        bne     .L2
        bx      lr

respectively.

Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.
Execution tests verified with QEMU.

Generic tests for these are in the mid-end and I will enable them with a
different patch.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        * config/arm/iterators.md (rotsplit1, rotsplit2, conj_op, fcmac1,
        VCMLA_OP, VCMUL_OP): New.
        * config/arm/mve.md (mve_vcmlaq): Support vec_dup 0.
        * config/arm/neon.md (cmul3): New.
        * config/arm/unspecs.md (UNSPEC_VCMLA_CONJ, UNSPEC_VCMLA180_CONJ,
        UNSPEC_VCMUL_CONJ): New.
        * config/arm/vec-common.md (cmul3, arm_vcmla, cml4): New.

--- inline copy of patch --
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 2e0aacbd3f742538073e441b53fcffc45e37c790..b9027905307fe19d60d164cef23dac6ab119cd9b 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -1186,6 +1186,33 @@ (define_int_attr rot [(UNSPEC_VCADD90 "90")
                       (UNSPEC_VCMLA180 "180")
                       (UNSPEC_VCMLA270 "270")])
 
+;; The complex operations when performed on a real complex number require two
+;; instructions to perform the operation. e.g. complex multiplication requires
+;; two VCMUL with a particular rotation value.
+;;
+;; These values can be looked up in rotsplit1 and rotsplit2.  as an example
+;; VCMUL needs the first instruction to use #0 and the second #90.
+(define_int_attr rotsplit1 [(UNSPEC_VCMLA "0")
+                            (UNSPEC_VCMLA_CONJ "0")
+                            (UNSPEC_VCMUL "0")
+                            (UNSPEC_VCMUL_CONJ "0")
+                            (UNSPEC_VCMLA180 "180")
+                            (UNSPEC_VCMLA180_CONJ "180")])
+
+(define_int_attr rotsplit2 [(UNSPEC_VCMLA "90")
+                            (UNSPEC_VCMLA_CONJ "270")
+                            (UNSPEC_VCMUL "90")
+                            (UNSPEC_VCMUL_CONJ "270")
+                            (UNSPEC_VCMLA180 "270")
+                            (UNSPEC_VCMLA180_CONJ "90")])
+
+(define_int_attr conj_op [(UNSPEC_VCMLA180 "")
+                          (UNSPEC_VCMLA180_CONJ "_conj")
+                          (UNSPEC_VCMLA "")
+                          (UNSPEC_VCMLA_CONJ "_conj")
+                          (UNSPEC_VCMUL "")
+                          (UNSPEC_VCMUL_CONJ "_conj")])
+
 (define_int_attr mve_rot [(UNSPEC_VCADD90 "_rot90")
                           (UNSPEC_VCADD270 "_rot270")
                           (UNSPEC_VCMLA "")
@@ -1200,6 +1227,9 @@ (define_int_attr mve_rot [(UNSPEC_VCADD90 "_rot90")
 (define_int_iterator VCMUL [UNSPEC_VCMUL UNSPEC_VCMUL90
                             UNSPEC_VCMUL180 UNSPEC_VCMUL270])
 
+(define_int_attr fcmac1 [(UNSPEC_VCMLA "a") (UNSPEC_VCMLA_CONJ "a")
+                         (UNSPEC_VCMLA180 "s") (UNSPEC_VCMLA180_CONJ "s")])
+
 (define_int_attr simd32_op [(UNSPEC_QADD8 "qadd8") (UNSPEC_QSUB8 "qsub8")
                             (UNSPEC_SHADD8 "shadd8") (UNSPEC_SHSUB8 "shsub8")
                             (UNSPEC_UHADD8 "uhadd8") (UNSPEC_UHSUB8 "uhsub8")
@@ -1723,3 +1753,13 @@ (define_int_iterator VADCQ_M [VADCQ_M_U VADCQ_M_S])
 (define_int_iterator UQRSHLLQ [UQRSHLL_64 UQRSHLL_48])
 (define_int_iterator SQRSHRLQ [SQRSHRL_64 SQRSHRL_48])
 (define_int_iterator VSHLCQ_M [VSHLCQ_M_S VSHLCQ_M_U])
+
+;; Define iterators for VCMLA operations
+(define_int_iterator VCMLA_OP [UNSPEC_VCMLA
+                               UNSPEC_VCMLA_CONJ
+                               UNSPEC_VCMLA180
+                               UNSPEC_VCMLA180_CONJ])
+
+;; Define iterators for VCMLA operations as MUL
+(define_int_iterator VCMUL_OP [UNSPEC_VCMUL
+                               UNSPEC_VCMUL_CONJ])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 62ff12365ab3f92f177704927d230fefc415f1cb..465f71c4eee5f77e4d5904e8508c4134d1c9573f 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -4101,15 +4101,16 @@ (define_insn "mve_vaddlvaq_p_v4si"
 (define_insn "mve_vcmlaq"
   [
    (set (match_operand:MVE_0 0 "s_register_operand" "=w,w")
-        (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0,Dz")
-                       (match_operand:MVE_0 2 "s_register_operand" "w,w")
-                       (match_operand:MVE_0 3 "s_register_operand" "w,w")]
-         VCMLA))
+        (plus:MVE_0 (match_operand:MVE_0 1 "reg_or_zero_operand" "Dz,0")
+                    (unspec:MVE_0
+                        [(match_operand:MVE_0 2 "s_register_operand" "w,w")
+                         (match_operand:MVE_0 3 "s_register_operand" "w,w")]
+                     VCMLA)))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
   "@
-   vcmla.f%#    %q0, %q2, %q3, #
-   vcmul.f%#    %q0, %q2, %q3, #"
+   vcmul.f%#    %q0, %q2, %q3, #
+   vcmla.f%#    %q0, %q2, %q3, #"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index e904db97ea7bd4cb0f32199038ace3d334ffb8f9..fec2cc91d24b6eff7b6fc8fdd54f39b3d646c468 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2952,6 +2952,25 @@ (define_insn "neon_vcmlaq_lane"
   [(set_attr "type" "neon_fcmla")]
 )
 
+;; The complex mul operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder.  Because of this, expand early.
+(define_expand "cmul3"
+  [(set (match_operand:VDF 0 "register_operand")
+        (unspec:VDF [(match_operand:VDF 1 "register_operand")
+                     (match_operand:VDF 2 "register_operand")]
+                    VCMUL_OP))]
+  "TARGET_COMPLEX && !BYTES_BIG_ENDIAN"
+{
+  rtx res1 = gen_reg_rtx (mode);
+  rtx tmp = force_reg (mode, CONST0_RTX (mode));
+  emit_insn (gen_neon_vcmla (res1, tmp,
+                             operands[2], operands[1]));
+  emit_insn (gen_neon_vcmla (operands[0], res1,
+                             operands[2], operands[1]));
+  DONE;
+})
+
 ;; These instructions map to the __builtins for the Dot Product operations.
 (define_insn "neon_dot"
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 97a803e8da50c0119d15bcd4af47c298d3758c47..c6ebb6fc2b6a8d9e46f126dd857222a892c84093 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -510,10 +510,13 @@ (define_c_enum "unspec" [
   UNSPEC_VCMLA90
   UNSPEC_VCMLA180
   UNSPEC_VCMLA270
+  UNSPEC_VCMLA_CONJ
+  UNSPEC_VCMLA180_CONJ
   UNSPEC_VCMUL
   UNSPEC_VCMUL90
   UNSPEC_VCMUL180
   UNSPEC_VCMUL270
+  UNSPEC_VCMUL_CONJ
   UNSPEC_MATMUL_S
   UNSPEC_MATMUL_U
   UNSPEC_MATMUL_US
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index ff448da126b2250605d772ad423c70c16b753338..692b28ea8ccb18abac016a0c1b45ac7d0bf073d4 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -215,6 +215,63 @@ (define_expand "cadd3"
                       && ARM_HAVE__ARITH)) && !BYTES_BIG_ENDIAN"
 )
 
+;; The complex mul operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder.  Because of this, expand early.
+(define_expand "cmul3"
+  [(set (match_operand:VQ_HSF 0 "register_operand")
+        (unspec:VQ_HSF [(match_operand:VQ_HSF 1 "register_operand")
+                        (match_operand:VQ_HSF 2 "register_operand")]
+                       VCMUL_OP))]
+  "(TARGET_COMPLEX || (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT))
+   && !BYTES_BIG_ENDIAN"
+{
+  rtx res1 = gen_reg_rtx (mode);
+  if (TARGET_COMPLEX)
+    {
+      rtx tmp = force_reg (mode, CONST0_RTX (mode));
+      emit_insn (gen_arm_vcmla (res1, tmp,
+                                operands[2], operands[1]));
+    }
+  else
+    emit_insn (gen_arm_vcmla (res1, CONST0_RTX (mode),
+                              operands[2], operands[1]));
+
+  emit_insn (gen_arm_vcmla (operands[0], res1,
+                            operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "arm_vcmla"
+  [(set (match_operand:VF 0 "register_operand")
+        (plus:VF (match_operand:VF 1 "register_operand")
+                 (unspec:VF [(match_operand:VF 2 "register_operand")
+                             (match_operand:VF 3 "register_operand")]
+                  VCMLA)))]
+  "(TARGET_COMPLEX || (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT
+                      && ARM_HAVE__ARITH)) && !BYTES_BIG_ENDIAN"
+)
+
+;; The complex mla/mls operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder.  Because of this, expand early.
+(define_expand "cml4"
+  [(set (match_operand:VF 0 "register_operand")
+        (plus:VF (match_operand:VF 1 "register_operand")
+                 (unspec:VF [(match_operand:VF 2 "register_operand")
+                             (match_operand:VF 3 "register_operand")]
+                  VCMLA_OP)))]
+  "(TARGET_COMPLEX || (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT
+                      && ARM_HAVE__ARITH)) && !BYTES_BIG_ENDIAN"
+{
+  rtx tmp = gen_reg_rtx (mode);
+  emit_insn (gen_arm_vcmla (tmp, operands[1],
+                            operands[3], operands[2]));
+  emit_insn (gen_arm_vcmla (operands[0], tmp,
+                            operands[3], operands[2]));
+  DONE;
+})
+
 (define_expand "movmisalign"
  [(set (match_operand:VDQX 0 "neon_perm_struct_or_reg_operand")
         (unspec:VDQX [(match_operand:VDQX 1 "neon_perm_struct_or_reg_operand")]
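
Note on the rotation pairs in rotsplit1 and rotsplit2: the following scalar C sketch is not part of the patch. It models a single complex lane of VCMLA (real part in the even element, imaginary part in the odd one) under the usual reading of the Arm rotation immediates, and may help show why two instructions with different rotations add up to one complex operation. The names cplx, vcmla_lane and vcmla_pair are hypothetical and exist only for this illustration.

```c
/* One complex lane: real part in the even element, imaginary in the odd.  */
typedef struct { float re, im; } cplx;

/* Scalar model (an assumption for illustration, not GCC code) of one lane
   of "VCMLA acc, n, m, #rot".  Each rotation accumulates half of a complex
   product into ACC.  */
static cplx
vcmla_lane (cplx acc, cplx n, cplx m, int rot)
{
  switch (rot)
    {
    case 0:   acc.re += n.re * m.re; acc.im += n.re * m.im; break;
    case 90:  acc.re -= n.im * m.im; acc.im += n.im * m.re; break;
    case 180: acc.re -= n.re * m.re; acc.im -= n.re * m.im; break;
    case 270: acc.re += n.im * m.im; acc.im -= n.im * m.re; break;
    }
  return acc;
}

/* Apply two rotations back to back on the same operands, mirroring the
   two-instruction expansion described in the patch comments.  */
static cplx
vcmla_pair (cplx acc, cplx n, cplx m, int rot1, int rot2)
{
  return vcmla_lane (vcmla_lane (acc, n, m, rot1), n, m, rot2);
}
```

With a zero accumulator, n = 1+2i and m = 3+4i, the pair (0, 90) gives -5+10i, which is n*m, and the pair (0, 270) gives 11-2i, which is conj(n)*m. Starting from acc = 10+20i, the pair (180, 270) gives 15+10i, which is acc - n*m. These match the UNSPEC_VCMUL, UNSPEC_VCMUL_CONJ and UNSPEC_VCMLA180 entries of the two tables.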