From patchwork Fri Nov 12 12:08:04 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1554331
Date: Fri, 12 Nov 2021 12:08:04 +0000
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com,
 Marcus.Shawcroft@arm.com
Subject: [PATCH]AArch64 Optimize right shift rounding narrowing
List-Id: Gcc-patches mailing list

Hi All,

This optimizes right shift rounding narrow instructions
to rounding add narrow high, with one input vector set to zero, when the shift
amount is half the element width of the original input type.

i.e.

uint32x4_t foo (uint64x2_t a, uint64x2_t b)
{
  return vrshrn_high_n_u64 (vrshrn_n_u64 (a, 32), b, 32);
}

now generates:

foo:
        movi    v3.4s, 0
        raddhn  v0.2s, v2.2d, v3.2d
        raddhn2 v0.4s, v2.2d, v3.2d

instead of:

foo:
        rshrn   v0.2s, v0.2d, 32
        rshrn2  v0.4s, v1.2d, 32
        ret

On Arm cores this is an improvement in both latency and throughput.
Because a vector zero is needed, I created a new function,
aarch64_gen_shareable_zero, that creates the zero in V4SI mode and then takes
a subreg of it in the desired mode.  This allows CSE to share all the zero
constants.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (aarch64_gen_shareable_zero): New.
	* config/aarch64/aarch64-simd.md (aarch64_rshrn, aarch64_rshrn2):
	* config/aarch64/aarch64.c (aarch64_gen_shareable_zero): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/advsimd-intrinsics/shrn-1.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/shrn-2.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/shrn-3.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/shrn-4.c: New test.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index f7887d06139f01c1591c4e755538d94e5e608a52..f7f5cae82bc9198e54d0298f25f7c0f5902d5fb1 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -846,6 +846,7 @@ const char *aarch64_output_move_struct (rtx *operands);
 rtx aarch64_return_addr_rtx (void);
 rtx aarch64_return_addr (int, rtx);
 rtx aarch64_simd_gen_const_vector_dup (machine_mode, HOST_WIDE_INT);
+rtx aarch64_gen_shareable_zero (machine_mode);
 bool aarch64_simd_mem_operand_p (rtx);
 bool aarch64_sve_ld1r_operand_p (rtx);
 bool aarch64_sve_ld1rq_operand_p (rtx);
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index c71658e2bf52b26bf9fc9fa702dd5446447f4d43..d7f8694add540e32628893a7b7471c08de6f760f 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1956,20 +1956,32 @@ (define_expand "aarch64_rshrn"
    (match_operand:SI 2 "aarch64_simd_shift_imm_offset_")]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_simd_gen_const_vector_dup (mode,
-                                                     INTVAL (operands[2]));
-    rtx tmp = gen_reg_rtx (mode);
-    if (BYTES_BIG_ENDIAN)
-      emit_insn (gen_aarch64_rshrn_insn_be (tmp, operands[1],
-                                operands[2], CONST0_RTX (mode)));
+    if (INTVAL (operands[2]) == GET_MODE_UNIT_BITSIZE (mode))
+      {
+        rtx tmp0 = aarch64_gen_shareable_zero (mode);
+        emit_insn (gen_aarch64_raddhn (operands[0], operands[1], tmp0));
+      }
     else
-      emit_insn (gen_aarch64_rshrn_insn_le (tmp, operands[1],
-                                operands[2], CONST0_RTX (mode)));
-
-    /* The intrinsic expects a narrow result, so emit a subreg that will get
-       optimized away as appropriate.  */
-    emit_move_insn (operands[0], lowpart_subreg (mode, tmp,
-                                                 mode));
+      {
+        rtx tmp = gen_reg_rtx (mode);
+        operands[2] = aarch64_simd_gen_const_vector_dup (mode,
+                                                         INTVAL (operands[2]));
+        if (BYTES_BIG_ENDIAN)
+          emit_insn (
+                gen_aarch64_rshrn_insn_be (tmp, operands[1],
+                                           operands[2],
+                                           CONST0_RTX (mode)));
+        else
+          emit_insn (
+                gen_aarch64_rshrn_insn_le (tmp, operands[1],
+                                           operands[2],
+                                           CONST0_RTX (mode)));
+
+        /* The intrinsic expects a narrow result, so emit a subreg that will
+           get optimized away as appropriate.  */
+        emit_move_insn (operands[0], lowpart_subreg (mode, tmp,
+                                                     mode));
+      }
     DONE;
   }
 )
@@ -2049,14 +2061,27 @@ (define_expand "aarch64_rshrn2"
    (match_operand:SI 3 "aarch64_simd_shift_imm_offset_")]
   "TARGET_SIMD"
   {
-    operands[3] = aarch64_simd_gen_const_vector_dup (mode,
-                                                     INTVAL (operands[3]));
-    if (BYTES_BIG_ENDIAN)
-      emit_insn (gen_aarch64_rshrn2_insn_be (operands[0], operands[1],
-                                             operands[2], operands[3]));
+    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+      {
+        rtx tmp = aarch64_gen_shareable_zero (mode);
+        emit_insn (gen_aarch64_raddhn2 (operands[0], operands[1],
+                                        operands[2], tmp));
+      }
     else
-      emit_insn (gen_aarch64_rshrn2_insn_le (operands[0], operands[1],
-                                             operands[2], operands[3]));
+      {
+        operands[3] = aarch64_simd_gen_const_vector_dup (mode,
+                                                         INTVAL (operands[3]));
+        if (BYTES_BIG_ENDIAN)
+          emit_insn (gen_aarch64_rshrn2_insn_be (operands[0],
+                                                 operands[1],
+                                                 operands[2],
+                                                 operands[3]));
+        else
+          emit_insn (gen_aarch64_rshrn2_insn_le (operands[0],
+                                                 operands[1],
+                                                 operands[2],
+                                                 operands[3]));
+      }
     DONE;
   }
 )
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fdf05505846721b02059df494d6395ae9423a8ef..11201ea3498beb270c0a7f8da5f5009d710535ee 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -20397,6 +20397,18 @@ aarch64_mov_operand_p (rtx x, machine_mode mode)
     == SYMBOL_TINY_ABSOLUTE;
 }
 
+/* Create a 0 constant that is based on V4SI to allow CSE to optimally share
+   the constant creation.  */
+
+rtx
+aarch64_gen_shareable_zero (machine_mode mode)
+{
+  machine_mode zmode = V4SImode;
+  rtx tmp = gen_reg_rtx (zmode);
+  emit_move_insn (tmp, CONST0_RTX (zmode));
+  return lowpart_subreg (mode, tmp, zmode);
+}
+
 /* Return a const_int vector of VAL.  */
 rtx
 aarch64_simd_gen_const_vector_dup (machine_mode mode, HOST_WIDE_INT val)
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..4bc3aa9563ee7d0dc46557d30d9a29149706229d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
+
+#include <arm_neon.h>
+
+uint8x16_t foo (uint32x4_t a, uint32x4_t b)
+{
+  uint16x4_t a1 = vrshrn_n_u32 (a, 16);
+  uint16x8_t b1 = vrshrn_high_n_u32 (a1, b, 16);
+  return vrshrn_high_n_u16 (vrshrn_n_u16 (b1, 8), b1, 8);
+}
+
+/* { dg-final { scan-assembler-times {\tmovi\t} 1 } } */
+/* { dg-final { scan-assembler-times {\traddhn\t} 2 } } */
+/* { dg-final { scan-assembler-times {\traddhn2\t} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..09d913e85524f06367c1c2cf51dda0f57578e9ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+
+#include <arm_neon.h>
+
+uint32x4_t foo (uint64x2_t a, uint64x2_t b)
+{
+  return vrshrn_high_n_u64 (vrshrn_n_u64 (a, 32), b, 32);
+}
+
+/* { dg-final { scan-assembler-times {\traddhn\t} 1 } } */
+/* { dg-final { scan-assembler-times {\traddhn2\t} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..bdccbb3410f049d7e45aabdcc3d2964fbabca807
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-3.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+
+#include <arm_neon.h>
+
+uint16x8_t foo (uint32x4_t a, uint32x4_t b)
+{
+  return vrshrn_high_n_u32 (vrshrn_n_u32 (a, 16), b, 16);
+}
+
+/* { dg-final { scan-assembler-times {\traddhn\t} 1 } } */
+/* { dg-final { scan-assembler-times {\traddhn2\t} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-4.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..4b23eddb85891975b8e122060e2a9ebfe56d842c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/shrn-4.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+
+#include <arm_neon.h>
+
+uint8x16_t foo (uint16x8_t a, uint16x8_t b)
+{
+  return vrshrn_high_n_u16 (vrshrn_n_u16 (a, 8), b, 8);
+}
+
+/* { dg-final { scan-assembler-times {\traddhn\t} 1 } } */
+/* { dg-final { scan-assembler-times {\traddhn2\t} 1 } } */