From patchwork Fri May 13 17:11:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1630861
Date: Fri, 13 May 2022 18:11:42 +0100
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 2/3]AArch64 Promote function arguments using a paradoxical subreg when beneficial.
List-Id: Gcc-patches mailing list
From: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com

Hi All,

PROMOTE_MODE always promotes 8-bit and 16-bit parameters to 32 bits.
This promotion is not required by the ABI, which states:

```
C.9	If the argument is an Integral or Pointer Type, the size of the argument is
less than or equal to 8 bytes and the NGRN is less than 8, the argument is
copied to the least significant bits in x[NGRN]. The NGRN is incremented by one.
The argument has now been allocated.

C.16	If the size of the argument is less than 8 bytes then the size of the
argument is set to 8 bytes. The effect is as if the argument was copied to the
least significant bits of a 64-bit register and the remaining bits filled with
unspecified values
```

That is, the remaining bits in the register are unspecified and callees cannot
assume any particular value for them.

This means that we can avoid the promotion and still get correct code, as the
language level promotion rules require values to be extended when the bits are
significant.

So if we are e.g. OR-ing two 8-bit values no extend is needed, as the top bits
are irrelevant.  If we are doing e.g. addition, then the top bits *might* be
relevant depending on the result type.  But the middle end will always contain
the appropriate extend in those cases.

The mid-end also has optimizations around this assumption and the AArch64 port
actively undoes them.

So for instance

uint16_t fd (uint8_t xr){
    return xr + 1;
}

uint8_t fd2 (uint8_t xr){
    return xr + 1;
}

should produce

fd:                                     // @fd
        and     w8, w0, #0xff
        add     w0, w8, #1
        ret
fd2:                                    // @fd2
        add     w0, w0, #1
        ret

like clang does instead of

fd:
        and     w0, w0, 255
        add     w0, w0, 1
        ret
fd2:
        and     w0, w0, 255
        add     w0, w0, 1
        ret

like we do now.  Removing this forced extension preserves correctness and fixes
various codegen defects.  It also brings us in line with clang.

Note that C, C++ and Fortran etc. all correctly specify what should happen
w.r.t. extends and e.g. array indexing, pointer arithmetic etc., so we never get
incorrect code.

There is however a second reason for doing this promotion: RTL efficiency.
The promotion stops us from having to promote the values to SI to be able to
use them in instructions and then truncating again afterwards.

To get both the efficiency and the simpler RTL we can instead promote to a
paradoxical subreg.  This patch implements the hook for AArch64 and adds an
explicit opt-out for values that feed into comparisons.  This is done because:

1. Our comparison patterns already allow us to absorb the zero extend.
2. The extension allows us to use cbz/cbnz/tbz etc.
   In some cases such as

int foo (char a, char b)
{
   if (a)
     if (b)
       bar1 ();
     else
       ...
    else
     if (b)
       bar2 ();
     else
       ...
}

by zero extending the value we can avoid having to repeatedly test the value
before a branch.  Allowing the zero extend also allows our existing `ands`
patterns to work as expected.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.  I have to
commit this and the last patch together, but for ease of review I have split
them up here.  However, 209 missed optimization xfails are fixed.

No performance difference on SPECCPU 2017, and no failures.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_promote_function_args_subreg_p):
	(TARGET_PROMOTE_FUNCTION_ARGS_SUBREG_P): New.
	* config/aarch64/aarch64.h (PROMOTE_MODE): Expand doc.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/apc-subreg.c: New test.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index efa46ac0b8799b5849b609d591186e26e5cb37ff..cc74a816fcc6458aa065246a30a4d2184692ad74 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -34,7 +34,8 @@
 
 #define REGISTER_TARGET_PRAGMAS() aarch64_register_pragmas ()
 
-/* Target machine storage layout.  */
+/* Target machine storage layout.  See also
+   TARGET_PROMOTE_FUNCTION_ARGS_SUBREG_P.  */
 
 #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)	\
   if (GET_MODE_CLASS (MODE) == MODE_INT		\
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 2f559600cff55af9d468e8d0810545583cc986f5..252d6c2af72afc1dfee1a86644a5753784b41f59 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -3736,6 +3736,57 @@ aarch64_array_mode_supported_p (machine_mode mode,
   return false;
 }
 
+/* Implement target hook TARGET_PROMOTE_FUNCTION_ARGS_SUBREG_P to complement
+   PROMOTE_MODE.  If any argument promotion was done, do it using subregs.  */
+static bool
+aarch64_promote_function_args_subreg_p (machine_mode mode,
+                                        machine_mode promoted_mode,
+                                        int /* unsignedp */, tree parm)
+{
+  bool candidate_p = GET_MODE_CLASS (mode) == MODE_INT
+                     && GET_MODE_CLASS (promoted_mode) == MODE_INT
+                     && known_lt (GET_MODE_SIZE (mode), 4)
+                     && promoted_mode == SImode;
+
+  if (!candidate_p)
+    return false;
+
+  if (!parm || !is_gimple_reg (parm))
+    return true;
+
+  tree var = parm;
+  if (!VAR_P (var))
+    {
+      if (TREE_CODE (parm) == SSA_NAME
+          && !(var = SSA_NAME_VAR (var)))
+        return true;
+      else if (TREE_CODE (parm) != PARM_DECL)
+        return true;
+    }
+
+  /* If the variable is used inside a comparison which sets CC then we should
+     still promote using an extend.  By doing this we make it easier to use
+     cbz/cbnz and also avoid repeatedly having to test the value in certain
+     circumstances, such as nested ifs that test the same value with calls
+     in between.  */
+  tree ssa_var = ssa_default_def (cfun, var);
+  if (!ssa_var)
+    return true;
+
+  const ssa_use_operand_t *const head = &(SSA_NAME_IMM_USE_NODE (ssa_var));
+  const ssa_use_operand_t *ptr;
+
+  for (ptr = head->next; ptr != head; ptr = ptr->next)
+    if (USE_STMT (ptr) && is_gimple_assign (USE_STMT (ptr)))
+      {
+        tree_code code = gimple_assign_rhs_code (USE_STMT (ptr));
+        if (TREE_CODE_CLASS (code) == tcc_comparison)
+          return false;
+      }
+
+  return true;
+}
+
 /* MODE is some form of SVE vector mode.  For data modes, return the number
    of vector register bits that each element of MODE occupies, such as 64
    for both VNx2DImode and VNx2SImode (where each 32-bit value is stored
@@ -27490,6 +27541,10 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_ARRAY_MODE_SUPPORTED_P
 #define TARGET_ARRAY_MODE_SUPPORTED_P aarch64_array_mode_supported_p
 
+#undef TARGET_PROMOTE_FUNCTION_ARGS_SUBREG_P
+#define TARGET_PROMOTE_FUNCTION_ARGS_SUBREG_P \
+  aarch64_promote_function_args_subreg_p
+
 #undef TARGET_VECTORIZE_CREATE_COSTS
 #define TARGET_VECTORIZE_CREATE_COSTS aarch64_vectorize_create_costs
 
diff --git a/gcc/testsuite/gcc.target/aarch64/apc-subreg.c b/gcc/testsuite/gcc.target/aarch64/apc-subreg.c
new file mode 100644
index 0000000000000000000000000000000000000000..2d7563a11ce11fa677f7ad4bf2a090e6a136e4d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/apc-subreg.c
@@ -0,0 +1,103 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+/*
+** f0:
+**	mvn	w0, w0
+**	ret
+*/
+uint8_t f0 (uint8_t xr){
+    return (uint8_t) (0xff - xr);
+}
+
+/*
+** f1:
+**	mvn	w0, w0
+**	ret
+*/
+int8_t f1 (int8_t xr){
+    return (int8_t) (0xff - xr);
+}
+
+/*
+** f2:
+**	mvn	w0, w0
+**	ret
+*/
+uint16_t f2 (uint16_t xr){
+    return (uint16_t) (0xffFF - xr);
+}
+
+/*
+** f3:
+**	mvn	w0, w0
+**	ret
+*/
+uint32_t f3 (uint32_t xr){
+    return (uint32_t) (0xffFFffff - xr);
+}
+
+/*
+** f4:
+**	mvn	x0, x0
+**	ret
+*/
+uint64_t f4 (uint64_t xr){
+    return (uint64_t) (0xffFFffffffffffff - xr);
+}
+
+/*
+** f5:
+**	mvn	w0, w0
+**	sub	w0, w0, w1
+**	ret
+*/
+uint8_t f5 (uint8_t xr, uint8_t xc){
+    return (uint8_t) (0xff - xr - xc);
+}
+
+/*
+** f6:
+**	mvn	w0, w0
+**	and	w0, w0, 255
+**	and	w1, w1, 255
+**	mul	w0, w0, w1
+**	ret
+*/
+uint16_t f6 (uint8_t xr, uint8_t xc){
+    return ((uint8_t) (0xff - xr)) * xc;
+}
+
+/*
+** f7:
+**	and	w0, w0, 255
+**	and	w1, w1, 255
+**	mul	w0, w0, w1
+**	ret
+*/
+uint16_t f7 (uint8_t xr, uint8_t xc){
+    return xr * xc;
+}
+
+/*
+** f8:
+**	mul	w0, w0, w1
+**	and	w0, w0, 255
+**	ret
+*/
+uint16_t f8 (uint8_t xr, uint8_t xc){
+    return (uint8_t)(xr * xc);
+}
+
+/*
+** f9:
+**	and	w0, w0, 255
+**	add	w0, w0, w1
+**	ret
+*/
+uint16_t f9 (uint8_t xr, uint16_t xc){
+    return xr + xc;
+}
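
For reviewers who want to sanity-check the reasoning in the cover text, here is a small host-side sketch in plain C.  It is not part of the patch and does not use any GCC internals.  It models an incoming 32-bit argument register whose low 8 bits hold the uint8_t argument and whose upper bits hold unspecified values, as rule C.16 allows.  All names (make_reg, fd2_no_extend, fd_with_extend, fd_without_extend, or8_no_extend, check_model) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an argument register: the low 8 bits carry the
   uint8_t argument, the upper 24 bits carry whatever the caller left
   there (the ABI leaves them unspecified).  */
static uint32_t make_reg (uint8_t value, uint32_t garbage)
{
  return (garbage << 8) | value;
}

/* Models "uint8_t fd2 (uint8_t xr) { return xr + 1; }" with no extend:
   the add runs on the whole register and only the low 8 bits of the
   result are meaningful to the caller.  */
static uint8_t fd2_no_extend (uint32_t w0)
{
  return (uint8_t) (w0 + 1u);
}

/* Models "uint16_t fd (uint8_t xr) { return xr + 1; }" with the zero
   extend kept (and w8, w0, #0xff; add w0, w8, #1).  */
static uint16_t fd_with_extend (uint32_t w0)
{
  return (uint16_t) ((w0 & 0xffu) + 1u);
}

/* The same widening add with the extend dropped.  This is only here to
   show why the extend is still required when the result is wider.  */
static uint16_t fd_without_extend (uint32_t w0)
{
  return (uint16_t) (w0 + 1u);
}

/* Models OR-ing two 8-bit arguments with an 8-bit result and no extend.  */
static uint8_t or8_no_extend (uint32_t w0, uint32_t w1)
{
  return (uint8_t) (w0 | w1);
}

/* Checks the model against the language-level results.  */
static void check_model (void)
{
  const uint32_t garbage[] = { 0u, 0xABCDEFu };

  for (int i = 0; i < 2; i++)
    {
      uint32_t reg = make_reg (0xff, garbage[i]);
      /* Narrow result: upper bits never reach the low 8 bits.  */
      assert (fd2_no_extend (reg) == (uint8_t) (0xff + 1));
      /* Wide result with extend: always 0x100.  */
      assert (fd_with_extend (reg) == 0x100);
    }

  /* Wide result without extend is only right if the caller happened to
     clear the upper bits.  */
  assert (fd_without_extend (make_reg (0xff, 0u)) == 0x100);
  assert (fd_without_extend (make_reg (0xff, 0xABCDEFu)) != 0x100);

  assert (or8_no_extend (make_reg (0x0f, 0xAAAAAAu),
                         make_reg (0xf0, 0x555555u)) == 0xff);
}
```

Calling check_model () from a main function exercises all five helpers.  The point of the sketch is only that the narrow cases are insensitive to the upper register bits while the widening case is not, which matches the `and` instructions expected for f6, f7 and f9 in the new test and their absence in f0 to f5.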