From patchwork Thu Nov 16 18:09:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1864873
Date: Thu, 16 Nov 2023 18:09:52 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH 08/11] aarch64: Generalize writeback ldp/stp patterns
List-Id: Gcc-patches mailing list

Thus far the writeback forms of ldp/stp have been exclusively used in
prologue and epilogue code for saving/restoring of registers to/from the
stack.

As such, forms of ldp/stp that weren't needed for prologue/epilogue code
weren't supported by the aarch64 backend. This patch generalizes the
load/store pair writeback patterns to allow:

 - Base registers other than the stack pointer.
 - Modes that weren't previously supported.
 - Combinations of distinct modes provided they have the same size.
 - Pre/post variants that weren't previously needed in prologue/epilogue
   code.

We make quite some effort to avoid a combinatorial explosion in the
number of patterns generated (and those in the source) by making
extensive use of special predicates.

An updated version of the upcoming ldp/stp pass can generate the
writeback forms, so this patch is motivated by that.

This patch doesn't add zero-extending or sign-extending forms of the
writeback patterns; that is left for future work.
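
For readers less familiar with the addressing modes involved, the four
pre/post variants named above can be modelled roughly as follows. This is
only a toy sketch, not GCC code: Machine, Pair and the four helper names are
made up for illustration, and it assumes 8-byte registers and a small flat
memory.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

// Toy model, not GCC code: `mem` stands in for memory and `base` for the
// base register (hypothetical names, chosen for this sketch only).
struct Machine {
  unsigned char mem[256] = {};
  std::int64_t base = 0;
};

using Pair = std::pair<std::uint64_t, std::uint64_t>;

// Store two 8-byte values at consecutive addresses starting at `addr`.
inline void store_pair(Machine &m, std::int64_t addr, Pair v) {
  std::memcpy(m.mem + addr, &v.first, 8);
  std::memcpy(m.mem + addr + 8, &v.second, 8);
}

// Load two 8-byte values from consecutive addresses starting at `addr`.
inline Pair load_pair(const Machine &m, std::int64_t addr) {
  Pair v;
  std::memcpy(&v.first, m.mem + addr, 8);
  std::memcpy(&v.second, m.mem + addr + 8, 8);
  return v;
}

// stp x, y, [base, #imm]!  pre-index: update the base, then access [base].
inline void stp_pre(Machine &m, Pair v, std::int64_t imm) {
  m.base += imm;
  store_pair(m, m.base, v);
}

// stp x, y, [base], #imm   post-index: access [base], then update the base.
inline void stp_post(Machine &m, Pair v, std::int64_t imm) {
  store_pair(m, m.base, v);
  m.base += imm;
}

// ldp x, y, [base, #imm]!  pre-index load.
inline Pair ldp_pre(Machine &m, std::int64_t imm) {
  m.base += imm;
  return load_pair(m, m.base);
}

// ldp x, y, [base], #imm   post-index load.
inline Pair ldp_post(Machine &m, std::int64_t imm) {
  Pair v = load_pair(m, m.base);
  m.base += imm;
  return v;
}
```

In this model the prologue/epilogue pairing is stp_pre with a negative
immediate followed later by ldp_post with the matching positive one; the
other two variants are the ones that were not previously needed there.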
Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (aarch64_ldpstp_operand_mode_p): Declare.
	* config/aarch64/aarch64.cc (aarch64_gen_storewb_pair): Build RTL
	directly instead of invoking named pattern.
	(aarch64_gen_loadwb_pair): Likewise.
	(aarch64_ldpstp_operand_mode_p): New.
	* config/aarch64/aarch64.md (loadwb_pair_): Replace with ...
	(*loadwb_post_pair_): ... this. Generalize as described in
	cover letter.
	(loadwb_pair_): Delete (superseded by the above).
	(*loadwb_post_pair_16): New.
	(*loadwb_pre_pair_): New.
	(loadwb_pair_): Delete.
	(*loadwb_pre_pair_16): New.
	(storewb_pair_): Replace with ...
	(*storewb_pre_pair_): ... this. Generalize as described in cover
	letter.
	(*storewb_pre_pair_16): New.
	(storewb_pair_): Delete.
	(*storewb_post_pair_): New.
	(storewb_pair_): Delete.
	(*storewb_post_pair_16): New.
	* config/aarch64/predicates.md (aarch64_mem_pair_operator): New.
	(pmode_plus_operator): New.
	(aarch64_ldp_reg_operand): New.
	(aarch64_stp_reg_operand): New.
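
The arithmetic behind the insn conditions and the new special predicates in
the diff can be sketched in isolation. This is a simplified model under
stated assumptions, not the GCC implementation: sizes are plain byte counts
rather than machine modes, the register-file checks done by
aarch64_ldpstp_operand_mode_p and the VOIDmode case of the predicates are
left out, and all four function names below are hypothetical.

```cpp
#include <cstdint>

// Operand sizes accepted for one register of an ldp/stp, mirroring the
// known_eq (size, 4/8/16) tests in aarch64_ldpstp_operand_mode_p.
constexpr bool pair_operand_size_ok(int size) {
  return size == 4 || size == 8 || size == 16;
}

// "7-bit signed scaled" offset, as required by aarch64_mem_pair_offset:
// a multiple of the access size whose quotient lies in [-64, 63].
constexpr bool offset_7bit_signed_scaled(int size, std::int64_t offset) {
  return offset % size == 0 && offset / size >= -64 && offset / size <= 63;
}

// Special-predicate rule: the operand's own mode may differ from the
// pattern mode, provided both have the same (supported) size.
constexpr bool sizes_match(int pattern_size, int operand_size) {
  return pair_operand_size_ok(operand_size) && pattern_size == operand_size;
}

// Pre-index patterns: operands[4] is the writeback offset and operands[5]
// must address the second slot, i.e. operands[4] + size.
constexpr bool pre_index_offsets_ok(int size, std::int64_t off4,
                                    std::int64_t off5) {
  return offset_7bit_signed_scaled(size, off4) && off5 == off4 + size;
}
```

For 8-byte accesses this gives a writeback range of -512 to 504 in steps of
8, and a pre-index pair at offset -16 needs its second slot at -8.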
---
 gcc/config/aarch64/aarch64-protos.h |   1 +
 gcc/config/aarch64/aarch64.cc       |  60 +++---
 gcc/config/aarch64/aarch64.md       | 284 ++++++++++++++++++++--------
 gcc/config/aarch64/predicates.md    |  38 ++++
 4 files changed, 271 insertions(+), 112 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 36d6c688bc8..e463fd5c817 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1023,6 +1023,7 @@ bool aarch64_operands_ok_for_ldpstp (rtx *, bool, machine_mode);
 bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, machine_mode);
 bool aarch64_mem_ok_with_ldpstp_policy_model (rtx, bool, machine_mode);
 void aarch64_swap_ldrstr_operands (rtx *, bool);
+bool aarch64_ldpstp_operand_mode_p (machine_mode);
 extern void aarch64_asm_output_pool_epilogue (FILE *, const char *,
                                               tree, HOST_WIDE_INT);
 
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 4820fac67a1..ccf081d2a16 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -8977,23 +8977,15 @@ static rtx
 aarch64_gen_storewb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
                           HOST_WIDE_INT adjustment)
 {
-  switch (mode)
-    {
-    case E_DImode:
-      return gen_storewb_pairdi_di (base, base, reg, reg2,
-                                    GEN_INT (-adjustment),
-                                    GEN_INT (UNITS_PER_WORD - adjustment));
-    case E_DFmode:
-      return gen_storewb_pairdf_di (base, base, reg, reg2,
-                                    GEN_INT (-adjustment),
-                                    GEN_INT (UNITS_PER_WORD - adjustment));
-    case E_TFmode:
-      return gen_storewb_pairtf_di (base, base, reg, reg2,
-                                    GEN_INT (-adjustment),
-                                    GEN_INT (UNITS_PER_VREG - adjustment));
-    default:
-      gcc_unreachable ();
-    }
+  rtx new_base = plus_constant (Pmode, base, -adjustment);
+  rtx mem = gen_frame_mem (mode, new_base);
+  rtx mem2 = adjust_address_nv (mem, mode, GET_MODE_SIZE (mode));
+
+  return gen_rtx_PARALLEL (VOIDmode,
+                           gen_rtvec (3,
+                                      gen_rtx_SET (base, new_base),
+                                      gen_rtx_SET (mem, reg),
+                                      gen_rtx_SET (mem2, reg2)));
 }
 
 /* Push registers numbered REGNO1 and REGNO2 to the stack, adjusting the
@@ -9025,20 +9017,15 @@ static rtx
 aarch64_gen_loadwb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
                          HOST_WIDE_INT adjustment)
 {
-  switch (mode)
-    {
-    case E_DImode:
-      return gen_loadwb_pairdi_di (base, base, reg, reg2, GEN_INT (adjustment),
-                                   GEN_INT (UNITS_PER_WORD));
-    case E_DFmode:
-      return gen_loadwb_pairdf_di (base, base, reg, reg2, GEN_INT (adjustment),
-                                   GEN_INT (UNITS_PER_WORD));
-    case E_TFmode:
-      return gen_loadwb_pairtf_di (base, base, reg, reg2, GEN_INT (adjustment),
-                                   GEN_INT (UNITS_PER_VREG));
-    default:
-      gcc_unreachable ();
-    }
+  rtx mem = gen_frame_mem (mode, base);
+  rtx mem2 = adjust_address_nv (mem, mode, GET_MODE_SIZE (mode));
+  rtx new_base = plus_constant (Pmode, base, adjustment);
+
+  return gen_rtx_PARALLEL (VOIDmode,
+                           gen_rtvec (3,
+                                      gen_rtx_SET (base, new_base),
+                                      gen_rtx_SET (reg, mem),
+                                      gen_rtx_SET (reg2, mem2)));
 }
 
 /* Pop the two registers numbered REGNO1, REGNO2 from the stack, adjusting it
@@ -26688,6 +26675,17 @@ aarch64_check_consecutive_mems (rtx *mem1, rtx *mem2, bool *reversed)
   return false;
 }
 
+bool
+aarch64_ldpstp_operand_mode_p (machine_mode mode)
+{
+  if (!targetm.hard_regno_mode_ok (V0_REGNUM, mode)
+      || hard_regno_nregs (V0_REGNUM, mode) > 1)
+    return false;
+
+  const auto size = GET_MODE_SIZE (mode);
+  return known_eq (size, 4) || known_eq (size, 8) || known_eq (size, 16);
+}
+
 /* Return true if MEM1 and MEM2 can be combined into a single access
    of mode MODE, with the combined access having the same address as
    MEM1.  */
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 7be1de38b1c..c92a51690c5 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1831,102 +1831,224 @@ (define_insn "store_pair_dw_"
    (set_attr "fp" "yes")]
 )
 
+;; Writeback load/store pair patterns.
+;;
+;; Note that modes in the patterns [SI DI TI] are used only as a proxy for their
+;; size; aarch64_ldp_reg_operand and aarch64_mem_pair_operator are special
+;; predicates which accept a wide range of operand modes, with the requirement
+;; that the contextual (pattern) mode is of the same size as the operand mode.
+
 ;; Load pair with post-index writeback.  This is primarily used in function
 ;; epilogues.
-(define_insn "loadwb_pair_"
+(define_insn "*loadwb_post_pair_"
   [(parallel
-    [(set (match_operand:P 0 "register_operand" "=k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (match_operand:GPI 2 "register_operand" "=r")
-          (mem:GPI (match_dup 1)))
-     (set (match_operand:GPI 3 "register_operand" "=r")
-          (mem:GPI (plus:P (match_dup 1)
-                   (match_operand:P 5 "const_int_operand" "n"))))])]
-  "INTVAL (operands[5]) == GET_MODE_SIZE (mode)"
-  "ldp\\t%2, %3, [%1], %4"
-  [(set_attr "type" "load_")]
-)
-
-(define_insn "loadwb_pair_"
+    [(set (match_operand 0 "pmode_register_operand")
+          (match_operator 7 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand")
+            (match_operand 4 "const_int_operand")]))
+     (set (match_operand:GPI 2 "aarch64_ldp_reg_operand")
+          (match_operator 5 "memory_operand" [(match_dup 1)]))
+     (set (match_operand:GPI 3 "aarch64_ldp_reg_operand")
+          (match_operator 6 "memory_operand" [
+            (match_operator 8 "pmode_plus_operator" [
+              (match_dup 1)
+              (const_int )])]))])]
+  "aarch64_mem_pair_offset (operands[4], mode)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  {@ [cons: =0, 1, =2, =3; attrs: type]
+     [ rk, 0, r, r; load_] ldp\t%2, %3, [%1], %4
+     [ rk, 0, w, w; neon_load1_2reg ] ldp\t%2, %3, [%1], %4
+  }
+)
+
+;; q-register variant of the above
+(define_insn "*loadwb_post_pair_16"
+  [(parallel
+    [(set (match_operand 0 "pmode_register_operand" "=rk")
+          (match_operator 7 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand" "0")
+            (match_operand 4 "const_int_operand")]))
+     (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w")
+          (match_operator 5 "memory_operand" [(match_dup 1)]))
+     (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w")
+          (match_operator 6 "memory_operand"
+            [(match_operator 8 "pmode_plus_operator" [
+               (match_dup 1)
+               (const_int 16)])]))])]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  "ldp\t%q2, %q3, [%1], %4"
+  [(set_attr "type" "neon_ldp_q")]
+)
+
+;; Load pair with pre-index writeback.
+(define_insn "*loadwb_pre_pair_"
   [(parallel
-    [(set (match_operand:P 0 "register_operand" "=k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (match_operand:GPF 2 "register_operand" "=w")
-          (mem:GPF (match_dup 1)))
-     (set (match_operand:GPF 3 "register_operand" "=w")
-          (mem:GPF (plus:P (match_dup 1)
-                   (match_operand:P 5 "const_int_operand" "n"))))])]
-  "INTVAL (operands[5]) == GET_MODE_SIZE (mode)"
-  "ldp\\t%2, %3, [%1], %4"
-  [(set_attr "type" "neon_load1_2reg")]
-)
-
-(define_insn "loadwb_pair_"
+    [(set (match_operand 0 "pmode_register_operand")
+          (match_operator 8 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand")
+            (match_operand 4 "const_int_operand")]))
+     (set (match_operand:GPI 2 "aarch64_ldp_reg_operand")
+          (match_operator 6 "memory_operand" [
+            (match_operator 10 "pmode_plus_operator" [
+              (match_dup 1)
+              (match_dup 4)
+            ])]))
+     (set (match_operand:GPI 3 "aarch64_ldp_reg_operand")
+          (match_operator 7 "memory_operand" [
+            (match_operator 9 "pmode_plus_operator" [
+              (match_dup 1)
+              (match_operand 5 "const_int_operand")
+            ])]))])]
+  "aarch64_mem_pair_offset (operands[4], mode)
+   && known_eq (INTVAL (operands[5]),
+                INTVAL (operands[4]) + GET_MODE_SIZE (mode))
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  {@ [cons: =&0, 1, =2, =3; attrs: type ]
+     [ rk, 0, r, r; load_] ldp\t%2, %3, [%0, %4]!
+     [ rk, 0, w, w; neon_load1_2reg ] ldp\t%2, %3, [%0, %4]!
+  }
+)
+
+;; q-register variant of the above
+(define_insn "*loadwb_pre_pair_16"
   [(parallel
-    [(set (match_operand:P 0 "register_operand" "=k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (match_operand:TX 2 "register_operand" "=w")
-          (mem:TX (match_dup 1)))
-     (set (match_operand:TX 3 "register_operand" "=w")
-          (mem:TX (plus:P (match_dup 1)
-                  (match_operand:P 5 "const_int_operand" "n"))))])]
-  "TARGET_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (mode)"
-  "ldp\\t%q2, %q3, [%1], %4"
+    [(set (match_operand 0 "pmode_register_operand" "=&rk")
+          (match_operator 8 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand" "0")
+            (match_operand 4 "const_int_operand")]))
+     (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w")
+          (match_operator 6 "memory_operand" [
+            (match_operator 10 "pmode_plus_operator" [
+              (match_dup 1)
+              (match_dup 4)
+            ])]))
+     (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w")
+          (match_operator 7 "memory_operand" [
+            (match_operator 9 "pmode_plus_operator" [
+              (match_dup 1)
+              (match_operand 5 "const_int_operand")
+            ])]))])]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)
+   && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  "ldp\t%q2, %q3, [%0, %4]!"
   [(set_attr "type" "neon_ldp_q")]
 )
 
 ;; Store pair with pre-index writeback.  This is primarily used in function
 ;; prologues.
-(define_insn "storewb_pair_"
+(define_insn "*storewb_pre_pair_"
   [(parallel
-    [(set (match_operand:P 0 "register_operand" "=&k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (mem:GPI (plus:P (match_dup 0)
-                   (match_dup 4)))
-          (match_operand:GPI 2 "register_operand" "r"))
-     (set (mem:GPI (plus:P (match_dup 0)
-                   (match_operand:P 5 "const_int_operand" "n")))
-          (match_operand:GPI 3 "register_operand" "r"))])]
-  "INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)"
-  "stp\\t%2, %3, [%0, %4]!"
-  [(set_attr "type" "store_")]
+    [(set (match_operand 0 "pmode_register_operand")
+          (match_operator 6 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand")
+            (match_operand 4 "const_int_operand")
+          ]))
+     (set (match_operator:GPI 7 "aarch64_mem_pair_operator" [
+            (match_operator 8 "pmode_plus_operator" [
+              (match_dup 0)
+              (match_dup 4)
+            ])])
+          (match_operand:GPI 2 "aarch64_stp_reg_operand"))
+     (set (match_operator:GPI 9 "aarch64_mem_pair_operator" [
+            (match_operator 10 "pmode_plus_operator" [
+              (match_dup 0)
+              (match_operand 5 "const_int_operand")
+            ])])
+          (match_operand:GPI 3 "aarch64_stp_reg_operand"))])]
+  "aarch64_mem_pair_offset (operands[4], mode)
+   && known_eq (INTVAL (operands[5]),
+                INTVAL (operands[4]) + GET_MODE_SIZE (mode))
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  {@ [cons: =&0, 1, 2, 3; attrs: type ]
+     [ rk, 0, rYZ, rYZ; store_] stp\t%2, %3, [%0, %4]!
+     [ rk, 0, w, w; neon_store1_2reg ] stp\t%2, %3, [%0, %4]!
+  }
+)
+
+;; q-register variant of the above.
+(define_insn "*storewb_pre_pair_16"
+  [(parallel
+    [(set (match_operand 0 "pmode_register_operand" "=&rk")
+          (match_operator 6 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand" "0")
+            (match_operand 4 "const_int_operand")
+          ]))
+     (set (match_operator:TI 7 "aarch64_mem_pair_operator" [
+            (match_operator 8 "pmode_plus_operator" [
+              (match_dup 0)
+              (match_dup 4)
+            ])])
+          (match_operand:TI 2 "aarch64_ldp_reg_operand" "w"))
+     (set (match_operator:TI 9 "aarch64_mem_pair_operator" [
+            (match_operator 10 "pmode_plus_operator" [
+              (match_dup 0)
+              (match_operand 5 "const_int_operand")
+            ])])
+          (match_operand:TI 3 "aarch64_ldp_reg_operand" "w"))])]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)
+   && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  "stp\\t%q2, %q3, [%0, %4]!"
+  [(set_attr "type" "neon_stp_q")]
 )
 
-(define_insn "storewb_pair_"
+;; Store pair with post-index writeback.
+(define_insn "*storewb_post_pair_"
   [(parallel
-    [(set (match_operand:P 0 "register_operand" "=&k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (mem:GPF (plus:P (match_dup 0)
-                   (match_dup 4)))
-          (match_operand:GPF 2 "register_operand" "w"))
-     (set (mem:GPF (plus:P (match_dup 0)
-                   (match_operand:P 5 "const_int_operand" "n")))
-          (match_operand:GPF 3 "register_operand" "w"))])]
-  "INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)"
-  "stp\\t%2, %3, [%0, %4]!"
-  [(set_attr "type" "neon_store1_2reg")]
-)
-
-(define_insn "storewb_pair_"
+    [(set (match_operand 0 "pmode_register_operand")
+          (match_operator 5 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand")
+            (match_operand 4 "const_int_operand")
+          ]))
+     (set (match_operator:GPI 6 "aarch64_mem_pair_operator" [(match_dup 1)])
+          (match_operand 2 "aarch64_stp_reg_operand"))
+     (set (match_operator:GPI 7 "aarch64_mem_pair_operator" [
+            (match_operator 8 "pmode_plus_operator" [
+              (match_dup 0)
+              (const_int )
+            ])])
+          (match_operand 3 "aarch64_stp_reg_operand"))])]
+  "aarch64_mem_pair_offset (operands[4], mode)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  {@ [cons: =0, 1, 2, 3; attrs: type ]
+     [ rk, 0, rYZ, rYZ; store_] stp\t%2, %3, [%0], %4
+     [ rk, 0, w, w; neon_store1_2reg ] stp\t%2, %3, [%0], %4
+  }
+)
+
+;; Store pair with post-index writeback.
+(define_insn "*storewb_post_pair_16"
   [(parallel
-    [(set (match_operand:P 0 "register_operand" "=&k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (mem:TX (plus:P (match_dup 0)
-                  (match_dup 4)))
-          (match_operand:TX 2 "register_operand" "w"))
-     (set (mem:TX (plus:P (match_dup 0)
-                  (match_operand:P 5 "const_int_operand" "n")))
-          (match_operand:TX 3 "register_operand" "w"))])]
-  "TARGET_SIMD
-   && INTVAL (operands[5])
-      == INTVAL (operands[4]) + GET_MODE_SIZE (mode)"
-  "stp\\t%q2, %q3, [%0, %4]!"
+    [(set (match_operand 0 "pmode_register_operand" "=rk")
+          (match_operator 5 "pmode_plus_operator" [
+            (match_operand 1 "pmode_register_operand" "0")
+            (match_operand 4 "const_int_operand")
+          ]))
+     (set (match_operator:TI 6 "aarch64_mem_pair_operator" [(match_dup 1)])
+          (match_operand:TI 2 "aarch64_ldp_reg_operand" "w"))
+     (set (match_operator:TI 7 "aarch64_mem_pair_operator" [
+            (match_operator 8 "pmode_plus_operator" [
+              (match_dup 0)
+              (const_int 16)
+            ])])
+          (match_operand:TI 3 "aarch64_ldp_reg_operand" "w"))])]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  "stp\t%q2, %q3, [%0], %4"
   [(set_attr "type" "neon_stp_q")]
 )
 
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index a73724a7fc0..b647e5af7c6 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -257,11 +257,49 @@ (define_predicate "aarch64_mem_pair_offset"
   (and (match_code "const_int")
        (match_test "aarch64_offset_7bit_signed_scaled_p (mode, INTVAL (op))")))
 
+(define_special_predicate "aarch64_mem_pair_operator"
+  (and
+    (match_code "mem")
+    (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))")
+    (ior
+      (match_test "mode == VOIDmode")
+      (match_test "known_eq (GET_MODE_SIZE (mode),
+                             GET_MODE_SIZE (GET_MODE (op)))"))))
+
 (define_predicate "aarch64_mem_pair_operand"
   (and (match_code "mem")
        (match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), false,
                                                   ADDR_QUERY_LDP_STP)")))
 
+(define_predicate "pmode_plus_operator"
+  (and (match_code "plus")
+       (match_test "GET_MODE (op) == Pmode")))
+
+(define_special_predicate "aarch64_ldp_reg_operand"
+  (and
+    (match_code "reg,subreg")
+    (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))")
+    (ior
+      (match_test "mode == VOIDmode")
+      (match_test "known_eq (GET_MODE_SIZE (mode),
+                             GET_MODE_SIZE (GET_MODE (op)))"))))
+
+(define_special_predicate "aarch64_stp_reg_operand"
+  (ior (match_operand 0 "aarch64_ldp_reg_operand")
+       (and (ior
+              (and (match_code "const_int,const,const_vector")
+                   (match_test "op == CONST0_RTX (GET_MODE (op))"))
+              (and (match_code "const_double")
+                   (match_test "aarch64_float_const_zero_rtx_p (op)")))
+            (ior
+              (match_test "GET_MODE (op) == VOIDmode")
+              (and
+                (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))")
+                (ior
+                  (match_test "mode == VOIDmode")
+                  (match_test "known_eq (GET_MODE_SIZE (mode),
+                                         GET_MODE_SIZE (GET_MODE (op)))")))))))
+
 ;; Used for storing two 64-bit values in an AdvSIMD register using an STP
 ;; as a 128-bit vec_concat.
 (define_predicate "aarch64_mem_pair_lanes_operand"