From patchwork Sat Jan 13 15:46:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1886361
Date: Sat, 13 Jan 2024 15:46:15 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov, Richard Earnshaw
Subject: [PATCH 4/4] aarch64: Fix up uses of mem following stp insert [PR113070]
List-Id: Gcc-patches mailing list

As the PR shows (specifically #c7) we are missing updating uses of mem
when inserting an stp in the aarch64 load/store pair fusion pass. This
patch fixes that.

RTL-SSA has a simple view of memory and by default doesn't allow stores
to be re-ordered w.r.t. other stores. In the ldp fusion pass, we do our
own alias analysis and so can re-order stores over other accesses when
we deem this is safe. If neither store can be re-purposed (moved into
the required position to form the stp while respecting the RTL-SSA
constraints), then we turn both the candidate stores into "tombstone"
insns (logically delete them) and insert a new stp insn.

As it stands, we implement the insert case separately (after dealing
with the candidate stores) in fuse_pair by inserting into the middle of
the vector of changes.
This is OK when we only have to insert one change, but with this fix we
would need to insert the change for the new stp plus multiple changes to
fix up uses of mem (note the number of fix-ups is naturally bounded by
the alias limit param to prevent quadratic behaviour). If we kept the
code structured as is and inserted into the middle of the vector, that
would lead to repeated moving of elements in the vector, which seems
inefficient. The structure of the code would also be a little unwieldy.

To improve on that situation, this patch introduces a helper class,
stp_change_builder, which implements a state machine that helps to build
the required changes directly in program order. That state machine is
responsible for deciding what changes need to be made in what order, and
the code in fuse_pair then simply follows those steps.

Together with the fix in the previous patch for installing new defs
correctly in RTL-SSA, this fixes PR113070.

We take the opportunity to rename the function decide_stp_strategy to
try_repurpose_store, as that seems more descriptive of what it actually
does, since stp_change_builder is now responsible for the overall change
strategy.

Bootstrapped/regtested as a series with/without the passes enabled on
aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

	PR target/113070
	* config/aarch64/aarch64-ldp-fusion.cc (struct stp_change_builder):
	New.
	(decide_stp_strategy): Rename to ...
	(try_repurpose_store): ... this.
	(ldp_bb_info::fuse_pair): Refactor to use stp_change_builder to
	construct stp changes. Fix up uses when inserting new stp insns.
---
 gcc/config/aarch64/aarch64-ldp-fusion.cc | 248 ++++++++++++++++++-----
 1 file changed, 194 insertions(+), 54 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 689a8c884bd..703cfb1228c 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -844,11 +844,138 @@ def_upwards_move_range (def_info *def)
   return range;
 }
 
+// Class that implements a state machine for building the changes needed to form
+// a store pair instruction. This allows us to easily build the changes in
+// program order, as required by rtl-ssa.
+struct stp_change_builder
+{
+  enum class state
+  {
+    FIRST,
+    INSERT,
+    FIXUP_USE,
+    LAST,
+    DONE
+  };
+
+  enum class action
+  {
+    TOMBSTONE,
+    CHANGE,
+    INSERT,
+    FIXUP_USE
+  };
+
+  struct change
+  {
+    action type;
+    insn_info *insn;
+  };
+
+  bool done () const { return m_state == state::DONE; }
+
+  stp_change_builder (insn_info *insns[2],
+		      insn_info *repurpose,
+		      insn_info *dest)
+    : m_state (state::FIRST), m_insns { insns[0], insns[1] },
+      m_repurpose (repurpose), m_dest (dest), m_use (nullptr) {}
+
+  change get_change () const
+  {
+    switch (m_state)
+      {
+      case state::FIRST:
+	return {
+	  m_insns[0] == m_repurpose ? action::CHANGE : action::TOMBSTONE,
+	  m_insns[0]
+	};
+      case state::LAST:
+	return {
+	  m_insns[1] == m_repurpose ? action::CHANGE : action::TOMBSTONE,
+	  m_insns[1]
+	};
+      case state::INSERT:
+	return { action::INSERT, m_dest };
+      case state::FIXUP_USE:
+	return { action::FIXUP_USE, m_use->insn () };
+      case state::DONE:
+	break;
+      }
+
+    gcc_unreachable ();
+  }
+
+  // Transition to the next state.
+  void advance ()
+  {
+    switch (m_state)
+      {
+      case state::FIRST:
+	if (m_repurpose)
+	  m_state = state::LAST;
+	else
+	  m_state = state::INSERT;
+	break;
+      case state::INSERT:
+      {
+	def_info *def = memory_access (m_insns[0]->defs ());
+	while (*def->next_def ()->insn () <= *m_dest)
+	  def = def->next_def ();
+
+	// Now we know DEF feeds the insertion point for the new stp.
+	// Look for any uses of DEF that will consume the new stp.
+	gcc_assert (*def->insn () <= *m_dest
+		    && *def->next_def ()->insn () > *m_dest);
+
+	if (auto set = dyn_cast<set_info *> (def))
+	  for (auto use : set->nondebug_insn_uses ())
+	    if (*use->insn () > *m_dest)
+	      {
+		m_use = use;
+		break;
+	      }
+
+	if (m_use)
+	  m_state = state::FIXUP_USE;
+	else
+	  m_state = state::LAST;
+	break;
+      }
+      case state::FIXUP_USE:
+	m_use = m_use->next_nondebug_insn_use ();
+	if (!m_use)
+	  m_state = state::LAST;
+	break;
+      case state::LAST:
+	m_state = state::DONE;
+	break;
+      case state::DONE:
+	gcc_unreachable ();
+      }
+  }
+
+private:
+  state m_state;
+
+  // Original candidate stores.
+  insn_info *m_insns[2];
+
+  // If non-null, this is a candidate insn to change into an stp. Otherwise we
+  // are deleting both original insns and inserting a new insn for the stp.
+  insn_info *m_repurpose;
+
+  // Destination of the stp, it will be placed immediately after m_dest.
+  insn_info *m_dest;
+
+  // Current nondebug use that needs updating due to stp insertion.
+  use_info *m_use;
+};
+
 // Given candidate store insns FIRST and SECOND, see if we can re-purpose one
 // of them (together with its def of memory) for the stp insn. If so, return
 // that insn. Otherwise, return null.
 static insn_info *
-decide_stp_strategy (insn_info *first,
+try_repurpose_store (insn_info *first,
 		     insn_info *second,
 		     const insn_range_info &move_range)
 {
@@ -1253,7 +1380,7 @@ ldp_bb_info::fuse_pair (bool load_p,
 
   insn_info *insns[2] = { first, second };
 
-  auto_vec<insn_change *> changes (4);
+  auto_vec<insn_change *> changes;
   auto_vec<int> tombstone_uids (2);
 
   rtx pats[2] = {
@@ -1455,9 +1582,9 @@ ldp_bb_info::fuse_pair (bool load_p,
 
   if (load_p)
     {
-      changes.quick_push (make_delete (first));
+      changes.safe_push (make_delete (first));
       pair_change = make_change (second);
-      changes.quick_push (pair_change);
+      changes.safe_push (pair_change);
 
       pair_change->move_range = move_range;
       pair_change->new_defs = merge_access_arrays (attempt,
@@ -1474,18 +1601,22 @@ ldp_bb_info::fuse_pair (bool load_p,
     }
   else
     {
-      insn_info *store_to_change = decide_stp_strategy (first, second,
+      using Action = stp_change_builder::action;
+      insn_info *store_to_change = try_repurpose_store (first, second,
 							move_range);
-
-      if (store_to_change && dump_file)
-	fprintf (dump_file, " stp: re-purposing store %d\n",
-		 store_to_change->uid ());
-
+      insn_info *stp_dest = move_range.singleton ();
+      gcc_assert (stp_dest);
+      stp_change_builder builder (insns, store_to_change, stp_dest);
       insn_change *change;
-      for (int i = 0; i < 2; i++)
+      set_info *new_set = nullptr;
+      for (; !builder.done (); builder.advance ())
 	{
-	  change = make_change (insns[i]);
-	  if (insns[i] == store_to_change)
+	  auto action = builder.get_change ();
+	  change = (action.type == Action::INSERT)
+	    ? nullptr : make_change (action.insn);
+	  switch (action.type)
+	    {
+	    case Action::CHANGE:
 	    {
 	      set_pair_pat (change);
 	      change->new_uses = merge_access_arrays (attempt,
@@ -1495,67 +1626,76 @@ ldp_bb_info::fuse_pair (bool load_p,
 	      auto d2 = drop_memory_access (input_defs[1]);
 	      change->new_defs = merge_access_arrays (attempt, d1, d2);
 	      gcc_assert (change->new_defs.is_valid ());
-	      def_info *stp_def = memory_access (store_to_change->defs ());
+	      def_info *stp_def = memory_access (change->insn ()->defs ());
 	      change->new_defs = insert_access (attempt, stp_def,
 						change->new_defs);
 	      gcc_assert (change->new_defs.is_valid ());
 	      change->move_range = move_range;
 	      pair_change = change;
+	      break;
 	    }
-	  else
+	  case Action::TOMBSTONE:
 	    {
-	      // Note that we are turning this insn into a tombstone,
-	      // we need to keep track of these if we go ahead with the
-	      // change.
-	      tombstone_uids.quick_push (insns[i]->uid ());
-	      rtx_insn *rti = insns[i]->rtl ();
+	      tombstone_uids.quick_push (change->insn ()->uid ());
+	      rtx_insn *rti = change->insn ()->rtl ();
 	      validate_change (rti, &PATTERN (rti), gen_tombstone (), true);
 	      validate_change (rti, &REG_NOTES (rti), NULL_RTX, true);
 	      change->new_uses = use_array (nullptr, 0);
+	      break;
 	    }
-	  gcc_assert (change->new_uses.is_valid ());
-	  changes.quick_push (change);
-	}
+	  case Action::INSERT:
+	    {
+	      if (dump_file)
+		fprintf (dump_file,
+			 " stp: cannot re-purpose candidate stores\n");
 
-      if (!store_to_change)
-	{
-	  // Tricky case. Cannot re-purpose existing insns for stp.
-	  // Need to insert new insn.
-	  if (dump_file)
-	    fprintf (dump_file,
-		     " stp fusion: cannot re-purpose candidate stores\n");
-
-	  auto new_insn = crtl->ssa->create_insn (attempt, INSN, pair_pat);
-	  change = make_change (new_insn);
-	  change->move_range = move_range;
-	  change->new_uses = merge_access_arrays (attempt,
-						  input_uses[0],
-						  input_uses[1]);
-	  gcc_assert (change->new_uses.is_valid ());
-
-	  auto d1 = drop_memory_access (input_defs[0]);
-	  auto d2 = drop_memory_access (input_defs[1]);
-	  change->new_defs = merge_access_arrays (attempt, d1, d2);
-	  gcc_assert (change->new_defs.is_valid ());
-
-	  auto new_set = crtl->ssa->create_set (attempt, new_insn, memory);
-	  change->new_defs = insert_access (attempt, new_set,
-					    change->new_defs);
-	  gcc_assert (change->new_defs.is_valid ());
-	  changes.safe_insert (1, change);
-	  pair_change = change;
+	      auto new_insn = crtl->ssa->create_insn (attempt, INSN, pair_pat);
+	      change = make_change (new_insn);
+	      change->move_range = move_range;
+	      change->new_uses = merge_access_arrays (attempt,
+						      input_uses[0],
+						      input_uses[1]);
+	      gcc_assert (change->new_uses.is_valid ());
+
+	      auto d1 = drop_memory_access (input_defs[0]);
+	      auto d2 = drop_memory_access (input_defs[1]);
+	      change->new_defs = merge_access_arrays (attempt, d1, d2);
+	      gcc_assert (change->new_defs.is_valid ());
+
+	      new_set = crtl->ssa->create_set (attempt, new_insn, memory);
+	      change->new_defs = insert_access (attempt, new_set,
+						change->new_defs);
+	      gcc_assert (change->new_defs.is_valid ());
+	      pair_change = change;
+	      break;
+	    }
+	  case Action::FIXUP_USE:
+	    {
+	      // This use now needs to consume memory from our stp.
+	      if (dump_file)
+		fprintf (dump_file,
+			 " stp: changing i%d to use mem from new stp "
+			 "(after i%d)\n",
+			 action.insn->uid (), stp_dest->uid ());
+	      change->new_uses = drop_memory_access (change->new_uses);
+	      gcc_assert (new_set);
+	      auto new_use = crtl->ssa->create_use (attempt, action.insn,
+						    new_set);
+	      change->new_uses = insert_access (attempt, new_use,
+						change->new_uses);
+	      break;
+	    }
+	    }
+	  changes.safe_push (change);
 	}
     }
 
   if (trailing_add)
     changes.quick_push (make_delete (trailing_add));
 
-  auto n_changes = changes.length ();
-  gcc_checking_assert (n_changes >= 2 && n_changes <= 4);
-
   auto is_changing = insn_is_changing (changes);
-  for (unsigned i = 0; i < n_changes; i++)
+  for (unsigned i = 0; i < changes.length (); i++)
     gcc_assert (rtl_ssa::restrict_movement_ignoring (*changes[i],
 						     is_changing));
 
   // Check the pair pattern is recog'd.
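
For readers who want to see the ordering idea in isolation, here is a
minimal standalone sketch of the state machine described in the cover
text. It is not the code from the patch and uses no GCC or RTL-SSA
types. Insns are modelled as plain integers giving their position in
program order, and every name (StpChangeBuilder, build_changes, the
uses_after_dest vector) is hypothetical. It also assumes the caller has
already worked out which uses follow the insertion point, whereas the
real builder finds them by walking the memory def chain.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical simplified model: an insn is just its program-order position.
enum class Action { Tombstone, Change, Insert, FixupUse };

struct Change
{
  Action type;
  int insn;
};

class StpChangeBuilder
{
public:
  // FIRST and SECOND are the candidate stores.  REPURPOSE names the store to
  // turn into the pair, or is empty if a new insn goes after DEST.
  // USES_AFTER_DEST lists memory uses after DEST, in program order
  // (assumed to be precomputed and to lie before SECOND).
  StpChangeBuilder (int first, int second, std::optional<int> repurpose,
                    int dest, std::vector<int> uses_after_dest)
    : m_state (State::First), m_insns { first, second },
      m_repurpose (repurpose), m_dest (dest),
      m_uses (std::move (uses_after_dest)), m_next_use (0) {}

  bool done () const { return m_state == State::Done; }

  // Describe the change for the current state.
  Change get_change () const
  {
    switch (m_state)
      {
      case State::First:    return { kind_for (m_insns[0]), m_insns[0] };
      case State::Insert:   return { Action::Insert, m_dest };
      case State::FixupUse: return { Action::FixupUse, m_uses[m_next_use] };
      case State::Last:     return { kind_for (m_insns[1]), m_insns[1] };
      case State::Done:     break;
      }
    assert (false && "get_change called when done");
    return { Action::Tombstone, -1 };
  }

  // Transition to the next state, always moving forward in program order.
  void advance ()
  {
    switch (m_state)
      {
      case State::First:
        m_state = m_repurpose ? State::Last : State::Insert;
        break;
      case State::Insert:
        m_state = m_uses.empty () ? State::Last : State::FixupUse;
        break;
      case State::FixupUse:
        if (++m_next_use == m_uses.size ())
          m_state = State::Last;
        break;
      case State::Last:
        m_state = State::Done;
        break;
      case State::Done:
        assert (false && "advance called when done");
      }
  }

private:
  enum class State { First, Insert, FixupUse, Last, Done };

  Action kind_for (int insn) const
  {
    return (m_repurpose && *m_repurpose == insn)
      ? Action::Change : Action::Tombstone;
  }

  State m_state;
  int m_insns[2];
  std::optional<int> m_repurpose;
  int m_dest;
  std::vector<int> m_uses;
  std::size_t m_next_use;
};

// Drive the builder the way the loop in fuse_pair does: append each change
// as it is produced, so nothing is inserted into the middle of the vector.
std::vector<Change> build_changes (StpChangeBuilder builder)
{
  std::vector<Change> changes;
  for (; !builder.done (); builder.advance ())
    changes.push_back (builder.get_change ());
  return changes;
}
```

With stores at positions 10 and 30, an insertion point at 15 and uses at
20 and 25, this yields tombstone(10), insert(15), fixup(20), fixup(25),
tombstone(30), i.e. the changes already in program order. When one store
can be re-purposed, only the two store changes are produced.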