From patchwork Tue Oct 1 06:48:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1991339
Date: Tue, 1 Oct 2024 07:48:33 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, jlaw@ventanamicro.com
Subject: [PATCH]middle-end: support SLP early break

Hi all,

This patch introduces feature parity for early break in the SLP-only
vectorizer.

The approach taken here is to treat the early exits as root statements for an
SLP tree.  This means that we don't need any changes to build_slp to support
gconds.

Codegen for the gcond itself now has to be done out of line, but the body of
the SLP blocks itself is simply driven by SLP scheduling.

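The ordering described above (schedule the SLP tree first, then emit the root
gcond out of line) can be illustrated with a small toy model.  This is only a
sketch and not GCC code: Node, Instance, leaf, op, schedule, schedule_instance
and make_example are hypothetical names invented for the illustration, and the
statement strings are simply borrowed from the dump of the example in this
mail.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for an SLP node and an SLP instance (hypothetical types,
// not the vectorizer's own data structures).
struct Node {
  std::string stmt;            // scalar statement the node represents
  bool invariant;              // external/constant operand: nothing to emit
  std::vector<Node> children;  // operand nodes
};

struct Instance {
  Node tree;         // SLP tree built from the operands of the root
  std::string root;  // root statement, e.g. the gcond; empty if none
};

Node leaf(std::string s) { return Node{std::move(s), true, {}}; }

Node op(std::string s, std::vector<Node> kids) {
  return Node{std::move(s), false, std::move(kids)};
}

// Post-order walk: operands are emitted before the statements using them.
void schedule(const Node &n, std::vector<std::string> &out) {
  for (const Node &child : n.children)
    schedule(child, out);
  if (!n.invariant)
    out.push_back("vect: " + n.stmt);
}

// Schedule the whole tree first; only then handle the root statement, which
// is not part of the tree and so is emitted separately ("out of line").
std::vector<std::string> schedule_instance(const Instance &inst) {
  std::vector<std::string> out;
  schedule(inst.tree, out);
  if (!inst.root.empty())
    out.push_back("root: " + inst.root);
  return out;
}

// The tree from the dump: a compare of (load * 2) against an external.
Instance make_example() {
  Node load = op("_3 = vect_a[i_15]", {});
  Node mul = op("_4 = _3 * 2", {load, leaf("2")});
  Node cmp = op("patt_6 = _4 != x_9(D)", {leaf("x_9(D)"), mul});
  return Instance{cmp, "if (patt_6 != 0)"};
}
```

Calling schedule_instance (make_example ()) in this model yields the load, the
multiply and the compare in that order, followed by the gcond root, which
mirrors vect_schedule_slp running the scheduler before it calls
vectorize_slp_instance_root_stmt.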
There is a slight awkwardness in having re-used vectorizable_early_exit for
both SLP and non-SLP, but I've documented the differences, and when I did try
to refactor it, it wasn't really worth it given that this is a temporary state
anyway.

This version is restricted to lane = 1, so we can re-use the existing
move_early_break function instead of having to do the safety update through
scheduling.  I have a branch where I'm working on that, but lane > 1 is out of
scope for GCC 15 anyway.

The only reason I will try to get moving through scheduling done as a stretch
goal is so that we get epilogue vectorization back for early break.

The example:

unsigned test4(unsigned x)
{
 unsigned ret = 0;
 for (int i = 0; i < N; i++)
 {
   vect_b[i] = x + i;
   if (vect_a[i]*2 != x)
     break;
   vect_a[i] = x;
 }
 return ret;
}

builds the following SLP instance for early break:

note: Analyzing vectorizable control flow: if (patt_6 != 0)
note: Starting SLP discovery for
note: patt_6 = _4 != x_9(D);
note: starting SLP discovery for node 0x63abc80
note: Build SLP for patt_6 = _4 != x_9(D);
note: precomputed vectype: vector(4)
note: nunits = 4
note: vect_is_simple_use: operand x_9(D), type of def: external
note: vect_is_simple_use: operand # RANGE [irange] unsigned int [0, 0][2, +INF] MASK 0xfffffffe VALUE 0x0 _3 * 2, type of def: internal
note: starting SLP discovery for node 0x63abdc0
note: Build SLP for _4 = _3 * 2;
note: precomputed vectype: vector(4) unsigned int
note: nunits = 4
note: vect_is_simple_use: operand # vect_aD.4416[i_15], type of def: internal
note: vect_is_simple_use: operand 2, type of def: constant
note: starting SLP discovery for node 0x63abe60
note: Build SLP for _3 = vect_a[i_15];
note: precomputed vectype: vector(4) unsigned int
note: nunits = 4
note: SLP discovery for node 0x63abe60 succeeded
note: SLP discovery for node 0x63abdc0 succeeded
note: SLP discovery for node 0x63abc80 succeeded
note: SLP size 3 vs. limit 10.
note: Final SLP tree for instance 0x6474190:
note: node 0x63abc80 (max_nunits=4, refcnt=2) vector(4)
note: op template: patt_6 = _4 != x_9(D);
note: stmt 0 patt_6 = _4 != x_9(D);
note: children 0x63abd20 0x63abdc0
note: node (external) 0x63abd20 (max_nunits=1, refcnt=1)
note: { x_9(D) }
note: node 0x63abdc0 (max_nunits=4, refcnt=2) vector(4) unsigned int
note: op template: _4 = _3 * 2;
note: stmt 0 _4 = _3 * 2;
note: children 0x63abe60 0x63abf00
note: node 0x63abe60 (max_nunits=4, refcnt=2) vector(4) unsigned int
note: op template: _3 = vect_a[i_15];
note: stmt 0 _3 = vect_a[i_15];
note: load permutation { 0 }
note: node (constant) 0x63abf00 (max_nunits=1, refcnt=1)
note: { 2 }

and during codegen:

note: ------>vectorizing SLP node starting from: patt_6 = _4 != x_9(D);
note: vect_is_simple_use: operand # RANGE [irange] unsigned int [0, 0][2, +INF] MASK 0xfffffffe VALUE 0x0 _3 * 2, type of def: internal
note: add new stmt: mask_patt_6.18_58 = _53 != vect__4.17_57;
note: === vectorizable_early_exit ===
note: transform early-exit.
note: vectorizing stmts using SLP.
note: Vectorizing SLP tree:
note: node 0x63abfa0 (max_nunits=4, refcnt=1) vector(4) int
note: op template: i_12 = i_15 + 1;
note: stmt 0 i_12 = i_15 + 1;
note: children 0x63aba00 0x63ac040
note: node 0x63aba00 (max_nunits=4, refcnt=2) vector(4) int
note: op template: i_15 = PHI
note: [l] stmt 0 i_15 = PHI
note: children (nil) (nil)
note: node (constant) 0x63ac040 (max_nunits=1, refcnt=1) vector(4) int
note: { 1 }

Bootstrapped and regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf
and x86_64-pc-linux-gnu (-m32 and -m64) with no issues.

Also bootstrapped with --with-build-config='bootstrap-O3 bootstrap-lto'
--enable-checking=release,yes,rtl,extra on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu (-m32 and -m64) with no issues.

Ok for master?

gcc/ChangeLog:

	* tree-vectorizer.h (enum slp_instance_kind): Add slp_inst_kind_gcond.
	(LOOP_VINFO_EARLY_BREAKS_LIVE_STMTS): New.
	(vectorizable_early_exit): Expose.
	(class _loop_vec_info): Add early_break_live_stmts.
	* tree-vect-slp.cc (vect_build_slp_instance,
	vect_analyze_slp_instance): Support gcond instances.
	(vect_analyze_slp): Analyze gcond roots and early break live statements.
	(maybe_push_to_hybrid_worklist): Don't sink gconds.
	(vect_slp_analyze_node_operations): Support gconds.
	(vect_slp_check_for_roots): Update comments.
	(vectorize_slp_instance_root_stmt): Support gconds.
	(vect_schedule_slp): Pass vinfo to vectorize_slp_instance_root_stmt.
	* tree-vect-stmts.cc (vect_stmt_relevant_p): Record early break live
	statements.
	(vectorizable_early_exit): Support SLP.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-early-break_126.c: New test.
	* gcc.dg/vect/vect-early-break_127.c: New test.
	* gcc.dg/vect/vect-early-break_128.c: New test.

---

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_126.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_126.c
new file mode 100644
index 0000000000000000000000000000000000000000..4bfc9880f9fc869bf616123ff509d13be17ffacf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_126.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" } } */
+
+#define N 1024
+unsigned vect_a[N];
+unsigned vect_b[N];
+
+unsigned test4(unsigned x)
+{
+ unsigned ret = 0;
+ for (int i = 0; i < N; i++)
+ {
+   vect_b[i] = x + i;
+   if (vect_a[i] > x)
+     {
+       ret *= vect_a[i];
+       return vect_a[i];
+     }
+   vect_a[i] = x;
+   ret += vect_a[i] + vect_b[i];
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_127.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_127.c
new file mode 100644
index 0000000000000000000000000000000000000000..67cb5d34a77192e5d7d72c35df8e83535ef184ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_127.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" } } */
+
+#ifndef N
+#define N 800
+#endif
+unsigned vect_a[N];
+unsigned vect_b[N];
+
+unsigned test4(unsigned x)
+{
+ unsigned ret = 0;
+ for (int i = 0; i < N; i++)
+ {
+   vect_b[i] = x + i;
+   if (vect_a[i]*2 != x)
+     break;
+   vect_a[i] = x;
+
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_128.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_128.c
new file mode 100644
index 0000000000000000000000000000000000000000..6d7fb920ec2de529a4aa1de2c4a04286989204fd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_128.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" } } */
+
+#ifndef N
+#define N 800
+#endif
+unsigned vect_a[N];
+unsigned vect_b[N];
+
+unsigned test4(unsigned x)
+{
+ unsigned ret = 0;
+ for (int i = 0; i < N; i+=2)
+ {
+   vect_b[i] = x + i;
+   vect_b[i+1] = x + i+1;
+   if (vect_a[i]*2 != x)
+     break;
+   if (vect_a[i+1]*2 != x)
+     break;
+   vect_a[i] = x;
+   vect_a[i+1] = x;
+
+ }
+ return ret;
+}
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 600987dd6e5d506aa5fbb02350f9dab77793d382..7e765df466a59249feb999c24d8f2dad232948ae 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -3697,6 +3697,13 @@ vect_build_slp_instance (vec_info *vinfo,
                      "Analyzing vectorizable constructor: %G\n",
                      root_stmt_infos[0]->stmt);
         }
+      else if (kind == slp_inst_kind_gcond)
+        {
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "Analyzing vectorizable control flow: %G",
+                             root_stmt_infos[0]->stmt);
+        }
 
       if (dump_enabled_p ())
         {
@@ -4143,6 +4150,12 @@ vect_analyze_slp_instance (vec_info *vinfo,
           STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info))
             = STMT_VINFO_REDUC_DEF (vect_orig_stmt (scalar_stmts.last ()));
         }
+  else if (kind == slp_inst_kind_gcond)
+    {
+      /* Collect the stores and store them in scalar_stmts.  */
+      scalar_stmts.create (1);
+      scalar_stmts.quick_push (vect_stmt_to_vectorize (next_info));
+    }
   else
     gcc_unreachable ();
 
@@ -4742,6 +4755,56 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size,
                                      bst_map, NULL, force_single_lane);
             }
         }
+
+      /* Find SLP sequences starting from gconds.  */
+      for (auto cond : LOOP_VINFO_LOOP_CONDS (loop_vinfo))
+        {
+          auto cond_info = loop_vinfo->lookup_stmt (cond);
+          vec<stmt_vec_info> stmts;
+          vec<stmt_vec_info> roots = vNULL;
+          vec<tree> remain = vNULL;
+
+          cond_info = vect_stmt_to_vectorize (cond_info);
+          roots.safe_push (cond_info);
+          stmts.create (2);
+          tree args0 = gimple_cond_lhs (STMT_VINFO_STMT (cond_info));
+          tree args1 = gimple_cond_rhs (STMT_VINFO_STMT (cond_info));
+          /* An argument without a loop def will be codegened from vectorizing the
+             root gcond itself.  As such we don't need to try to build an SLP tree
+             from them.  It's highly likely that the resulting SLP tree here if both
+             arguments have a def will be incompatible, but we rely on it being split
+             later on.  */
+          if (auto varg = loop_vinfo->lookup_def (args0))
+            stmts.quick_push (vect_stmt_to_vectorize (varg));
+
+          if (auto varg = loop_vinfo->lookup_def (args1))
+            stmts.quick_push (vect_stmt_to_vectorize (varg));
+
+          if (!stmts.is_empty ())
+            vect_build_slp_instance (vinfo, slp_inst_kind_gcond,
+                                     stmts, roots, remain,
+                                     max_tree_size, &limit,
+                                     bst_map, NULL, force_single_lane);
+        }
+
+        /* Find and create slp instances for inductions that have been forced
+           live due to early break.  */
+        edge latch_e = loop_latch_edge (LOOP_VINFO_LOOP (loop_vinfo));
+        for (auto stmt_info : LOOP_VINFO_EARLY_BREAKS_LIVE_STMTS (loop_vinfo))
+          {
+            vec<stmt_vec_info> stmts;
+            vec<stmt_vec_info> roots = vNULL;
+            vec<tree> remain = vNULL;
+            gphi *lc_phi = as_a <gphi *> (STMT_VINFO_STMT (stmt_info));
+            tree def = gimple_phi_arg_def_from_edge (lc_phi, latch_e);
+            stmt_vec_info lc_info = loop_vinfo->lookup_def (def);
+            stmts.create (1);
+            stmts.quick_push (vect_stmt_to_vectorize (lc_info));
+            vect_build_slp_instance (vinfo, slp_inst_kind_reduc_group,
+                                     stmts, roots, remain,
+                                     max_tree_size, &limit,
+                                     bst_map, NULL, force_single_lane);
+          }
     }
 
   hash_set<slp_tree> visited_patterns;
@@ -7157,8 +7220,9 @@ maybe_push_to_hybrid_worklist (vec_info *vinfo,
             }
         }
     }
-  /* No def means this is a loo_vect sink.  */
-  if (!any_def)
+  /* No def means this is a loop_vect sink.  Gimple conditionals also don't have a
+     def but shouldn't be considered sinks.  */
+  if (!any_def && STMT_VINFO_DEF_TYPE (stmt_info) != vect_condition_def)
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,
@@ -7542,9 +7606,27 @@ vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
     return true;
   visited_vec.safe_push (node);
 
+  /* If early break also check the root statement as we need to both analyze
+     and trigger codegen for it.  The analysis will check whether we can actually
+     vectorize it.  At the moment splitting off the analysis bit from inside
+     it duplicates a lot of the setup code so it's not worth while to do so.
+     However when either the non-SLP loop vect goes away or we split vectorizable_*
+     functions then we can call the analysis only part from here instead.  */
   bool res = true;
-  unsigned visited_rec_start = visited_vec.length ();
   unsigned cost_vec_rec_start = cost_vec->length ();
+  if (SLP_INSTANCE_KIND (node_instance) == slp_inst_kind_gcond)
+    {
+      auto root_stmt_info = SLP_INSTANCE_ROOT_STMTS (node_instance)[0];
+      res = vectorizable_early_exit (vinfo, root_stmt_info, NULL, NULL, NULL,
+                                     cost_vec);
+      if (!res)
+        {
+          cost_vec->truncate (cost_vec_rec_start);
+          return res;
+        }
+    }
+
+  unsigned visited_rec_start = visited_vec.length ();
   bool seen_non_constant_child = false;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
     {
@@ -8612,6 +8694,8 @@ vect_slp_check_for_roots (bb_vec_info bb_vinfo)
            !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gassign *assign = dyn_cast <gassign *> (gsi_stmt (gsi));
+      /* This can be used to start SLP discovery for early breaks for BB early breaks
+         when we get that far.  */
       if (!assign)
         continue;
 
@@ -10758,7 +10842,7 @@ vect_remove_slp_scalar_calls (vec_info *vinfo, slp_tree node)
 /* Vectorize the instance root.  */
 
 void
-vectorize_slp_instance_root_stmt (slp_tree node, slp_instance instance)
+vectorize_slp_instance_root_stmt (vec_info *vinfo, slp_tree node, slp_instance instance)
 {
   gassign *rstmt = NULL;
 
@@ -10862,6 +10946,21 @@ vectorize_slp_instance_root_stmt (slp_tree node, slp_instance instance)
       update_stmt (gsi_stmt (rgsi));
       return;
     }
+  else if (instance->kind == slp_inst_kind_gcond)
+    {
+      /* Only support a single root for now as we can't codegen CFG yet and so we
+         can't support lane > 1 at this time.  */
+      gcc_assert (instance->root_stmts.length () == 1);
+      auto root_stmt_info = instance->root_stmts[0];
+      auto last_stmt = vect_find_first_scalar_stmt_in_slp (node)->stmt;
+      gimple_stmt_iterator rgsi = gsi_for_stmt (last_stmt);
+      gimple *vec_stmt = NULL;
+      gcc_assert (SLP_TREE_NUMBER_OF_VEC_STMTS (node) != 0);
+      bool res = vectorizable_early_exit (vinfo, root_stmt_info, &rgsi,
+                                          &vec_stmt, node, NULL);
+      gcc_assert (res);
+      return;
+    }
   else
     gcc_unreachable ();
 
@@ -11080,7 +11179,7 @@ vect_schedule_slp (vec_info *vinfo, const vec<slp_instance> &slp_instances)
         vect_schedule_scc (vinfo, node, instance, scc_info, maxdfs, stack);
 
       if (!SLP_INSTANCE_ROOT_STMTS (instance).is_empty ())
-        vectorize_slp_instance_root_stmt (node, instance);
+        vectorize_slp_instance_root_stmt (vinfo, node, instance);
 
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b72b54d666879d8485f8d972b4e8d9dc64bc86b3..8f3f35989879199ffd0eb24729cb7ade856a3c4d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -411,6 +411,7 @@ vect_stmt_relevant_p (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
         dump_printf_loc (MSG_NOTE, vect_location,
                          "vec_stmt_relevant_p: induction forced for "
                          "early break.\n");
+      LOOP_VINFO_EARLY_BREAKS_LIVE_STMTS (loop_vinfo).safe_push (stmt_info);
       *live_p = true;
 
     }
@@ -12933,7 +12934,7 @@ vectorizable_comparison (vec_info *vinfo,
 /* Check to see if the current early break given in STMT_INFO is valid for
    vectorization.  */
 
-static bool
+bool
 vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
                          gimple_stmt_iterator *gsi, gimple **vec_stmt,
                          slp_tree slp_node, stmt_vector_for_cost *cost_vec)
@@ -12958,7 +12959,7 @@ vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
   tree op0;
   enum vect_def_type dt0;
   if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 0, &op0, &slp_op0, &dt0,
-                            &vectype))
+                           &vectype))
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -12966,6 +12967,13 @@ vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
       return false;
     }
 
+  /* For SLP we don't want to use the type of the operands of the SLP node, when
+     vectorizing using SLP slp_node will be the children of the gcond and we want to
+     use the type of the direct children which since the gcond is root will be the
+     current node, rather than a child node as vect_is_simple_use assumes.  */
+  if (slp_node)
+    vectype = SLP_TREE_VECTYPE (slp_node);
+
   if (!vectype)
     return false;
 
@@ -13060,9 +13068,18 @@ vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location, "transform early-exit.\n");
 
-  if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
-                                  vec_stmt, slp_node, cost_vec))
-    gcc_unreachable ();
+  /* For SLP we don't do codegen of the body starting from the gcond, the gconds are
+     roots and so by the time we get to them we have already codegened the SLP tree
+     and so we shouldn't try to do so again.  The arguments have already been
+     vectorized.  It's not very clean to do this here, but the masking code below is
+     complex and this keeps it all in one place to ease fixes and backports.  Once we
+     drop the non-SLP loop vect or split vectorizable_* this can be simplified.  */
+  if (!slp_node)
+    {
+      if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
+                                      vec_stmt, slp_node, cost_vec))
+        gcc_unreachable ();
+    }
 
   gimple *stmt = STMT_VINFO_STMT (stmt_info);
   basic_block cond_bb = gimple_bb (stmt);
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 490061aea2f6d465d9589eb97bbd34a920d76b1c..53483303c4ac3482760fe722354f602e0243e5a2 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -252,7 +252,8 @@ enum slp_instance_kind {
     slp_inst_kind_reduc_group,
     slp_inst_kind_reduc_chain,
     slp_inst_kind_bb_reduc,
-    slp_inst_kind_ctor
+    slp_inst_kind_ctor,
+    slp_inst_kind_gcond
 };
 
 /* SLP instance is a sequence of stmts in a loop that can be packed into
@@ -977,6 +978,10 @@ public:
   /* Statements whose VUSES need updating if early break vectorization is to
      happen.  */
   auto_vec<gimple*> early_break_vuses;
+
+  /* Record statements that are needed to be live for early break vectorization
+     but may not have an LC PHI node materialized yet in the exits.  */
+  auto_vec<stmt_vec_info> early_break_live_stmts;
 } *loop_vec_info;
 
 /* Access Functions.  */
@@ -1036,6 +1041,8 @@ public:
 #define LOOP_VINFO_EARLY_BRK_STORES(L)     (L)->early_break_stores
 #define LOOP_VINFO_EARLY_BREAKS_VECT_PEELED(L)  \
   (single_pred ((L)->loop->latch) != (L)->vec_loop_iv_exit->src)
+#define LOOP_VINFO_EARLY_BREAKS_LIVE_STMTS(L)  \
+  (L)->early_break_live_stmts
 #define LOOP_VINFO_EARLY_BRK_DEST_BB(L)    (L)->early_break_dest_bb
 #define LOOP_VINFO_EARLY_BRK_VUSES(L)      (L)->early_break_vuses
 #define LOOP_VINFO_LOOP_CONDS(L)           (L)->conds
@@ -2522,6 +2529,9 @@ extern bool vectorizable_phi (vec_info *, stmt_vec_info, gimple **, slp_tree,
                               stmt_vector_for_cost *);
 extern bool vectorizable_recurr (loop_vec_info, stmt_vec_info,
                                  gimple **, slp_tree, stmt_vector_for_cost *);
+extern bool vectorizable_early_exit (vec_info *, stmt_vec_info,
+                                     gimple_stmt_iterator *, gimple **,
+                                     slp_tree, stmt_vector_for_cost *);
 extern bool vect_emulated_vector_p (tree);
 extern bool vect_can_vectorize_without_simd_p (tree_code);
 extern bool vect_can_vectorize_without_simd_p (code_helper);