From patchwork Mon Oct 14 10:55:40 2024
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1996807
Date: Mon, 14 Oct 2024 11:55:40 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de
Subject: [PATCH 1/4]middle-end: support multi-step zero-extends using VEC_PERM_EXPR

Hi All,

This patch series adds support for a target to do a direct conversion for zero
extends using permutes.

To do this it uses a target hook use_permute_for_promotion which must be
implemented by targets.  This hook is used to indicate:

 1. can a target do this for the given modes.
 2. is it profitable for the target to do it.
 3. can the target convert between various vector modes with a VIEW_CONVERT.
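
To illustrate what "zero extending with a permute" means, here is a minimal
standalone C++ sketch.  It is not the GCC implementation: the names permute2,
make_selector, read_u64_le and zero_extend_by_permute are hypothetical, and it
assumes 128-bit vectors with little-endian lane order.  Each output vector is
formed by one two-input permute of the input bytes with an all-zero vector, so
no output depends on an intermediate widening step.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: a 128-bit vector, modelled as 16 byte lanes.
constexpr std::size_t kLanes = 16;
using ByteVec = std::array<std::uint8_t, kLanes>;
using Selector = std::array<unsigned, kLanes>;

// Two-input byte permute: selector values 0..15 read from `a`,
// values 16..31 read from `b`.
ByteVec permute2 (const ByteVec &a, const ByteVec &b, const Selector &sel)
{
  ByteVec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    {
      assert (sel[i] < 2 * kLanes);
      out[i] = sel[i] < kLanes ? a[sel[i]] : b[sel[i] - kLanes];
    }
  return out;
}

// Selector for the k-th output vector of a u8 -> u64 extension, assuming
// little-endian lane order.  Every lane first points at lane 0 of the zero
// vector; then the low byte of each 64-bit element is redirected to the
// input byte that should be extended.
Selector make_selector (unsigned k)
{
  constexpr unsigned out_elems = kLanes / sizeof (std::uint64_t); // 2
  Selector sel;
  sel.fill (static_cast<unsigned> (kLanes));
  for (unsigned e = 0; e < out_elems; ++e)
    sel[e * sizeof (std::uint64_t)] = k * out_elems + e;
  return sel;
}

// Read 64-bit element `elem` of a byte vector in little-endian order.
std::uint64_t read_u64_le (const ByteVec &v, unsigned elem)
{
  std::uint64_t r = 0;
  for (unsigned b = 0; b < sizeof (std::uint64_t); ++b)
    r |= static_cast<std::uint64_t> (v[elem * sizeof (std::uint64_t) + b])
         << (8 * b);
  return r;
}

// Zero extend 16 bytes to 16 64-bit values using one permute per output
// vector (8 permutes in total), all reading the same input vector.
std::array<std::uint64_t, kLanes> zero_extend_by_permute (const ByteVec &in)
{
  constexpr unsigned out_elems = kLanes / sizeof (std::uint64_t);
  const ByteVec zero{};
  std::array<std::uint64_t, kLanes> out{};
  for (unsigned k = 0; k < kLanes / out_elems; ++k)
    {
      ByteVec widened = permute2 (in, zero, make_selector (k));
      for (unsigned e = 0; e < out_elems; ++e)
        out[k * out_elems + e] = read_u64_le (widened, e);
    }
  return out;
}
```

Calling zero_extend_by_permute on a vector of bytes should return the same
values widened to 64 bits.  The eight independent permutes mirror the shape of
a single-step extension, as opposed to a chain of 8->16->32->64 bit unpacks.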

Using permutes has a big benefit for multi-step zero extensions because they
both reduce the number of instructions needed and increase throughput, as the
dependency chain is removed.

Concretely on AArch64 this changes:

void test4(unsigned char *x, long long *y, int n) {
    for(int i = 0; i < n; i++) {
        y[i] = x[i];
    }
}

from generating:

.L4:
        ldr     q30, [x4], 16
        add     x3, x3, 128
        zip1    v1.16b, v30.16b, v31.16b
        zip2    v30.16b, v30.16b, v31.16b
        zip1    v2.8h, v1.8h, v31.8h
        zip1    v0.8h, v30.8h, v31.8h
        zip2    v1.8h, v1.8h, v31.8h
        zip2    v30.8h, v30.8h, v31.8h
        zip1    v26.4s, v2.4s, v31.4s
        zip1    v29.4s, v0.4s, v31.4s
        zip1    v28.4s, v1.4s, v31.4s
        zip1    v27.4s, v30.4s, v31.4s
        zip2    v2.4s, v2.4s, v31.4s
        zip2    v0.4s, v0.4s, v31.4s
        zip2    v1.4s, v1.4s, v31.4s
        zip2    v30.4s, v30.4s, v31.4s
        stp     q26, q2, [x3, -128]
        stp     q28, q1, [x3, -96]
        stp     q29, q0, [x3, -64]
        stp     q27, q30, [x3, -32]
        cmp     x4, x5
        bne     .L4

and instead we get:

.L4:
        add     x3, x3, 128
        ldr     q23, [x4], 16
        tbl     v5.16b, {v23.16b}, v31.16b
        tbl     v4.16b, {v23.16b}, v30.16b
        tbl     v3.16b, {v23.16b}, v29.16b
        tbl     v2.16b, {v23.16b}, v28.16b
        tbl     v1.16b, {v23.16b}, v27.16b
        tbl     v0.16b, {v23.16b}, v26.16b
        tbl     v22.16b, {v23.16b}, v25.16b
        tbl     v23.16b, {v23.16b}, v24.16b
        stp     q5, q4, [x3, -128]
        stp     q3, q2, [x3, -96]
        stp     q1, q0, [x3, -64]
        stp     q22, q23, [x3, -32]
        cmp     x4, x5
        bne     .L4

Tests are added in the AArch64 patch introducing the hook.  The testsuite also
already had about 800 runtime tests that get affected by this.

Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf,
x86_64-pc-linux-gnu -m32, -m64 and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* target.def (use_permute_for_promotion): New.
	* doc/tm.texi.in: Document it.
	* doc/tm.texi: Regenerate.
	* targhooks.cc (default_use_permute_for_promotion): New.
	* targhooks.h (default_use_permute_for_promotion): New.
	* tree-vect-stmts.cc (vectorizable_conversion): Support direct
	conversion with permute.
	(vect_create_vectorized_promotion_stmts): Likewise.
	(supportable_widening_operation): Likewise.
	(vect_gen_perm_mask_any): Allow vector permutes where input registers
	are half the width of the result per the GCC 14 relaxation of
	VEC_PERM_EXPR.

---

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 4deb3d2c283a2964972b94f434370a6f57ea816a..e8192590ac14005bf7cb5f731c16ee7eacb78143 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6480,6 +6480,15 @@ type @code{internal_fn}) should be considered expensive when the mask is
 all zeros.  GCC can then try to branch around the instruction instead.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_VECTORIZE_USE_PERMUTE_FOR_PROMOTION (const_tree @var{in_type}, const_tree @var{out_type})
+This hook returns true if the operation promoting @var{in_type} to
+@var{out_type} should be done as a vector permute.  If @var{out_type} is
+a signed type the operation will be done as the related unsigned type and
+converted to @var{out_type}.  If the target supports the needed permute,
+is able to convert unsigned(@var{out_type}) to @var{out_type} and it is
+beneficial to do so, the hook should return true, else false should be returned.
+@end deftypefn
+
 @deftypefn {Target Hook} {class vector_costs *} TARGET_VECTORIZE_CREATE_COSTS (vec_info *@var{vinfo}, bool @var{costing_for_scalar})
 This hook should initialize target-specific data structures in preparation
 for modeling the costs of vectorizing a loop or basic block.  The default
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 9f147ccb95cc6d4e79cdf5b265666ad502492145..c007bc707372dd374e8effc52d29b76f5bc283a1 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4303,6 +4303,8 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
 
+@hook TARGET_VECTORIZE_USE_PERMUTE_FOR_PROMOTION
+
 @hook TARGET_VECTORIZE_CREATE_COSTS
 
 @hook TARGET_VECTORIZE_BUILTIN_GATHER
diff --git a/gcc/target.def b/gcc/target.def
index b31550108883c5c3f5ffc7e46a1e8a7b839ebe83..58545d5ef4248da5850edec8f4db9f2636973598 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2056,6 +2056,20 @@ all zeros.  GCC can then try to branch around the instruction instead.",
  (unsigned ifn),
  default_empty_mask_is_expensive)
 
+/* Function to say whether a target supports and prefers to use permutes for
+   zero extensions or truncates.  */
+DEFHOOK
+(use_permute_for_promotion,
+ "This hook returns true if the operation promoting @var{in_type} to\n\
+@var{out_type} should be done as a vector permute.  If @var{out_type} is\n\
+a signed type the operation will be done as the related unsigned type and\n\
+converted to @var{out_type}.  If the target supports the needed permute,\n\
+is able to convert unsigned(@var{out_type}) to @var{out_type} and it is\n\
+beneficial to do so, the hook should return true, else false should be returned.",
+ bool,
+ (const_tree in_type, const_tree out_type),
+ default_use_permute_for_promotion)
+
 /* Target builtin that implements vector gather operation.  */
 DEFHOOK
 (builtin_gather,
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 2704d6008f14d2aa65671f002af886d3b802effa..723f8f4fda7808b6899f10f8b3fafad74d3c536f 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -124,6 +124,7 @@ extern opt_machine_mode default_vectorize_related_mode (machine_mode,
 extern opt_machine_mode default_get_mask_mode (machine_mode);
 extern bool default_empty_mask_is_expensive (unsigned);
 extern bool default_conditional_operation_is_expensive (unsigned);
+extern bool default_use_permute_for_promotion (const_tree, const_tree);
 extern vector_costs *default_vectorize_create_costs (vec_info *, bool);
 
 /* OpenACC hooks.  */
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index dc040df9fcd1182b62d83088ee7fb3a248c99f51..a487eab794fe9f1089ecb58fdfc881fdb19d28f3 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1615,6 +1615,14 @@ default_conditional_operation_is_expensive (unsigned ifn)
   return ifn == IFN_MASK_STORE;
 }
 
+/* By default no targets prefer permutes over multi step extension.  */
+
+bool
+default_use_permute_for_promotion (const_tree, const_tree)
+{
+  return false;
+}
+
 /* By default consider masked stores to be expensive.  */
 
 bool
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 4f6905f15417f90c6f36e1711a7a25071f0f507c..f2939655e4ec34111baa8894eaf769d29b1c5b82 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5129,6 +5129,111 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
   gimple *new_stmt1, *new_stmt2;
   vec<tree> vec_tmp = vNULL;
 
+  /* If we're using a VEC_PERM_EXPR then we're widening to the final type in
+     one go.  */
+  if (ch1 == VEC_PERM_EXPR
+      && op_type == unary_op)
+    {
+      vec_tmp.create (vec_oprnds0->length () * 2);
+      bool failed_p = false;
+
+      /* Extending with a vec-perm requires 2 instructions per step.  */
+      FOR_EACH_VEC_ELT (*vec_oprnds0, i, vop0)
+        {
+          tree vectype_in = TREE_TYPE (vop0);
+          tree vectype_out = TREE_TYPE (vec_dest);
+          machine_mode mode_in = TYPE_MODE (vectype_in);
+          machine_mode mode_out = TYPE_MODE (vectype_out);
+          unsigned bitsize_in = element_precision (vectype_in);
+          unsigned tot_in, tot_out;
+          unsigned HOST_WIDE_INT count;
+
+          /* We can't really support VLA here as the indexes depend on the VL.
+             VLA should really use widening instructions like widening
+             loads.  */
+          if (!GET_MODE_BITSIZE (mode_in).is_constant (&tot_in)
+              || !GET_MODE_BITSIZE (mode_out).is_constant (&tot_out)
+              || !TYPE_VECTOR_SUBPARTS (vectype_in).is_constant (&count)
+              || !TYPE_UNSIGNED (vectype_in)
+              || !targetm.vectorize.use_permute_for_promotion (vectype_in,
+                                                               vectype_out))
+            {
+              failed_p = true;
+              break;
+            }
+
+          unsigned steps = tot_out / bitsize_in;
+          tree zero = build_zero_cst (vectype_in);
+
+          unsigned chunk_size
+            = exact_div (TYPE_VECTOR_SUBPARTS (vectype_in),
+                         TYPE_VECTOR_SUBPARTS (vectype_out)).to_constant ();
+          unsigned step_size = chunk_size * (tot_out / tot_in);
+          unsigned nunits = tot_out / bitsize_in;
+
+          vec_perm_builder sel (steps, 1, 1);
+          sel.quick_grow (steps);
+
+          /* Flood fill with the out of range value first.  */
+          for (unsigned long i = 0; i < steps; ++i)
+            sel[i] = count;
+
+          tree var;
+          tree elem_in = TREE_TYPE (vectype_in);
+          machine_mode elem_mode_in = TYPE_MODE (elem_in);
+          unsigned long idx = 0;
+          tree vc_in = get_related_vectype_for_scalar_type (elem_mode_in,
+                                                            elem_in, nunits);
+
+          for (unsigned long j = 0; j < chunk_size; j++)
+            {
+              if (WORDS_BIG_ENDIAN)
+                for (int i = steps - 1; i >= 0; i -= step_size, idx++)
+                  sel[i] = idx;
+              else
+                for (int i = 0; i < (int)steps; i += step_size, idx++)
+                  sel[i] = idx;
+
+              vec_perm_indices indices (sel, 2, steps);
+
+              tree perm_mask = vect_gen_perm_mask_checked (vc_in, indices);
+              auto vec_oprnd = make_ssa_name (vc_in);
+              auto new_stmt = gimple_build_assign (vec_oprnd, VEC_PERM_EXPR,
+                                                   vop0, zero, perm_mask);
+              vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+
+              tree intvect_out = unsigned_type_for (vectype_out);
+              var = make_ssa_name (intvect_out);
+              new_stmt = gimple_build_assign (var, build1 (VIEW_CONVERT_EXPR,
+                                                           intvect_out,
+                                                           vec_oprnd));
+              vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+
+              gcc_assert (ch2.is_tree_code ());
+
+              var = make_ssa_name (vectype_out);
+              if (ch2 == VIEW_CONVERT_EXPR)
+                  new_stmt = gimple_build_assign (var,
+                                                  build1 (VIEW_CONVERT_EXPR,
+                                                          vectype_out,
+                                                          vec_oprnd));
+              else
+                  new_stmt = gimple_build_assign (var, (tree_code)ch2,
+                                                  vec_oprnd);
+
+              vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+              vec_tmp.safe_push (var);
+            }
+        }
+
+      if (!failed_p)
+        {
+          vec_oprnds0->release ();
+          *vec_oprnds0 = vec_tmp;
+          return;
+        }
+    }
+
   vec_tmp.create (vec_oprnds0->length () * 2);
   FOR_EACH_VEC_ELT (*vec_oprnds0, i, vop0)
     {
@@ -5495,6 +5600,20 @@ vectorizable_conversion (vec_info *vinfo,
               || GET_MODE_SIZE (lhs_mode) <= GET_MODE_SIZE (rhs_mode))
             goto unsupported;
 
+          /* Check to see if the target can use a permute to perform the zero
+             extension.  */
+          intermediate_type = unsigned_type_for (vectype_out);
+          if (TYPE_UNSIGNED (vectype_in)
+              && VECTOR_TYPE_P (intermediate_type)
+              && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant ()
+              && targetm.vectorize.use_permute_for_promotion (vectype_in,
+                                                              intermediate_type))
+            {
+              code1 = VEC_PERM_EXPR;
+              code2 = FLOAT_EXPR;
+              break;
+            }
+
           fltsz = GET_MODE_SIZE (lhs_mode);
           FOR_EACH_2XWIDER_MODE (rhs_mode_iter, rhs_mode)
             {
@@ -9804,7 +9923,8 @@ vect_gen_perm_mask_any (tree vectype, const vec_perm_indices &sel)
   tree mask_type;
 
   poly_uint64 nunits = sel.length ();
-  gcc_assert (known_eq (nunits, TYPE_VECTOR_SUBPARTS (vectype)));
+  gcc_assert (known_eq (nunits, TYPE_VECTOR_SUBPARTS (vectype))
+              || known_eq (nunits, TYPE_VECTOR_SUBPARTS (vectype) * 2));
 
   mask_type = build_vector_type (ssizetype, nunits);
   return vec_perm_indices_to_tree (mask_type, sel);
@@ -14397,8 +14517,20 @@ supportable_widening_operation (vec_info *vinfo,
       break;
 
     CASE_CONVERT:
-      c1 = VEC_UNPACK_LO_EXPR;
-      c2 = VEC_UNPACK_HI_EXPR;
+      {
+        tree cvt_type = unsigned_type_for (vectype_out);
+        if (TYPE_UNSIGNED (vectype_in)
+            && VECTOR_TYPE_P (cvt_type)
+            && TYPE_VECTOR_SUBPARTS (cvt_type).is_constant ()
+            && targetm.vectorize.use_permute_for_promotion (vectype_in, cvt_type))
+          {
+            *code1 = VEC_PERM_EXPR;
+            *code2 = VIEW_CONVERT_EXPR;
+            return true;
+          }
+        c1 = VEC_UNPACK_LO_EXPR;
+        c2 = VEC_UNPACK_HI_EXPR;
+      }
       break;
 
     case FLOAT_EXPR: