From patchwork Fri Oct 18 15:12:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Akram Ahmad X-Patchwork-Id: 1999267 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=hgpleS+j; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=hgpleS+j; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XVStM1g8kz1xw2 for ; Sat, 19 Oct 2024 02:14:51 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3887A3858D37 for ; Fri, 18 Oct 2024 15:14:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR05-AM6-obe.outbound.protection.outlook.com (mail-am6eur05on2061d.outbound.protection.outlook.com [IPv6:2a01:111:f403:2612::61d]) by sourceware.org (Postfix) with ESMTPS id CC1F03858C48 for ; Fri, 18 Oct 2024 15:13:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org CC1F03858C48 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org CC1F03858C48 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:2612::61d ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1729264404; cv=pass; b=DCoiLEHrc4eJkV8IWyJkX/TgmMGPtCpdMHNsqZ6IUEY2PpTpjCBTFURmq6VzuLVznio9+xOacJVuFIaiPK1nlUGQU3swsFg9Pm8QTecmjrK0nYIeApdjoOkWLY0RZdatW/UVX4G5Vd0nHFUYQEkK4GMQ2zEBsKxxJUtdepwASXA= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1729264404; c=relaxed/simple; bh=Jrg/l7f83OnzrpEYR9BYueO2D93C/aeeNCflKJ+15OY=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=FeIqEWGTRLAfgvOJD8HTxbYmKoRgSOBClpyfM9lDOYSANVLlJqxhw5+WNf9GgqEfIGaYQhl+ff94/edxyd61P7jDzG0UsCl8hwrOxo/2WWOxskGoyiDhxOE27vAivmMiHkMzST1x3/qeFgHGdBzHCRanu5/OzJ2kED39T0la5BU= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=G6E+cPKp7UIvjCgKRyZ5dQUm412v2tZTMKJ79pOcmtR8dhBYkDeboRbbIYAdvr53rCIxzpAqIssBG4DGfJTB1H7NmDkv9zQKY3DNNATWtIrJ3RBdoBYLbEzgB19q3QF+57748MlXR/24IFjfOlC3HcHEjiCyYj5QcvzuhPOM/Nt1cbK49AYL3xaV1m2DyEG7RflDvlj3zvQSMueYPHNLbXllgeFOnEwR8ux/R8O7NvkQnuIySx660BbEK309yZm2Kt+0ZFbIBlo0+dgtRmkLZXlEr1tMT+7wolNc9WvGMqboJIf+0PrmWR/FqN3fflRSq3qjSa0aHNqOnEMdG3tHUQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=O46Dq+kN/rd/2ta89I2Ag3whbFFyUC+qfC17BdGp3CY=; b=JGU1YjVLBkNFojV/E+UF4dUn1zHk8s8d1OhR81mCHjjmjWIfSN1t64ujnqxF9efIjtT6ei1kGRilpzOFe06I4uKRZtJ48GMpPjZvO/dzkTLHCGUJyiIe2SZPRZsVvzVhk+DCj4UB2bokgd2B+m37P4KJ3LOo7UgZAEX/A3rXN6KBNJUcrPyvgXSisOgLHdDhDJZglC+l6ngWYnEKnlH87tESHV0Kmr5vOjHztYsk04FnFHb3bXIineoyQ5dOuZcazUlX7qx29OhOt/5JiKlfzTHNq0Fvn9kTccr9kDU6mK/hSf5eVtnotSopVrlQBLgnIwKri1LYD1ykEbzxF1o1tQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=O46Dq+kN/rd/2ta89I2Ag3whbFFyUC+qfC17BdGp3CY=; b=hgpleS+jg812blwwtudCat721qMkxlJUkKENjcLVsWMgV8xcjPH/3MWCzXBmcjSmwD5DeX0XS/DpwGlJsL/4KbJ9VTVIQ+3N/BZ23a1tyly/JowBFxdaCeOcUTuo8pPSP9Fju5E8i7OxxdbvBm0394E8GMsSA/DyplaBi5YvKh4= Received: from DB3PR06CA0007.eurprd06.prod.outlook.com (2603:10a6:8:1::20) by VI1PR08MB10101.eurprd08.prod.outlook.com (2603:10a6:800:1ca::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.19; Fri, 18 Oct 2024 15:13:10 +0000 Received: from DU2PEPF00028D0A.eurprd03.prod.outlook.com (2603:10a6:8:1:cafe::1c) by DB3PR06CA0007.outlook.office365.com (2603:10a6:8:1::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.20 via Frontend Transport; Fri, 18 Oct 2024 15:13:10 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DU2PEPF00028D0A.mail.protection.outlook.com (10.167.242.170) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8069.17 via Frontend Transport; Fri, 18 Oct 2024 15:13:10 +0000 Received: ("Tessian outbound 40ef283ec771:v473"); Fri, 18 Oct 2024 15:13:10 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: df85427778a08bf7 X-TessianGatewayMetadata: SsTDy3iB9l3FO+/GrtD3B+/kIafgamGRtPR2tRLTzEc9URDBYhbfAbXJoImC6YRQTgjVKqjPB5XRNCSjEVIywAoQ8D5Ik+edmcDV+jLgw0S4b6WLDcOA2gKjseaeq6taSLgq3SRBwoVjbt99BBcSxlMZyJRZjtkdY7JhbCWXIYQ= X-CR-MTA-TID: 64aa7808 Received: from Lf2afa8333a5e.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id C9689F57-0050-4B80-AE08-151F28D1F639.1; Fri, 18 Oct 2024 15:12:58 +0000 Received: from EUR03-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id Lf2afa8333a5e.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 18 Oct 2024 15:12:58 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=dvfcUu5bIGa/yHpWl6vpFXTu/X3g32VJ3hYfUlEQHerQ8SvhhvJP60J3Gppy/3k1/U6WvVsyDlynB844bid+Zdu+EiXUcjzEKg6Edps+zJl9WIQTkAEq9V9rFqN/kEeC4Zui8I+SAnqTt4KZEzO9aZYm3TKA6v6H9xW1Kzk5DJb42RUkgQSCoOUx8nuJJVu2JpM5tMfbH6e49dyT01HAhqjen+V76rMC1SW2qkQyJtea2YHNe7HuzoGVadG6Jz0NUjCqP0ZJX4ZExf+DgAZUY/ZzV07FXxlW9IcSE8J4rO1tZdDKm8TdyGi39YZKqmEMpOrGMUWMT/CNpNI/VGlltQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=O46Dq+kN/rd/2ta89I2Ag3whbFFyUC+qfC17BdGp3CY=; b=CVehH9+kyW4ED04S55CtdUB0Un+1os5BfpbOhEz3QA6Y/Rngz0eB5KCIA/asKS1LnNQ1cuF8mkW4s6AOoxWj4kyamF37fNgG4JG9qKk+BVz5F/hRss3C3EX8JOPr0a+4P8MGZge8jMra8zW7ej+gWfy32x2YIfdx9wrT+7PkHtmsJMz/FIRz0u5Squp7Xs/yXGtt0O9nExO5iH4Zz9fAmri1P2hqlnvErzqI1bQwV1fznRG9Ik+FiQ8/jrTpbzRFzTsWtirh6owfjRCXYdwbkSAiLHPx6hKlchSz300rn7gAVTRqTbUToccTSo8wOXnIZvHbrEKeNovAc+wIRYSM4A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=O46Dq+kN/rd/2ta89I2Ag3whbFFyUC+qfC17BdGp3CY=; b=hgpleS+jg812blwwtudCat721qMkxlJUkKENjcLVsWMgV8xcjPH/3MWCzXBmcjSmwD5DeX0XS/DpwGlJsL/4KbJ9VTVIQ+3N/BZ23a1tyly/JowBFxdaCeOcUTuo8pPSP9Fju5E8i7OxxdbvBm0394E8GMsSA/DyplaBi5YvKh4= Received: from AS4P195CA0001.EURP195.PROD.OUTLOOK.COM (2603:10a6:20b:5e2::8) by AS2PR08MB9738.eurprd08.prod.outlook.com (2603:10a6:20b:606::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.18; Fri, 18 Oct 2024 15:12:53 +0000 Received: from AMS0EPF000001A4.eurprd05.prod.outlook.com (2603:10a6:20b:5e2:cafe::a9) by AS4P195CA0001.outlook.office365.com (2603:10a6:20b:5e2::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.21 via Frontend Transport; Fri, 18 Oct 2024 15:12:53 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by AMS0EPF000001A4.mail.protection.outlook.com (10.167.16.229) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8069.17 via Frontend Transport; Fri, 18 Oct 2024 15:12:53 +0000 Received: from AZ-NEU-EX04.Arm.com (10.251.24.32) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Fri, 18 Oct 2024 15:12:52 +0000 Received: from ip-10-248-139-139.eu-west-1.compute.internal (10.252.78.54) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39 via Frontend Transport; Fri, 18 Oct 2024 15:12:52 +0000 From: Akram Ahmad To: CC: Akram Ahmad Subject: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic Date: Fri, 18 Oct 2024 15:12:18 +0000 Message-ID: <20241018151219.308512-2-Akram.Ahmad@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241018151219.308512-1-Akram.Ahmad@arm.com> References: <20241018151219.308512-1-Akram.Ahmad@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: AMS0EPF000001A4:EE_|AS2PR08MB9738:EE_|DU2PEPF00028D0A:EE_|VI1PR08MB10101:EE_ X-MS-Office365-Filtering-Correlation-Id: ff5a5752-8841-40e6-de6a-08dcef8760cb x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; ARA:13230040|1800799024|376014|36860700013|82310400026; X-Microsoft-Antispam-Message-Info-Original: QrTQAakgXNnOyPwKzJ7xwnuXGYwuAsfaQwUNJJ6Z/qLhrqP34o5O9GrbXxTqvJGRlCkt9r5o9pxGgT3xRlLhxuv/ZG46FKBo4tIFHstQBM0htJPMti4MEaYyF3mV5yXXqSUjd8eDTfputs6bPctdRkFsMtxFHRizvlO3ZvAJJ5umbT9BtzLckBZqGFePJE5vQWNGjZhPPbpHGPeFbqiUUepzgT1xkMl/Opcz5MWj4farp6enbztC6j7EhifXD0oWl6HnsrVenE4hk87AgOnqAOPjR513EwgHvd0fS3d8r4PVd5pWAli93Fnt3LEPSBcItDaxA7iZhotjl0lKy2VYUstojHakp/l0mZXkDoaPC/fUbgj90IFP3XR9lptOzo93ZjS7qsJov2Prr/qL9S5YLk0sU0bMR09pfDQZnljCnYsYQKj4ZjqdaBVHRj6W1/b5cgl36Bf63karSxrxtyzUOCDZXQb6GvBzTzrbH+2TAt3bh5vSxNX6QnOa2PWnR9xn7Yhp8ZmVrhQJWTLkohbXG6eQqWWpH+Cj4Z5nODxWjozEnLQ64jbBbBOTLQ0le3SQxN9qJu6pP1IQAFjwRBbD28mSBilgE7Yx0TVYia1At5TMYJnAsaizuzy0zzYTV4DwpurOuTpusaoez0CDIef0b1kx/j+j5paYMvTRyaHbXeIjCRzDT/z64kA3v7mqb4R2PqAzpm0TxfFlsQHzEvOiXEtkHqY/6jUbUqjK1ygUcylFSDgvWf2GmTpT29UMnh8oadHmzhgbwcwAeiRl0zemg5+GYJGg32Uo19oQTxIYokoh7vni+qakCv5AqMGv9Dh03WFpeG9c79+qCOEOczU/vGrO/b8IsAn73nKq0jc3u12lrAnpiI1JEsEDLwsrfZ2Hlsg840wkMNH1G5iPjihtYwhmrg9I2Z3o/ZqN21zK7yOnHtjGbuZOiz1bahAmZNuLEJOziUhXLhY9NIOjqPp6STBS4Jvm1Utw03QxSsnNTrZzgQGHZ4c7SwMid+ZF455+3PwXGG08PerwpNJuQzSoDBISGKgeGF2UYx36+YZFXGduez9/M8Z8T4J9YoBYMN8ABOPVXigYdQwL6z6tUI4bEaAU5gAjYl53bSLHBcaWIyb3Tn8iDr1sdUAVfzRO8t1gn+sUe5UT5eGAnt7xxmqtshGYqqtmXbIsHGQOgxBUmKZGqc+eNf2vHAGQ+qjTYNsO7l3hzXC21PxRvLMmZAXL2+oDiVWzb91lSveMIYoET7t0ztKW6as25b0shpaAoZyFch9i1O4l52VeLOWNcMnXTnBsiNV/mDfGuHbrEukTPokhWlelIFj6+CWLMDFDk4DYf4d7Xy+7GngbtORzVz3Jo35EMfoV94pVIn2v32JU044= X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230040)(1800799024)(376014)(36860700013)(82310400026); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB9738 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:20b:5e2::8]; domain=AS4P195CA0001.EURP195.PROD.OUTLOOK.COM X-MS-Exchange-Transport-CrossTenantHeadersStripped: DU2PEPF00028D0A.eurprd03.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 96b54310-4193-4509-f3bb-08dcef8756d1 X-Microsoft-Antispam: BCL:0; ARA:13230040|376014|36860700013|82310400026|35042699022|1800799024; X-Microsoft-Antispam-Message-Info: h1IpzFPIEAJhxfN0QOt1phVBgt4bQvXX4sBYZPHFeDq/7SFhm0tIKuF4XT6DVyOkIFmj51/zlIPuuFV5WmPe+B/pRVVgqMIiNnSQ3QBdv5LwBO3zHIUPIwgljiB+Amj6mppm8ndIgU6NU6Uymj78LNXv+6+kdR7UV/QhU238Qsa4EvBgbfDWBFaHYydDX+WqNDDhNiXWONa+7Y0BPycxTooPmdlloKX+e81wk3jaIzEM4RxkiP/xGCoijVRprIOgyhhiB/rEUVZI9koJosmbWXrtdfsBdhEgiOYi4cxviBPRtWPEIlfz0Ma1iuqNs7OXwewiS3UadPR2MJqNq7hnl1Z/U2UndlfvibRdEeqdbPg3HRwJoRv+69wSt3lZ9iLY94tOsu2PVI5hTmEnMI3rU5j8DycbgqJB8OdUgQ6LZK3wBynLjqm99WNRNPDzTFkG4vzw5SAGayFohGvcUjRzmOmbgMNtWyaMmNlUlsdT2unFs+i7Bx4l5x3BynmHAaWR8RG+XXYXi8PoPYp2cVnvUQPKBZJeOaABVGGllOuh10KZj2JHNKUjXijrRxtBlKRv8HI627RKVZPW5d32JXdUxGFDVX6KoD/QSgADX2YQQOmO6jJf/wExx7Ot8KZXfPkz2Sj0USRLh04U9X76OZebjR1VHUSEJSoZUivLYIpf/rRGPspiqc49Ey9zbu1ChKHEYAC9oXCFAVlFIY7dZ1pP+TqZteXSgf/9NBYwTtyqhDbPZRUhBA+BSRd7SsTr5k+cHM1l3G+w/fv53zEO/c6lslCvyPQbNI2dN8nfWvZQS9+jE3LZJVEtPBPUGHZOl2/D77op9s28hGorQIJ8UP2m5Teh1uveG0fwfeluaipQ8Jil8RhrjmnxzGbtOoMiF504b4rtikYqeezhHe3TV5fUAzOrmtWI4F07EeEC9xQYlc3B96CnZ7B+03iP7avNCZ0rmL8gsiTxznE64sTC6/yVG92IeicUJmyO9j+Yvd9Jj3jg7W+kHmNSfPcHBoxikOnz9hSNIjxSlXhc802sCpmGX7FdINPsz/lHsjlFrlSqbaYm1VJPvA95M13S1zKHegLJTadjJVz7ejl6VI9HxYK2HizyevRHIiUPEZ/mTLhLEg0HQzkGr10Z5MOps9WM5dLeB2/d/iUNZ4VV0NLUlHEheLLE5g+4u7adhXQ5yP5R16BtFKkkwFcFTAiMECH4xvv6fwJLHKAHB3c2KttftSBJ/a6y3eVxKOBaG9yQghuaFYQG03mx9Ic8fMOsesgTQIr61gvsdcExSvXvvOxPH0jUS/c28G/6Y/MdcbVFwvm4+5vJh5CKzJEGjlr823rbX6z2FjGTEDha4mEdCdj3PfWBOfZW+dCsFvObWoGPahvHea0= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(376014)(36860700013)(82310400026)(35042699022)(1800799024); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Oct 2024 15:13:10.3208 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: ff5a5752-8841-40e6-de6a-08dcef8760cb X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DU2PEPF00028D0A.eurprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR08MB10101 X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_SHORT, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This renames the existing {s,u}q{add,sub} instructions to use the standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and IFN_SAT_SUB. The NEON intrinsics for saturating arithmetic and their corresponding builtins are changed to use these standard names too. Using the standard names for the instructions causes 32 and 64-bit unsigned scalar saturating arithmetic to use the NEON instructions, resulting in an additional (and inefficient) FMOV to be generated when the original operands are in GP registers. This patch therefore also restores the original behaviour of using the adds/subs instructions in this circumstance. Additional tests are written for the scalar and Adv. SIMD cases to ensure that the correct instructions are used. The NEON intrinsics are already tested elsewhere. gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc: Expand iterators. * config/aarch64/aarch64-simd-builtins.def: Use standard names * config/aarch64/aarch64-simd.md: Use standard names, split insn definitions on signedness of operator and type of operands. * config/aarch64/arm_neon.h: Use standard builtin names. * config/aarch64/iterators.md: Add VSDQ_I_QI_HI iterator to simplify splitting of insn for unsigned scalar arithmetic. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc: Template file for unsigned vector saturating arithmetic tests. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c: 8-bit vector type tests. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c: 16-bit vector type tests. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c: 32-bit vector type tests. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c: 64-bit vector type tests. * gcc.target/aarch64/saturating_arithmetic.inc: Template file for scalar saturating arithmetic tests. * gcc.target/aarch64/saturating_arithmetic_1.c: 8-bit tests. * gcc.target/aarch64/saturating_arithmetic_2.c: 16-bit tests. * gcc.target/aarch64/saturating_arithmetic_3.c: 32-bit tests. * gcc.target/aarch64/saturating_arithmetic_4.c: 64-bit tests. --- gcc/config/aarch64/aarch64-builtins.cc | 13 +++ gcc/config/aarch64/aarch64-simd-builtins.def | 8 +- gcc/config/aarch64/aarch64-simd.md | 93 +++++++++++++++++- gcc/config/aarch64/arm_neon.h | 96 +++++++++---------- gcc/config/aarch64/iterators.md | 4 + .../saturating_arithmetic_autovect.inc | 58 +++++++++++ .../saturating_arithmetic_autovect_1.c | 79 +++++++++++++++ .../saturating_arithmetic_autovect_2.c | 79 +++++++++++++++ .../saturating_arithmetic_autovect_3.c | 75 +++++++++++++++ .../saturating_arithmetic_autovect_4.c | 77 +++++++++++++++ .../aarch64/saturating_arithmetic.inc | 39 ++++++++ .../aarch64/saturating_arithmetic_1.c | 41 ++++++++ .../aarch64/saturating_arithmetic_2.c | 41 ++++++++ .../aarch64/saturating_arithmetic_3.c | 30 ++++++ .../aarch64/saturating_arithmetic_4.c | 30 ++++++ 15 files changed, 707 insertions(+), 56 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/saturating_arithmetic.inc create mode 100644 gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_4.c diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc index 7d737877e0b..f2a1b6ddbf6 100644 --- a/gcc/config/aarch64/aarch64-builtins.cc +++ b/gcc/config/aarch64/aarch64-builtins.cc @@ -3849,6 +3849,19 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt, new_stmt = gimple_build_assign (gimple_call_lhs (stmt), LSHIFT_EXPR, args[0], args[1]); break; + + /* lower saturating add/sub neon builtins to gimple. */ + BUILTIN_VSDQ_I (BINOP, ssadd, 3, NONE) + BUILTIN_VSDQ_I (BINOPU, usadd, 3, NONE) + new_stmt = gimple_build_call_internal (IFN_SAT_ADD, 2, args[0], args[1]); + gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt)); + break; + BUILTIN_VSDQ_I (BINOP, sssub, 3, NONE) + BUILTIN_VSDQ_I (BINOPU, ussub, 3, NONE) + new_stmt = gimple_build_call_internal (IFN_SAT_SUB, 2, args[0], args[1]); + gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt)); + break; + BUILTIN_VSDQ_I_DI (BINOP, sshl, 0, NONE) BUILTIN_VSDQ_I_DI (BINOP_UUS, ushl, 0, NONE) { diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index 0814f8ba14f..43a0a62caee 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -71,10 +71,10 @@ BUILTIN_VSDQ_I (BINOP, sqrshl, 0, NONE) BUILTIN_VSDQ_I (BINOP_UUS, uqrshl, 0, NONE) /* Implemented by aarch64_. */ - BUILTIN_VSDQ_I (BINOP, sqadd, 0, NONE) - BUILTIN_VSDQ_I (BINOPU, uqadd, 0, NONE) - BUILTIN_VSDQ_I (BINOP, sqsub, 0, NONE) - BUILTIN_VSDQ_I (BINOPU, uqsub, 0, NONE) + BUILTIN_VSDQ_I (BINOP, ssadd, 3, NONE) + BUILTIN_VSDQ_I (BINOPU, usadd, 3, NONE) + BUILTIN_VSDQ_I (BINOP, sssub, 3, NONE) + BUILTIN_VSDQ_I (BINOPU, ussub, 3, NONE) /* Implemented by aarch64_qadd. */ BUILTIN_VSDQ_I (BINOP_SSU, suqadd, 0, NONE) BUILTIN_VSDQ_I (BINOP_UUS, usqadd, 0, NONE) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index bf272bc0b4e..f6cf37c3231 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -5221,15 +5221,100 @@ ) ;; q -(define_insn "aarch64_q" - [(set (match_operand:VSDQ_I 0 "register_operand" "=w") - (BINQOPS:VSDQ_I (match_operand:VSDQ_I 1 "register_operand" "w") - (match_operand:VSDQ_I 2 "register_operand" "w")))] +(define_insn "s3" + [(set (match_operand:VSDQ_I_QI_HI 0 "register_operand" "=w") + (BINQOPS:VSDQ_I_QI_HI (match_operand:VSDQ_I_QI_HI 1 "register_operand" "w") + (match_operand:VSDQ_I_QI_HI 2 "register_operand" "w")))] "TARGET_SIMD" "q\\t%0, %1, %2" [(set_attr "type" "neon_q")] ) +(define_insn "s3" + [(set (match_operand:GPI 0 "register_operand" "=w") + (SBINQOPS:GPI (match_operand:GPI 1 "register_operand" "w") + (match_operand:GPI 2 "register_operand" "w")))] + "TARGET_SIMD" + "q\\t%0, %1, %2" + [(set_attr "type" "neon_q")] +) + +;; If this is an unsigned saturating arithmetic and the operands arrive in GP +;; registers, then it is possible to perform this arithmetic without using the +;; NEON instructions. This avoids using unnecessary fmov instructions to move +;; either the operands or the result to and from GP regs to FP regs. This is +;; only possible with SImode and DImode. + +(define_insn_and_split "s3" + [(set (match_operand:GPI 0 "register_operand") + (UBINQOPS:GPI (match_operand:GPI 1 "register_operand") + (match_operand:GPI 2 "aarch64_plus_operand")))] + "" + {@ [ cons: =0, 1 , 2 ; attrs: type, arch, length ] + [ w , w , w ; neon_q, *, 4 ] q\\t%0, %1, %2 + [ r , r , JIr ; * , *, 8 ] # + } + "&& reload_completed && GP_REGNUM_P (REGNO (operands[0]))" + [(set (match_dup 0) + (if_then_else:GPI + (match_operator 3 "comparison_operator" [(reg:CC CC_REGNUM) (const_int 0)]) + (match_dup 0) + (match_operand:GPI 4 "immediate_operand" "i")))] + { + + if (REG_P (operands[2])) + { + switch () + { + case US_MINUS: + emit_insn (gen_sub3_compare1 (operands[0], operands[1], + operands[2])); + break; + case US_PLUS: + emit_insn (gen_add3_compare0 (operands[0], operands[1], + operands[2])); + break; + default: + break; + } + } + else + { + unsigned long imm = UINTVAL (operands[2]); + gcc_assert (imm != 0); + rtx neg_imm = gen_int_mode (-imm, mode); + switch () + { + case US_MINUS: + emit_insn (gen_sub3_compare1_imm (operands[0], operands[1], + operands[2], neg_imm)); + break; + case US_PLUS: + emit_insn (gen_sub3_compare1_imm (operands[0], operands[1], + neg_imm, operands[2])); + break; + default: + break; + } + } + + rtx ccin = gen_rtx_REG (CC_Cmode, CC_REGNUM); + switch () + { + case US_PLUS: + operands[3] = gen_rtx_LTU (mode, ccin, const0_rtx); + operands[4] = gen_int_mode (-1, mode); + break; + case US_MINUS: + operands[3] = gen_rtx_GEU (mode, ccin, const0_rtx); + operands[4] = const0_rtx; + break; + default: + break; + } + } +) + ;; suqadd and usqadd (define_insn "aarch64_qadd" diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index e376685489d..3acf12fd7a1 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -1904,35 +1904,35 @@ __extension__ extern __inline int8x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_s8 (int8x8_t __a, int8x8_t __b) { - return (int8x8_t) __builtin_aarch64_sqaddv8qi (__a, __b); + return (int8x8_t) __builtin_aarch64_ssaddv8qi (__a, __b); } __extension__ extern __inline int16x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_s16 (int16x4_t __a, int16x4_t __b) { - return (int16x4_t) __builtin_aarch64_sqaddv4hi (__a, __b); + return (int16x4_t) __builtin_aarch64_ssaddv4hi (__a, __b); } __extension__ extern __inline int32x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_s32 (int32x2_t __a, int32x2_t __b) { - return (int32x2_t) __builtin_aarch64_sqaddv2si (__a, __b); + return (int32x2_t) __builtin_aarch64_ssaddv2si (__a, __b); } __extension__ extern __inline int64x1_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_s64 (int64x1_t __a, int64x1_t __b) { - return (int64x1_t) {__builtin_aarch64_sqadddi (__a[0], __b[0])}; + return (int64x1_t) {__builtin_aarch64_ssadddi (__a[0], __b[0])}; } __extension__ extern __inline uint8x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_u8 (uint8x8_t __a, uint8x8_t __b) { - return __builtin_aarch64_uqaddv8qi_uuu (__a, __b); + return __builtin_aarch64_usaddv8qi_uuu (__a, __b); } __extension__ extern __inline int8x8_t @@ -2191,189 +2191,189 @@ __extension__ extern __inline uint16x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_u16 (uint16x4_t __a, uint16x4_t __b) { - return __builtin_aarch64_uqaddv4hi_uuu (__a, __b); + return __builtin_aarch64_usaddv4hi_uuu (__a, __b); } __extension__ extern __inline uint32x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_u32 (uint32x2_t __a, uint32x2_t __b) { - return __builtin_aarch64_uqaddv2si_uuu (__a, __b); + return __builtin_aarch64_usaddv2si_uuu (__a, __b); } __extension__ extern __inline uint64x1_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadd_u64 (uint64x1_t __a, uint64x1_t __b) { - return (uint64x1_t) {__builtin_aarch64_uqadddi_uuu (__a[0], __b[0])}; + return (uint64x1_t) {__builtin_aarch64_usadddi_uuu (__a[0], __b[0])}; } __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_s8 (int8x16_t __a, int8x16_t __b) { - return (int8x16_t) __builtin_aarch64_sqaddv16qi (__a, __b); + return (int8x16_t) __builtin_aarch64_ssaddv16qi (__a, __b); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_s16 (int16x8_t __a, int16x8_t __b) { - return (int16x8_t) __builtin_aarch64_sqaddv8hi (__a, __b); + return (int16x8_t) __builtin_aarch64_ssaddv8hi (__a, __b); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_s32 (int32x4_t __a, int32x4_t __b) { - return (int32x4_t) __builtin_aarch64_sqaddv4si (__a, __b); + return (int32x4_t) __builtin_aarch64_ssaddv4si (__a, __b); } __extension__ extern __inline int64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_s64 (int64x2_t __a, int64x2_t __b) { - return (int64x2_t) __builtin_aarch64_sqaddv2di (__a, __b); + return (int64x2_t) __builtin_aarch64_ssaddv2di (__a, __b); } __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_u8 (uint8x16_t __a, uint8x16_t __b) { - return __builtin_aarch64_uqaddv16qi_uuu (__a, __b); + return __builtin_aarch64_usaddv16qi_uuu (__a, __b); } __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_u16 (uint16x8_t __a, uint16x8_t __b) { - return __builtin_aarch64_uqaddv8hi_uuu (__a, __b); + return __builtin_aarch64_usaddv8hi_uuu (__a, __b); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_u32 (uint32x4_t __a, uint32x4_t __b) { - return __builtin_aarch64_uqaddv4si_uuu (__a, __b); + return __builtin_aarch64_usaddv4si_uuu (__a, __b); } __extension__ extern __inline uint64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddq_u64 (uint64x2_t __a, uint64x2_t __b) { - return __builtin_aarch64_uqaddv2di_uuu (__a, __b); + return __builtin_aarch64_usaddv2di_uuu (__a, __b); } __extension__ extern __inline int8x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_s8 (int8x8_t __a, int8x8_t __b) { - return (int8x8_t) __builtin_aarch64_sqsubv8qi (__a, __b); + return (int8x8_t) __builtin_aarch64_sssubv8qi (__a, __b); } __extension__ extern __inline int16x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_s16 (int16x4_t __a, int16x4_t __b) { - return (int16x4_t) __builtin_aarch64_sqsubv4hi (__a, __b); + return (int16x4_t) __builtin_aarch64_sssubv4hi (__a, __b); } __extension__ extern __inline int32x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_s32 (int32x2_t __a, int32x2_t __b) { - return (int32x2_t) __builtin_aarch64_sqsubv2si (__a, __b); + return (int32x2_t) __builtin_aarch64_sssubv2si (__a, __b); } __extension__ extern __inline int64x1_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_s64 (int64x1_t __a, int64x1_t __b) { - return (int64x1_t) {__builtin_aarch64_sqsubdi (__a[0], __b[0])}; + return (int64x1_t) {__builtin_aarch64_sssubdi (__a[0], __b[0])}; } __extension__ extern __inline uint8x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_u8 (uint8x8_t __a, uint8x8_t __b) { - return __builtin_aarch64_uqsubv8qi_uuu (__a, __b); + return __builtin_aarch64_ussubv8qi_uuu (__a, __b); } __extension__ extern __inline uint16x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_u16 (uint16x4_t __a, uint16x4_t __b) { - return __builtin_aarch64_uqsubv4hi_uuu (__a, __b); + return __builtin_aarch64_ussubv4hi_uuu (__a, __b); } __extension__ extern __inline uint32x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_u32 (uint32x2_t __a, uint32x2_t __b) { - return __builtin_aarch64_uqsubv2si_uuu (__a, __b); + return __builtin_aarch64_ussubv2si_uuu (__a, __b); } __extension__ extern __inline uint64x1_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsub_u64 (uint64x1_t __a, uint64x1_t __b) { - return (uint64x1_t) {__builtin_aarch64_uqsubdi_uuu (__a[0], __b[0])}; + return (uint64x1_t) {__builtin_aarch64_ussubdi_uuu (__a[0], __b[0])}; } __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_s8 (int8x16_t __a, int8x16_t __b) { - return (int8x16_t) __builtin_aarch64_sqsubv16qi (__a, __b); + return (int8x16_t) __builtin_aarch64_sssubv16qi (__a, __b); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_s16 (int16x8_t __a, int16x8_t __b) { - return (int16x8_t) __builtin_aarch64_sqsubv8hi (__a, __b); + return (int16x8_t) __builtin_aarch64_sssubv8hi (__a, __b); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_s32 (int32x4_t __a, int32x4_t __b) { - return (int32x4_t) __builtin_aarch64_sqsubv4si (__a, __b); + return (int32x4_t) __builtin_aarch64_sssubv4si (__a, __b); } __extension__ extern __inline int64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_s64 (int64x2_t __a, int64x2_t __b) { - return (int64x2_t) __builtin_aarch64_sqsubv2di (__a, __b); + return (int64x2_t) __builtin_aarch64_sssubv2di (__a, __b); } __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_u8 (uint8x16_t __a, uint8x16_t __b) { - return __builtin_aarch64_uqsubv16qi_uuu (__a, __b); + return __builtin_aarch64_ussubv16qi_uuu (__a, __b); } __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_u16 (uint16x8_t __a, uint16x8_t __b) { - return __builtin_aarch64_uqsubv8hi_uuu (__a, __b); + return __builtin_aarch64_ussubv8hi_uuu (__a, __b); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_u32 (uint32x4_t __a, uint32x4_t __b) { - return __builtin_aarch64_uqsubv4si_uuu (__a, __b); + return __builtin_aarch64_ussubv4si_uuu (__a, __b); } __extension__ extern __inline uint64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubq_u64 (uint64x2_t __a, uint64x2_t __b) { - return __builtin_aarch64_uqsubv2di_uuu (__a, __b); + return __builtin_aarch64_ussubv2di_uuu (__a, __b); } __extension__ extern __inline int8x8_t @@ -17583,56 +17583,56 @@ __extension__ extern __inline int8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddb_s8 (int8_t __a, int8_t __b) { - return (int8_t) __builtin_aarch64_sqaddqi (__a, __b); + return (int8_t) __builtin_aarch64_ssaddqi (__a, __b); } __extension__ extern __inline int16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddh_s16 (int16_t __a, int16_t __b) { - return (int16_t) __builtin_aarch64_sqaddhi (__a, __b); + return (int16_t) __builtin_aarch64_ssaddhi (__a, __b); } __extension__ extern __inline int32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadds_s32 (int32_t __a, int32_t __b) { - return (int32_t) __builtin_aarch64_sqaddsi (__a, __b); + return (int32_t) __builtin_aarch64_ssaddsi (__a, __b); } __extension__ extern __inline int64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddd_s64 (int64_t __a, int64_t __b) { - return __builtin_aarch64_sqadddi (__a, __b); + return __builtin_aarch64_ssadddi (__a, __b); } __extension__ extern __inline uint8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddb_u8 (uint8_t __a, uint8_t __b) { - return (uint8_t) __builtin_aarch64_uqaddqi_uuu (__a, __b); + return (uint8_t) __builtin_aarch64_usaddqi_uuu (__a, __b); } __extension__ extern __inline uint16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddh_u16 (uint16_t __a, uint16_t __b) { - return (uint16_t) __builtin_aarch64_uqaddhi_uuu (__a, __b); + return (uint16_t) __builtin_aarch64_usaddhi_uuu (__a, __b); } __extension__ extern __inline uint32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqadds_u32 (uint32_t __a, uint32_t __b) { - return (uint32_t) __builtin_aarch64_uqaddsi_uuu (__a, __b); + return (uint32_t) __builtin_aarch64_usaddsi_uuu (__a, __b); } __extension__ extern __inline uint64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqaddd_u64 (uint64_t __a, uint64_t __b) { - return __builtin_aarch64_uqadddi_uuu (__a, __b); + return __builtin_aarch64_usadddi_uuu (__a, __b); } /* vqdmlal */ @@ -19282,56 +19282,56 @@ __extension__ extern __inline int8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubb_s8 (int8_t __a, int8_t __b) { - return (int8_t) __builtin_aarch64_sqsubqi (__a, __b); + return (int8_t) __builtin_aarch64_sssubqi (__a, __b); } __extension__ extern __inline int16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubh_s16 (int16_t __a, int16_t __b) { - return (int16_t) __builtin_aarch64_sqsubhi (__a, __b); + return (int16_t) __builtin_aarch64_sssubhi (__a, __b); } __extension__ extern __inline int32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubs_s32 (int32_t __a, int32_t __b) { - return (int32_t) __builtin_aarch64_sqsubsi (__a, __b); + return (int32_t) __builtin_aarch64_sssubsi (__a, __b); } __extension__ extern __inline int64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubd_s64 (int64_t __a, int64_t __b) { - return __builtin_aarch64_sqsubdi (__a, __b); + return __builtin_aarch64_sssubdi (__a, __b); } __extension__ extern __inline uint8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubb_u8 (uint8_t __a, uint8_t __b) { - return (uint8_t) __builtin_aarch64_uqsubqi_uuu (__a, __b); + return (uint8_t) __builtin_aarch64_ussubqi_uuu (__a, __b); } __extension__ extern __inline uint16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubh_u16 (uint16_t __a, uint16_t __b) { - return (uint16_t) __builtin_aarch64_uqsubhi_uuu (__a, __b); + return (uint16_t) __builtin_aarch64_ussubhi_uuu (__a, __b); } __extension__ extern __inline uint32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubs_u32 (uint32_t __a, uint32_t __b) { - return (uint32_t) __builtin_aarch64_uqsubsi_uuu (__a, __b); + return (uint32_t) __builtin_aarch64_ussubsi_uuu (__a, __b); } __extension__ extern __inline uint64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vqsubd_u64 (uint64_t __a, uint64_t __b) { - return __builtin_aarch64_uqsubdi_uuu (__a, __b); + return __builtin_aarch64_ussubdi_uuu (__a, __b); } /* vqtbl2 */ diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index efba78375c2..9d239179c07 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -93,6 +93,10 @@ ;; integer modes; 64-bit scalar integer mode. (define_mode_iterator VSDQ_I_DI [V8QI V16QI V4HI V8HI V2SI V4SI V2DI DI]) +;; Advanced SIMD and scalar, 64 & 128-bit container; 8 and 16-bit scalar +;; integer modes. +(define_mode_iterator VSDQ_I_QI_HI [V8QI V16QI V4HI V8HI V2SI V4SI V2DI HI QI]) + ;; Double vector modes. (define_mode_iterator VD [V8QI V4HI V4HF V2SI V2SF V4BF]) diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc new file mode 100644 index 00000000000..1fadfd58755 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc @@ -0,0 +1,58 @@ +/* Template file for vector saturating arithmetic validation. + + This file defines saturating addition and subtraction functions for a given + scalar type, testing the auto-vectorization of these two operators. This + type, along with the corresponding minimum and maximum values for that type, + must be defined by any test file which includes this template file. */ + +#ifndef SAT_ARIT_AUTOVEC_INC +#define SAT_ARIT_AUTOVEC_INC + +#include +#include + +#ifndef UT +#define UT unsigned int +#define VT uint32x4_t +#define UMAX UINT_MAX +#define UMIN 0 +#endif + + +UT uadd_lane (UT a, VT b) +{ + UT sum = a + b[0]; + return sum < a ? UMAX : sum; +} + +void uaddq (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum = a[i] + b[i]; + out[i] = sum < a[i] ? UMAX : sum; + } +} + +void uaddq2 (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum; + if (!__builtin_add_overflow(a[i], b[i], &sum)) + out[i] = sum; + else + out[i] = UMAX; + } +} + +void usubq (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum = a[i] - b[i]; + out[i] = sum > a[i] ? UMIN : sum; + } +} + +#endif \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c new file mode 100644 index 00000000000..63eb21e438b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c @@ -0,0 +1,79 @@ +/* { dg-do assemble { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +/* +** uadd_lane: { xfail *-*-* } +** dup\tv([0-9]+).8b, w0 +** uqadd\tb([0-9]+), b\1, b0 +** umov\tw0, v\2.b\[0] +** ret +*/ +/* +** uaddq: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.16b, v\1.16b, v\2.16b +** ... +** ldr\td([0-9]+), .* +** ldr\td([0-9]+), .* +** uqadd\tv\4.8b, v\3.8b, v\4.8b +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqadd\tb\6, b\5, b\6 +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqadd\tb\8, b\7, b\8 +** ... +*/ +/* +** uaddq2: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.16b, v\1.16b, v\2.16b +** ... +** ldr\td([0-9]+), .* +** ldr\td([0-9]+), .* +** uqadd\tv\4.8b, v\3.8b, v\4.8b +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqadd\tb\6, b\5, b\6 +** ... +** uqadd\tb([0-9]+), b([0-9]+), b\7 +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqsub\tv\2.16b, v\1.16b, v\2.16b +** ... +** ldr\td([0-9]+), .* +** ldr\td([0-9]+), .* +** uqsub\tv\4.8b, v\3.8b, v\4.8b +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqsub\tb\6, b\5, b\6 +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqsub\tb\8, b\7, b\8 +** ... +*/ + +#include +#include + +#define UT unsigned char +#define VT uint8x8_t +#define UMAX UCHAR_MAX +#define UMIN 0 + +#include "saturating_arithmetic_autovect.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c new file mode 100644 index 00000000000..8e74a8a8db2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c @@ -0,0 +1,79 @@ +/* { dg-do assemble { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +/* +** uadd_lane: { xfail *-*-* } +** dup\tv([0-9]+).4h, w0 +** uqadd\th([0-9]+), h\1, h0 +** umov\tw0, v\2.h\[0] +** ret +*/ +/* +** uaddq: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.8h, v\1.8h, v\2.8h +** ... +** ldr\td([0-9]+), .* +** ldr\td([0-9]+), .* +** uqadd\tv\4.4h, v\3.4h, v\4.4h +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** uqadd\th\6, h\5, h\6 +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** uqadd\th\8, h\7, h\8 +** ... +*/ +/* +** uaddq2: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.8h, v\1.8h, v\2.8h +** ... +** ldr\td([0-9]+), .* +** ldr\td([0-9]+), .* +** uqadd\tv\4.4h, v\3.4h, v\4.4h +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** uqadd\th\6, h\5, h\6 +** ... +** uqadd\th([0-9]+), h([0-9]+), h\7 +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqsub\tv\2.8h, v\1.8h, v\2.8h +** ... +** ldr\td([0-9]+), .* +** ldr\td([0-9]+), .* +** uqsub\tv\4.4h, v\3.4h, v\4.4h +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** uqsub\th\6, h\5, h\6 +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** uqsub\th\8, h\7, h\8 +** ... +*/ + +#include +#include + +#define UT unsigned short +#define VT uint16x4_t +#define UMAX USHRT_MAX +#define UMIN 0 + +#include "saturating_arithmetic_autovect.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c new file mode 100644 index 00000000000..a38c0e9a387 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c @@ -0,0 +1,75 @@ +/* { dg-do assemble { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +/* +** uadd_lane: +** fmov\tw([0-9]+), s0 +** adds\tw([0-9]+), (?:w\1, w0|w0, w\1) +** csinv\tw0, w\2, wzr, cc +** ret +*/ +/* +** uaddq: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.4s, v\1.4s, v\2.4s +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** adds\tw\3, w\3, w\4 +** csinv\tw\3, w\3, wzr, cc +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** adds\tw\5, w\5, w\6 +** csinv\tw\5, w\5, wzr, cc +** ... +*/ +/* +** uaddq2: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.4s, v\1.4s, v\2.4s +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** adds\tw\3, w\3, w\4 +** csinv\tw\3, w\3, wzr, cc +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** adds\tw\5, w\5, w\6 +** csinv\tw\5, w\5, wzr, cc +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqsub\tv\2.4s, v\1.4s, v\2.4s +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** subs\tw\3, w\3, w\4 +** csel\tw\3, w\3, wzr, cs +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** subs\tw\5, w\5, w\6 +** csel\tw\5, w\5, wzr, cs +** ... +*/ + +#include +#include + +#define UT unsigned int +#define VT uint32x2_t +#define UMAX UINT_MAX +#define UMIN 0 + +#include "saturating_arithmetic_autovect.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c new file mode 100644 index 00000000000..a56e7461e32 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c @@ -0,0 +1,77 @@ +/* { dg-do assemble { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +/* +** uadd_lane: +** ... +** (?:fmov|ldr)\tx([0-9]+), .* +** ... +** adds\tx([0-9]+), (?:x\1, x0|x0, x\1) +** csinv\tx0, x\2, xzr, cc +** ret +*/ +/* +** uaddq: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.2d, v\1.2d, v\2.2d +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** adds\tx\3, x\3, x\4 +** csinv\tx\3, x\3, xzr, cc +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** adds\tx\5, x\5, x\6 +** csinv\tx\5, x\5, xzr, cc +** ... +*/ +/* +** uaddq2: +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqadd\tv\2.2d, v\1.2d, v\2.2d +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** adds\tx\3, x\3, x\4 +** csinv\tx\3, x\3, xzr, cc +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** adds\tx\5, x\5, x\6 +** csinv\tx\5, x\5, xzr, cc +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ldr\tq([0-9]+), .* +** ldr\tq([0-9]+), .* +** uqsub\tv\2.2d, v\1.2d, v\2.2d +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** subs\tx\3, x\3, x\4 +** csel\tx\3, x\3, xzr, cs +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** subs\tx\5, x\5, x\6 +** csel\tx\5, x\5, xzr, cs +** ... +*/ + +#include +#include + +#define UT unsigned long +#define VT uint64x2_t +#define UMAX ULONG_MAX +#define UMIN 0 + +#include "saturating_arithmetic_autovect.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic.inc b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic.inc new file mode 100644 index 00000000000..e979d535405 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic.inc @@ -0,0 +1,39 @@ +/* Template file for scalar saturating arithmetic validation. + + This file defines scalar saturating addition and subtraction functions for a + given type. This type, along with the corresponding minimum and maximum + values for that type, must be defined by any test file which includes this + template file. */ + +#ifndef SAT_ARIT_INC +#define SAT_ARIT_INC + +#include + +#ifndef UT +#define UT unsigned int +#define UMAX UINT_MAX +#define UMIN 0 +#endif + +UT uadd (UT a, UT b) +{ + UT sum = a + b; + return sum < a ? UMAX : sum; +} + +UT uadd2 (UT a, UT b) +{ + UT c; + if (!__builtin_add_overflow(a, b, &c)) + return c; + return UMAX; +} + +UT usub (UT a, UT b) +{ + UT sum = a - b; + return sum > a ? UMIN : sum; +} + +#endif \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_1.c b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_1.c new file mode 100644 index 00000000000..9dc9a9e2211 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_1.c @@ -0,0 +1,41 @@ +/* { dg-do-compile } */ +/* { dg-options "-O2 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uadd: +** dup v([0-9]+).8b, w0 +** dup v([0-9]+).8b, w1 +** uqadd b\2, b\2, b\1 +** umov w0, v\2.b\[0\] +** ret +*/ +/* +** uadd2: +** dup v([0-9]+).8b, w0 +** dup v([0-9]+).8b, w1 +** uqadd b\2, b\2, b\1 +** umov w0, v\2.b\[0\] +** ret +*/ +/* +** usub: { xfail *-*-* } +** dup v([0-9]+).8b, w0 +** dup v([0-9]+).8b, w1 +** ( +** uqsub b\2, (?:b\2, b\1|b\1. b\2) +** umov w0, v\2.b\[0\] +** | +** uqsub b\1, (?:b\2, b\1|b\1. b\2) +** umov w0, v\1.b\[0\] +** ) +** ret +*/ + +#include + +#define UT unsigned char +#define UMAX UCHAR_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_2.c b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_2.c new file mode 100644 index 00000000000..aa4dee82765 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_2.c @@ -0,0 +1,41 @@ +/* { dg-do-compile } */ +/* { dg-options "-O2 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uadd: +** dup v([0-9]+).4h, w0 +** dup v([0-9]+).4h, w1 +** uqadd h\2, h\2, h\1 +** umov w0, v\2.h\[0\] +** ret +*/ +/* +** uadd2: +** dup v([0-9]+).4h, w0 +** dup v([0-9]+).4h, w1 +** uqadd h\2, h\2, h\1 +** umov w0, v\2.h\[0\] +** ret +*/ +/* +** usub: { xfail *-*-* } +** dup v([0-9]+).4h, w0 +** dup v([0-9]+).4h, w1 +** ( +** uqsub h\2, (?:h\2, h\1|h\1. h\2) +** umov w0, v\2.h\[0\] +** | +** uqsub h\1, (?:h\2, h\1|h\1. h\2) +** umov w0, v\1.h\[0\] +** ) +** ret +*/ + +#include + +#define UT unsigned short +#define UMAX USHRT_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_3.c b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_3.c new file mode 100644 index 00000000000..21517254519 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_3.c @@ -0,0 +1,30 @@ +/* { dg-do compile { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uadd: +** adds\tw([0-9]+), w([0-9]+), w([0-9]+) +** csinv\tw\1, w\1, wzr, cc +** ret +*/ +/* +** uadd2: +** adds\tw([0-9]+), w([0-9]+), w([0-9]+) +** csinv\tw\1, w\1, wzr, cc +** ret +*/ +/* +** usub: +** subs\tw([0-9]+), w([0-9]+), w([0-9]+) +** csel\tw\1, w\1, wzr, cs +** ret +*/ + +#include + +#define UT unsigned int +#define UMAX UINT_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_4.c b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_4.c new file mode 100644 index 00000000000..363d0a79a73 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/saturating_arithmetic_4.c @@ -0,0 +1,30 @@ +/* { dg-do compile { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uadd: +** adds\tx([0-9]+), x([0-9]+), x([0-9]+) +** csinv\tx\1, x\1, xzr, cc +** ret +*/ +/* +** uadd2: +** adds\tx([0-9]+), x([0-9]+), x([0-9]+) +** csinv\tx\1, x\1, xzr, cc +** ret +*/ +/* +** usub: +** subs\tx([0-9]+), x([0-9]+), x([0-9]+) +** csel\tx\1, x\1, xzr, cs +** ret +*/ + +#include + +#define UT unsigned long +#define UMAX ULONG_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file From patchwork Fri Oct 18 15:12:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Akram Ahmad X-Patchwork-Id: 1999265 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=fFamOBZF; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=fFamOBZF; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XVSs45g1Dz1xw2 for ; Sat, 19 Oct 2024 02:13:44 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E8A8E385828B for ; Fri, 18 Oct 2024 15:13:42 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-DBA-obe.outbound.protection.outlook.com (mail-dbaeur03on20627.outbound.protection.outlook.com [IPv6:2a01:111:f403:260d::627]) by sourceware.org (Postfix) with ESMTPS id C3F4F3858D37 for ; Fri, 18 Oct 2024 15:13:11 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C3F4F3858D37 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C3F4F3858D37 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:260d::627 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1729264396; cv=pass; b=Q/0yDKBAwcELjHg4ROokfvkpFGUvU+dX8bov7u0et8gA0FwOucoI+K7gAQHM/WXEkyicInNmHdQ9AU4WcP/G6CVkkBe6rbrercIydg0aLslKmqVqzOnZvZ6545PUk6dhKh9wBo2GY/vP5yogF8SjNNg7JEzviMXl0TQSJKx0jEU= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1729264396; c=relaxed/simple; bh=wpjELa5cP8wy95CaNI244GAOqzA0+PuRYJ9Wo+sFgwU=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=S4EQsiTGMiF5ezL+hymk2c+WQ+uTmcIwdIDs/m/QyIL/ydplmFDh6BkwvBFrkPKeaFVHVqH52B5Ds4e8IJchMuQy78fOFLcGOkCrPqEOvUMdNldiIwfitYtq6GdaE0mPmaCY1pXwYBjFfzwX00A0xlBSEyBrcVqsQfB4WvlVYxc= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=PV5VHovXIN/xnbusAtXjRkNGjD6IEgdNd3aPlDoAG3kzQToha6ZaLp05exj9JK4f7xvwjFPOoDVwmr19wHrDz1jGIgo1em8p+KFEG6fMqjX5Qr28Kjmhy427qFDsxs+mSv/Q/A/8XCgpqcH+QvqyVI6uzsH2tR8JH/PXJcgloJEHfsYsGfCiCtVc8RURc72gQIYedjtGeNwasZlYH2usAFHjTQyVMPnxLwHAUkERAs2dki+LFwecL6Fs0HdJJAyjBl3m+ebr7RoSI9PYbT20OTSQ/s0svG0nWKeOIh9btsJQL2eUNzGTV0jLvCE3/jeZA8rTk7fLXPFnItu2WF9SPQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QDiPXZDWdKXhm9mT2X0BFLwQ8h8yrOEoWZ4tPgS1UiU=; b=t19pqZCIsUR/iDLup8k48AoYNuBk2o7gU5WxP+zGRakbXzQuvWC3yOwQq4LspYtwluwA9iL3rEAf1j9+/zIEE7qxZ/BJpL5JsYLw82stZ9KJnT7+/4rI1vHHnwfAxGoacXzzlGgtpFJMCtmxcCCwfs7hRDqn3JXUZk2UZs/Q8FhWYOGuNWhSiMVmY9o28p1AuHRo1F9i/pqzTqjdr4gzlGjHHyf359kSCM1MY02haJ3lg5O2TI9gRhYz+vLm33dwSh1IBksTrev7lQrf1NJuKD+4PefocWxoRnN5L5hGLUVQHo30UgWmWviRS/w5LYYI5xbE565L/HIwTb1N76RQ8Q== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QDiPXZDWdKXhm9mT2X0BFLwQ8h8yrOEoWZ4tPgS1UiU=; b=fFamOBZF23WQSpxsGBl3vxWInrgJHxPX6dbszzSdHi4bwWZUHHIsj50EyUorBza3gB+EtERVX1svWPbNN+c/o59CrsX5mbwtOvpwwgwWA/op/NU742tFE9F+lgJ80ScCpMgCWQul9/dzDySVdTb3AI5V9KDQz+C6lX5aUim61DU= Received: from DB8PR04CA0012.eurprd04.prod.outlook.com (2603:10a6:10:110::22) by PR3PR08MB5866.eurprd08.prod.outlook.com (2603:10a6:102:85::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.24; Fri, 18 Oct 2024 15:13:07 +0000 Received: from DU6PEPF00009526.eurprd02.prod.outlook.com (2603:10a6:10:110:cafe::e7) by DB8PR04CA0012.outlook.office365.com (2603:10a6:10:110::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.21 via Frontend Transport; Fri, 18 Oct 2024 15:13:07 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DU6PEPF00009526.mail.protection.outlook.com (10.167.8.7) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8069.17 via Frontend Transport; Fri, 18 Oct 2024 15:13:07 +0000 Received: ("Tessian outbound cd6aa7fa963a:v473"); Fri, 18 Oct 2024 15:13:06 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 070d92e60f841cc4 X-TessianGatewayMetadata: 3le+QscVpLYjl2vsBO6K6O+FwFytsI27XQ88gtd7y1ReQyGSG2Dp/dfMlSa1+oeX7Ga8O03V9q0QbPr+mqGPeia4vjfdUhMO63qzzDuLxx5m+h/nzN9a36vHvTVLn7rble2yjGYY8JbveR6+GfgVTXnenfE0HQUWo3z+8XiQxh8= X-CR-MTA-TID: 64aa7808 Received: from L9b6eff03156f.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 34454EBD-7635-4307-9968-36A45C6AC730.1; Fri, 18 Oct 2024 15:13:00 +0000 Received: from EUR03-AM7-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id L9b6eff03156f.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 18 Oct 2024 15:13:00 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=E6sQ2RjIkJKCKtAJBOMoPh2131LEHObMscc0/8KOPx+zEUsRYthKAquubdyGcEFE7plHh34o1YMiw2wxzxPDk5AgUjyULsXHeRMOtgvxwH6cS3u8VSOG87G/O3p9EHqvTlpOMMAP9aam+x9+QYIMk5QWeE2DDeaJNogEwclhsX0iDPb0cFe92xUg7du5akz8ZVKS6tIV0TrESauBfMZ5USSADP6USYwTAP/VcJSk3CSYCrjqY9L1aVunof/AYilKmxZ4mUYP8ruSpDdt8JapPBSjiTeCqZJPutyeuOaz3OWWHl8eWGWs3EcNUYqHSfG/pokMe48fWMzBAeYNuLoeOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QDiPXZDWdKXhm9mT2X0BFLwQ8h8yrOEoWZ4tPgS1UiU=; b=ZLSXiVmxdj1NFtnV0iojjjsmddCw08LQnudIrQDLiyGKYiTeLtL/lJOnjte4a5UIIH3U7990iW5yxWTb1Q7GdaZMFEDp6ZSwc8z/4/Zj/eVYF9aksRWjcmCLD7YB2D/8Tw5z/B1O6E1piREpbcA7woqKA3q7ykx/WDBiGxs50i0Y8a7eWcOqeTzPDMb+MrSaZPS3ZDrt4dj0k2EjWQfBJDSRRIDArwffwQYeAtXytve6JPFiEkvKB49103apgki0BsEa+ZgPa+/qW8dUD6JPiN3gHiXZDM6ibou9qxgNBR7EA4iq06rxXRz+/PgD5BYaNCZT5QYkiStEbT1mEPYaOg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QDiPXZDWdKXhm9mT2X0BFLwQ8h8yrOEoWZ4tPgS1UiU=; b=fFamOBZF23WQSpxsGBl3vxWInrgJHxPX6dbszzSdHi4bwWZUHHIsj50EyUorBza3gB+EtERVX1svWPbNN+c/o59CrsX5mbwtOvpwwgwWA/op/NU742tFE9F+lgJ80ScCpMgCWQul9/dzDySVdTb3AI5V9KDQz+C6lX5aUim61DU= Received: from AS4P189CA0067.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:659::25) by AS8PR08MB9043.eurprd08.prod.outlook.com (2603:10a6:20b:5c1::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.18; Fri, 18 Oct 2024 15:12:54 +0000 Received: from AMS0EPF000001A5.eurprd05.prod.outlook.com (2603:10a6:20b:659:cafe::28) by AS4P189CA0067.outlook.office365.com (2603:10a6:20b:659::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8069.24 via Frontend Transport; Fri, 18 Oct 2024 15:12:54 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by AMS0EPF000001A5.mail.protection.outlook.com (10.167.16.232) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8069.17 via Frontend Transport; Fri, 18 Oct 2024 15:12:54 +0000 Received: from AZ-NEU-EXJ01.Arm.com (10.240.25.132) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Fri, 18 Oct 2024 15:12:53 +0000 Received: from AZ-NEU-EX04.Arm.com (10.251.24.32) by AZ-NEU-EXJ01.Arm.com (10.240.25.132) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Fri, 18 Oct 2024 15:12:53 +0000 Received: from ip-10-248-139-139.eu-west-1.compute.internal (10.252.78.54) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39 via Frontend Transport; Fri, 18 Oct 2024 15:12:53 +0000 From: Akram Ahmad To: CC: Akram Ahmad Subject: [PATCH 2/2] aarch64: Use standard names for SVE saturating arithmetic Date: Fri, 18 Oct 2024 15:12:19 +0000 Message-ID: <20241018151219.308512-3-Akram.Ahmad@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241018151219.308512-1-Akram.Ahmad@arm.com> References: <20241018151219.308512-1-Akram.Ahmad@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: AMS0EPF000001A5:EE_|AS8PR08MB9043:EE_|DU6PEPF00009526:EE_|PR3PR08MB5866:EE_ X-MS-Office365-Filtering-Correlation-Id: 17a7248f-733c-4782-0044-08dcef875edb x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; ARA:13230040|36860700013|376014|30052699003|82310400026|1800799024; X-Microsoft-Antispam-Message-Info-Original: qrYTA9M76Xwmss1af4Qx3lGYh7/ZtoIN16Qcoz+SNAa1PxlqQ2d+OefiHsN2dlsqXyEItbj4lH6kcNP9ekQe11yk3o1mx1xz1wAmCYzB8Npu3hEKX3yx6v93MEjWSpIbmslea0j8CKMONoiQeCqTLpKWZ7sKMx96gTxTDHdPgSjJroImruo66EIzwo+wATmR+dK0+I+LKpUEckJ3zyXoHTbcqCtUNTOJUbu3qUzITFjQV3CRgeLOWuOj2ZJ4sIbhAsXs2sQn1Fq5jpEck1GO4OJ3T4uWV4OhLraPxz32B42TmTtpi+DlD/fQwEyFvWQUDOlFu+GK90CIHNtInAZLdqgSa9KZLcJocxO8cNjxKLwFwrG4x64bKlcdiGWN6D72GIyETsUto+vFubo+hikSC7yno8AWbtvucLK+dCHynV6NQE9gE0dajsjeaVsTdiMKO0t/p4PDlGgFvpHfIeLmUwfyVGk3bstGhQ9wJjlCoRD8hepElhWxmTBbVhb1cHLoUE2a7pz5H9ncCCvyr1cUABPo/fgs4vFk7FhMaLd6fXCzQo6j9KdYNtv99XwnK4jbG21FgeW+3+E4fP3jDLTFyV/dmUq9o0khkJOZ3K1anOOHUbgXbc7SIXiHMoQ59sd37pUAzrnvPbfjC87F3rY9G8k8Nq/X3EdIHSIW7bGfjcM+bjrBhS607/v8W/O7nWP4kUWCXENiSZ1SUcaqwLVr+9DPNkoLaEFvH3Z4m1X9Iwsy82EHBHCOP3L0ICO7RztjW2Cj9JTkkOXpqHdcigTckEB/aFpsctfONeNx0qpoRrkNGJ2dWxIjrrLrz8CLkXFnyx53bH4y3IYmT3N0ek39qLfy3eAc93pH3q7a9sbR2jfAC4rw2IT89VM+CgSYUXk+AWaX7JDb3I3KTV4W5ssxGjMjx6LIJwpCoNBi+h2nN6zpFPPZyl9iwQiplAb1EFPCF6qH8Ikz+tUy6RTEOV8u0ORKoKvhZ1qWpJQ90Q3DaSaOrSiFyTzyaf0ijiifVGxj3SCtNMzdU/C/Z4lEOuD6TWcDUEEAJJuKeswZ9kRsAellg6fAjpTwQtIyWLXXq+Q2IGpFSk3DqoMkpvUAbABpjTd45t88ZUvfa97mgq+Eq0MQORFuIOqD9/am1sKcZOiFoOWPtkYL1X0c4szIr9ouTD3/3zf0zH/JdrRImJewsB3pPAFiSqW3VSU4JB5acbPKwYtd8scZ7hJBxfP1Khk6LhuA7NvPJL9mCBLTi7EbdqvEjT5ldBaN0AiwCw+VkYR6+1fp0xYYRK8eOoDsZMY7I/gNdqjQbXg9D0BNZNWAWK4NALZF3CgLjaZhl5QPqD/7zI0EadJzdqGfixZlol22zcxNMgvL2M99FMnvth/kMJH9V5HZQz0Ahx+Po43eyhQAt3uFu7QRWWvULchP98au8Q== X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230040)(36860700013)(376014)(30052699003)(82310400026)(1800799024); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS8PR08MB9043 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:20b:659::25]; domain=AS4P189CA0067.EURP189.PROD.OUTLOOK.COM X-MS-Exchange-Transport-CrossTenantHeadersStripped: DU6PEPF00009526.eurprd02.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: b58e965f-7bf1-4041-5315-08dcef875744 X-Microsoft-Antispam: BCL:0; ARA:13230040|36860700013|376014|82310400026|1800799024|35042699022|30052699003; X-Microsoft-Antispam-Message-Info: xW0iLySgtMRHRU4UqcmWZBoz5MVyP2qShFj2w45d1cRBdZzgelZdLVc+413bgcL0vqie6M965qMu5WrGsBhso9adftsV4bJzaqPeI9VfeddEsU8xZZ9x4ZGok9b02eoOyTcfMKDwKpQYDtRr7m8yuXXMOvFm0ejsZdoNP2+BBkLGDIp5XUs4jB0wZ2FF4ICJBW1M4pi1ZhW1/TlWu1FOZ/dmjz7f2GntLLKHgjFeOOBh/ar3l+BdMP3Eoe+vZxwjLbmUJEyM4PFRL8ShsIsLXyJHiwg1PXCsa/NXbeZLiw+hdwKO+lo8K/tkjaQUXIqIvt0JpH9L0WeLtmU49xKmlNydG3l/oGh/WWyRC5xBK4O86tXdbH9/TknbxE7uVZFcz9oqnVjgNEgUfOevauDoe9ushSksHLlx2CRfbuLbYEFaGuc9+ePWmOmMETT2n9udZbiXvz9Bnfqe/Z1xufwT83u6a2GCphHejDw6XJvrC4dehQSO+B58b1e7Rt05Wby1r3hKqj3F/fKyuWh7g/vsWERY8O33sWrrN5DuoMFyofDTtb8IUOgMFPkNxhqCZswsdtP7IM2IS8ECfV+9qvTfQyjoT+b9TvyATmW5W63R0u7bNehtLXLQLNkRg0ZWIiMLajGDYLrOl5uD6Srrdg4wvQJLLk7ZQqsWw/7bdciyfAL1Svt4+m7VKzA4oZZ1c9h3Q8b8afqaVs0IYKZ9izZ5Pgi1G4fmrjprydB9zEbOWAlhYAAgeV9yuYgAypTu0ME/GW6SFCimsbUzigx2oufn+V2cfVl4uR7l220Hr1Pc+hgoBj2TCET8QzpAKBjG6pzEjW9ER+Zy6el0K/eER9DsBuTcu5POO7qsPCSPr1lA4oG8w4d5r4fd7hVKYonz4Px5SCS9EXB5yWqfwHOdynwLBmah+5vARKF9rwcN7k63WUWHUiYtlAQdvh6RSGKj5JrHb9yXAZ4wEozlA+RRF1b9nD+nCptAxzp/Q+h3i4KwCbamINtUW2RWVTo6jgZqVCA/xkPLytN+PMY0tmIcmd6X7FpC51SGNp1Q0cN0QrxqQ2tTtdlEis57zppXwkeyh62F+sgeUqur58WQvDgq+e7W+RMfINqkieM6NVJkf7oPHWfSkgX910mhAPc5YS8QkzcfkdD+PGihAROJUsmh0BwuPF7Z6Wja9we7/nh2351Yj262z9UkArZcZlNa7/Io0ud/zzK3kbaWP0Vac5Xt3cQnCcKeFeKPpoxCZMlXiQgqlesgQEZUMgLBe02UhSj2VzPrcCIQB+nkbNRPyQAphn60KifqFBW75gC6sW6URCcfbu8TgHdJ7h62KHapXZ947g75Jfq6xJhEvOjQ/rWQMfjH3DkOk5I5vGtZZGqC5BFBAGQL2OzgMrehiDAR1xClTCEackGY0dBrdxPaZiJuLlG3Sw== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(36860700013)(376014)(82310400026)(1800799024)(35042699022)(30052699003); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Oct 2024 15:13:07.0801 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 17a7248f-733c-4782-0044-08dcef875edb X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DU6PEPF00009526.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PR3PR08MB5866 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Rename the existing SVE unpredicated saturating arithmetic instructions to use standard names which are used by IFN_SAT_ADD and IFN_SAT_SUB. gcc/ChangeLog: * config/aarch64/aarch64-sve.md: Rename insns gcc/testsuite/ChangeLog: * gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc: Template file for auto-vectorizer tests. * gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c: Instantiate 8-bit vector tests. * gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_2.c: Instantiate 16-bit vector tests. * gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_3.c: Instantiate 32-bit vector tests. * gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_4.c: Instantiate 64-bit vector tests. --- gcc/config/aarch64/aarch64-sve.md | 4 +- .../aarch64/sve/saturating_arithmetic.inc | 68 +++++++++++++++++++ .../aarch64/sve/saturating_arithmetic_1.c | 60 ++++++++++++++++ .../aarch64/sve/saturating_arithmetic_2.c | 60 ++++++++++++++++ .../aarch64/sve/saturating_arithmetic_3.c | 62 +++++++++++++++++ .../aarch64/sve/saturating_arithmetic_4.c | 62 +++++++++++++++++ 6 files changed, 314 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_4.c diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 06bd3e4bb2c..b987b292b20 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -4379,7 +4379,7 @@ ;; ------------------------------------------------------------------------- ;; Unpredicated saturating signed addition and subtraction. -(define_insn "@aarch64_sve_" +(define_insn "s3" [(set (match_operand:SVE_FULL_I 0 "register_operand") (SBINQOPS:SVE_FULL_I (match_operand:SVE_FULL_I 1 "register_operand") @@ -4395,7 +4395,7 @@ ) ;; Unpredicated saturating unsigned addition and subtraction. -(define_insn "@aarch64_sve_" +(define_insn "s3" [(set (match_operand:SVE_FULL_I 0 "register_operand") (UBINQOPS:SVE_FULL_I (match_operand:SVE_FULL_I 1 "register_operand") diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc new file mode 100644 index 00000000000..0b3ebbcb0d6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc @@ -0,0 +1,68 @@ +/* Template file for vector saturating arithmetic validation. + + This file defines saturating addition and subtraction functions for a given + scalar type, testing the auto-vectorization of these two operators. This + type, along with the corresponding minimum and maximum values for that type, + must be defined by any test file which includes this template file. */ + +#ifndef SAT_ARIT_AUTOVEC_INC +#define SAT_ARIT_AUTOVEC_INC + +#include +#include + +#ifndef UT +#define UT uint32_t +#define UMAX UINT_MAX +#define UMIN 0 +#endif + +void uaddq (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum = a[i] + b[i]; + out[i] = sum < a[i] ? UMAX : sum; + } +} + +void uaddq2 (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum; + if (!__builtin_add_overflow(a[i], b[i], &sum)) + out[i] = sum; + else + out[i] = UMAX; + } +} + +void uaddq_imm (UT *out, UT *a, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum = a[i] + 50; + out[i] = sum < a[i] ? UMAX : sum; + } +} + +void usubq (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum = a[i] - b[i]; + out[i] = sum > a[i] ? UMIN : sum; + } +} + +void usubq_imm (UT *out, UT *a, int n) +{ + for (int i = 0; i < n; i++) + { + UT sum = a[i] - 50; + out[i] = sum > a[i] ? UMIN : sum; + } +} + +#endif \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c new file mode 100644 index 00000000000..6936e9a2704 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c @@ -0,0 +1,60 @@ +/* { dg-do compile { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uaddq: +** ... +** ld1b\tz([0-9]+)\.b, .* +** ld1b\tz([0-9]+)\.b, .* +** uqadd\tz\2.b, z\1\.b, z\2\.b +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqadd\tb\4, b\3, b\4 +** ... +*/ +/* +** uaddq2: +** ... +** ld1b\tz([0-9]+)\.b, .* +** ld1b\tz([0-9]+)\.b, .* +** uqadd\tz\2.b, z\1\.b, z\2\.b +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqadd\tb\4, b\3, b\4 +** ... +*/ +/* +** uaddq_imm: +** ... +** ld1b\tz([0-9]+)\.b, .* +** uqadd\tz\1.b, z\1\.b, #50 +** ... +** movi\tv([0-9]+)\.8b, 0x32 +** ... +** ldr\tb([0-9]+), .* +** uqadd\tb\3, b\3, b\2 +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ld1b\tz([0-9]+)\.b, .* +** ld1b\tz([0-9]+)\.b, .* +** uqsub\tz\2.b, z\1\.b, z\2\.b +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqsub\tb\4, b\3, b\4 +** ... +*/ + +#include + +#define UT unsigned char +#define UMAX UCHAR_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_2.c b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_2.c new file mode 100644 index 00000000000..928bc0054df --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_2.c @@ -0,0 +1,60 @@ +/* { dg-do compile { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uaddq: +** ... +** ld1h\tz([0-9]+)\.h, .* +** ld1h\tz([0-9]+)\.h, .* +** uqadd\tz\2.h, z\1\.h, z\2\.h +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** uqadd\th\4, h\3, h\4 +** ... +*/ +/* +** uaddq2: +** ... +** ld1h\tz([0-9]+)\.h, .* +** ld1h\tz([0-9]+)\.h, .* +** uqadd\tz\2.h, z\1\.h, z\2\.h +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** uqadd\th\4, h\3, h\4 +** ... +*/ +/* +** uaddq_imm: +** ... +** ld1h\tz([0-9]+)\.h, .* +** uqadd\tz\1.h, z\1\.h, #50 +** ... +** movi\tv([0-9]+)\.4h, 0x32 +** ... +** ldr\th([0-9]+), .* +** uqadd\th\3, h\3, h\2 +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ld1h\tz([0-9]+)\.h, .* +** ld1h\tz([0-9]+)\.h, .* +** usubq\tz\2.h, z\1\.h, z\2\.h +** ... +** ldr\th([0-9]+), .* +** ldr\th([0-9]+), .* +** usubq\th\4, h\3, h\4 +** ... +*/ + +#include + +#define UT unsigned short +#define UMAX USHRT_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_3.c b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_3.c new file mode 100644 index 00000000000..14e2de59b1e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_3.c @@ -0,0 +1,62 @@ +/* { dg-do compile { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uaddq: +** ... +** ld1w\tz([0-9]+)\.s, .* +** ld1w\tz([0-9]+)\.s, .* +** uqadd\tz\2.s, z\1\.s, z\2\.s +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** adds\tw\3, w\3, w\4 +** csinv\tw\3, w\3, wzr, cc +** ... +*/ +/* +** uaddq2: +** ... +** ld1w\tz([0-9]+)\.s, .* +** ld1w\tz([0-9]+)\.s, .* +** uqadd\tz\2.s, z\1\.s, z\2\.s +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** adds\tw\3, w\3, w\4 +** csinv\tw\3, w\3, wzr, cc +** ... +*/ +/* +** uaddq_imm: +** ... +** ld1w\tz([0-9]+)\.s, .* +** uqadd\tz\1.s, z\1\.s, #50 +** ... +** ldr\tw([0-9]+), .* +** adds\tw\2, w\2, #50 +** csinv\tw\2, w\2, wzr, cc +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ld1w\tz([0-9]+)\.s, .* +** ld1w\tz([0-9]+)\.s, .* +** uqsub\tz\2.s, z\1\.s, z\2\.s +** ... +** ldr\tw([0-9]+), .* +** ldr\tw([0-9]+), .* +** subs\tw\3, w\3, w\4 +** csel\tw\3, w\3, wzr, cs +** ... +*/ + +#include + +#define UT unsigned int +#define UMAX UINT_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_4.c b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_4.c new file mode 100644 index 00000000000..05a5786b4ab --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_4.c @@ -0,0 +1,62 @@ +/* { dg-do compile { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uaddq: +** ... +** ld1d\tz([0-9]+)\.d, .* +** ld1d\tz([0-9]+)\.d, .* +** uqadd\tz\2.d, z\1\.d, z\2\.d +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** adds\tx\3, x\3, x\4 +** csinv\tx\3, x\3, xzr, cc +** ... +*/ +/* +** uaddq2: +** ... +** ld1d\tz([0-9]+)\.d, .* +** ld1d\tz([0-9]+)\.d, .* +** uqadd\tz\2.d, z\1\.d, z\2\.d +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** adds\tx\3, x\3, x\4 +** csinv\tx\3, x\3, xzr, cc +** ... +*/ +/* +** uaddq_imm: +** ... +** ld1d\tz([0-9]+)\.d, .* +** uqadd\tz\1.d, z\1\.d, #50 +** ... +** ldr\tx([0-9]+), .* +** adds\tx\2, x\2, #50 +** csinv\tx\2, x\2, xzr, cc +** ... +*/ +/* +** usubq: { xfail *-*-* } +** ... +** ld1d\tz([0-9]+)\.d, .* +** ld1d\tz([0-9]+)\.d, .* +** uqsub\tz\2.d, z\1\.d, z\2\.d +** ... +** ldr\tx([0-9]+), .* +** ldr\tx([0-9]+), .* +** subs\tx\3, x\3, x\4 +** csel\tx\3, x\3, xzr, cs +** ... +*/ + +#include + +#define UT unsigned long +#define UMAX ULONG_MAX +#define UMIN 0 + +#include "saturating_arithmetic.inc" \ No newline at end of file