From patchwork Wed Jul 10 14:05:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Victor Do Nascimento
X-Patchwork-Id: 1958857
From: Victor Do Nascimento
Subject: [PATCH 05/10] i386: Fix dot_prod backend patterns for mmx and sse targets
Date: Wed, 10 Jul 2024 15:05:58 +0100
Message-ID: <20240710140602.1707875-6-victor.donascimento@arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240710140602.1707875-1-victor.donascimento@arm.com>
References: <20240710140602.1707875-1-victor.donascimento@arm.com>
X-BeenThere: gcc-patches@gcc.gnu.org
List-Id: Gcc-patches mailing list
Errors-To:
gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Following the migration of the dot_prod optab from a direct to a
conversion-type optab, ensure all back-end patterns incorporate the
second machine mode into pattern names.

gcc/ChangeLog:

	* config/i386/mmx.md (usdot_prodv8qi): Deleted.
	(usdot_prodv2siv8qi): New.
	(sdot_prodv8qi): Deleted.
	(sdot_prodv2siv8qi): New.
	(udot_prodv8qi): Deleted.
	(udot_prodv2siv8qi): New.
	(usdot_prodv4hi): Deleted.
	(usdot_prodv2siv4hi): New.
	(udot_prodv4hi): Deleted.
	(udot_prodv2siv4hi): New.
	(sdot_prodv4hi): Deleted.
	(sdot_prodv2siv4hi): New.
	* config/i386/sse.md (fourwayacc): New.
	(twowayacc): New.
	(sdot_prod): Deleted.
	(sdot_prod): New.
	(sdot_prodv4si): Deleted.
	(sdot_prodv2div4si): New.
	(usdot_prod): Deleted.
	(usdot_prod): New.
	(sdot_prod): Deleted.
	(sdot_prod): New.
	(sdot_prodv64qi): Deleted.
	(sdot_prodv16siv64qi): New.
	(udot_prod): Deleted.
	(udot_prod): New.
	(udot_prodv64qi): Deleted.
	(udot_prodv16siv64qi): New.
	(usdot_prod): Deleted.
	(usdot_prod): New.
	(udot_prod): Deleted.
	(udot_prod): New.
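
The renaming rule described above can be sketched outside the compiler. The
snippet below is only an illustration: `conversion_pattern_name` and
`direct_pattern_name` are hypothetical helpers, not GCC functions, and the
assumption that the accumulator (output) mode precedes the input mode is
inferred from the fixed-mode names in this patch (for example
sdot_prodv2siv8qi has a V2SI result and V8QI inputs).

```python
def direct_pattern_name(optab: str, input_mode: str) -> str:
    """Old scheme: the name carries only the input vector mode."""
    return f"{optab}{input_mode.lower()}"


def conversion_pattern_name(optab: str, acc_mode: str, input_mode: str) -> str:
    """New scheme (as inferred from this patch): accumulator mode, then input mode.

    Hypothetical helper for illustration only; GCC takes these names
    directly from the .md files rather than computing them like this.
    """
    return f"{optab}{acc_mode.lower()}{input_mode.lower()}"


# (optab, accumulator mode, input mode) triples taken from this patch.
EXAMPLES = [
    ("usdot_prod", "V2SI", "V8QI"),
    ("sdot_prod", "V2SI", "V4HI"),
    ("sdot_prod", "V2DI", "V4SI"),
    ("sdot_prod", "V16SI", "V64QI"),
]

for optab, acc, inp in EXAMPLES:
    old = direct_pattern_name(optab, inp)
    new = conversion_pattern_name(optab, acc, inp)
    print(f"{old} -> {new}")
```

Under that assumption, a name whose first mode does not match operand 0 of
the expander (such as a V16QI prefix on a pattern producing V16SI) would not
follow the scheme.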
---
 gcc/config/i386/mmx.md | 30 +++++++++++++--------------
 gcc/config/i386/sse.md | 47 +++++++++++++++++++++++++-----------------
 2 files changed, 43 insertions(+), 34 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 94d3a6e5692..d78739b033d 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -6344,7 +6344,7 @@ (define_expand "usadv8qi"
   DONE;
 })
 
-(define_expand "usdot_prodv8qi"
+(define_expand "usdot_prodv2siv8qi"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V8QI 1 "register_operand")
    (match_operand:V8QI 2 "register_operand")
@@ -6363,7 +6363,7 @@ (define_expand "usdot_prodv8qi"
       rtx op3 = lowpart_subreg (V4SImode, operands[3], V2SImode);
       rtx op0 = gen_reg_rtx (V4SImode);
 
-      emit_insn (gen_usdot_prodv16qi (op0, op1, op2, op3));
+      emit_insn (gen_usdot_prodv4siv16qi (op0, op1, op2, op3));
       emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
     }
   else
@@ -6377,7 +6377,7 @@ (define_expand "usdot_prodv8qi"
       emit_move_insn (op3, CONST0_RTX (V4SImode));
       emit_insn (gen_zero_extendv8qiv8hi2 (op1, operands[1]));
       emit_insn (gen_extendv8qiv8hi2 (op2, operands[2]));
-      emit_insn (gen_sdot_prodv8hi (op0, op1, op2, op3));
+      emit_insn (gen_sdot_prodv4siv8hi (op0, op1, op2, op3));
 
       /* vec_perm (op0, 2, 3, 0, 1);  */
       emit_insn (gen_sse2_pshufd (op0_1, op0, GEN_INT (78)));
@@ -6388,7 +6388,7 @@ (define_expand "usdot_prodv8qi"
   DONE;
 })
 
-(define_expand "sdot_prodv8qi"
+(define_expand "sdot_prodv2siv8qi"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V8QI 1 "register_operand")
    (match_operand:V8QI 2 "register_operand")
@@ -6406,7 +6406,7 @@ (define_expand "sdot_prodv8qi"
       rtx op3 = lowpart_subreg (V4SImode, operands[3], V2SImode);
       rtx op0 = gen_reg_rtx (V4SImode);
 
-      emit_insn (gen_sdot_prodv16qi (op0, op1, op2, op3));
+      emit_insn (gen_sdot_prodv4siv16qi (op0, op1, op2, op3));
       emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
     }
   else
@@ -6420,7 +6420,7 @@ (define_expand "sdot_prodv8qi"
       emit_move_insn (op3, CONST0_RTX (V4SImode));
       emit_insn (gen_extendv8qiv8hi2 (op1, operands[1]));
       emit_insn (gen_extendv8qiv8hi2 (op2, operands[2]));
-      emit_insn (gen_sdot_prodv8hi (op0, op1, op2, op3));
+      emit_insn (gen_sdot_prodv4siv8hi (op0, op1, op2, op3));
 
       /* vec_perm (op0, 2, 3, 0, 1);  */
       emit_insn (gen_sse2_pshufd (op0_1, op0, GEN_INT (78)));
@@ -6432,7 +6432,7 @@ (define_expand "sdot_prodv8qi"
 })
 
 
-(define_expand "udot_prodv8qi"
+(define_expand "udot_prodv2siv8qi"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V8QI 1 "register_operand")
    (match_operand:V8QI 2 "register_operand")
@@ -6450,7 +6450,7 @@ (define_expand "udot_prodv8qi"
       rtx op3 = lowpart_subreg (V4SImode, operands[3], V2SImode);
       rtx op0 = gen_reg_rtx (V4SImode);
 
-      emit_insn (gen_udot_prodv16qi (op0, op1, op2, op3));
+      emit_insn (gen_udot_prodv4siv16qi (op0, op1, op2, op3));
       emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
     }
   else
@@ -6464,7 +6464,7 @@ (define_expand "udot_prodv8qi"
       emit_move_insn (op3, CONST0_RTX (V4SImode));
       emit_insn (gen_zero_extendv8qiv8hi2 (op1, operands[1]));
       emit_insn (gen_zero_extendv8qiv8hi2 (op2, operands[2]));
-      emit_insn (gen_sdot_prodv8hi (op0, op1, op2, op3));
+      emit_insn (gen_sdot_prodv4siv8hi (op0, op1, op2, op3));
 
       /* vec_perm (op0, 2, 3, 0, 1);  */
       emit_insn (gen_sse2_pshufd (op0_1, op0, GEN_INT (78)));
@@ -6476,7 +6476,7 @@ (define_expand "udot_prodv8qi"
 })
 
 
-(define_expand "usdot_prodv4hi"
+(define_expand "usdot_prodv2siv4hi"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V4HI 1 "register_operand")
    (match_operand:V4HI 2 "register_operand")
@@ -6492,12 +6492,12 @@ (define_expand "usdot_prodv4hi"
   rtx op3 = lowpart_subreg (V4SImode, operands[3], V2SImode);
   rtx op0 = gen_reg_rtx (V4SImode);
 
-  emit_insn (gen_usdot_prodv8hi (op0, op1, op2, op3));
+  emit_insn (gen_usdot_prodv4siv8hi (op0, op1, op2, op3));
   emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
   DONE;
 })
 
-(define_expand "udot_prodv4hi"
+(define_expand "udot_prodv2siv4hi"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V4HI 1 "register_operand")
    (match_operand:V4HI 2 "register_operand")
@@ -6513,12 +6513,12 @@ (define_expand "udot_prodv4hi"
   rtx op3 = lowpart_subreg (V4SImode, operands[3], V2SImode);
   rtx op0 = gen_reg_rtx (V4SImode);
 
-  emit_insn (gen_udot_prodv8hi (op0, op1, op2, op3));
+  emit_insn (gen_udot_prodv4siv8hi (op0, op1, op2, op3));
   emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
   DONE;
 })
 
-(define_expand "sdot_prodv4hi"
+(define_expand "sdot_prodv2siv4hi"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V4HI 1 "register_operand")
    (match_operand:V4HI 2 "register_operand")
@@ -6534,7 +6534,7 @@ (define_expand "sdot_prodv4hi"
   rtx op3 = lowpart_subreg (V4SImode, operands[3], V2SImode);
   rtx op0 = gen_reg_rtx (V4SImode);
 
-  emit_insn (gen_sdot_prodv8hi (op0, op1, op2, op3));
+  emit_insn (gen_sdot_prodv4siv8hi (op0, op1, op2, op3));
   emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
   DONE;
 })
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index bda66d5e121..861b87bb50f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1195,6 +1195,15 @@ (define_mode_attr ssexmmmode
    (V16SF "V4SF") (V8SF "V4SF") (V4SF "V4SF")
    (V8DF "V2DF") (V4DF "V2DF") (V2DF "V2DF")])
 
+;; Mapping of input type to 4-way accumulated type
+(define_mode_attr fourwayacc
+  [(V64QI "v16si") (V32QI "v8si") (V16QI "v4si")])
+
+;; Mapping of input type to 2-way accumulated type
+(define_mode_attr twowayacc
+  [(V32HI "v16si") (V16HI "v8si") (V8HI "v4si")
+   (V32QI "v16hi") (V16QI "v8hi")])
+
 ;; Pointer size override for scalar modes (Intel asm dialect)
 (define_mode_attr iptr
   [(V64QI "b") (V32HI "w") (V16SI "k") (V8DI "q")
@@ -16712,7 +16721,7 @@ (define_mode_attr SDOT_PMADD_SUF
 (define_mode_attr SDOT_VPDP_SUF
   [(V32HI "v16si") (V16HI "v8si") (V8HI "v4si")])
 
-(define_expand "sdot_prod"
+(define_expand "sdot_prod"
   [(match_operand: 0 "register_operand")
    (match_operand:VI2_AVX512VNNIBW 1 "register_operand")
    (match_operand:VI2_AVX512VNNIBW 2 "register_operand")
@@ -16747,7 +16756,7 @@ (define_expand "sdot_prod"
 
 ;; Normally we use widen_mul_even/odd, but combine can't quite get it all
 ;; back together when madd is available.
-(define_expand "sdot_prodv4si"
+(define_expand "sdot_prodv2div4si"
   [(match_operand:V2DI 0 "register_operand")
    (match_operand:V4SI 1 "register_operand")
    (match_operand:V4SI 2 "register_operand")
@@ -30290,7 +30299,7 @@ (define_insn "vpshldv__maskz_1"
   [(set_attr ("prefix") ("evex"))
    (set_attr "mode" "")])
 
-(define_expand "usdot_prod"
+(define_expand "usdot_prod"
   [(match_operand: 0 "register_operand")
    (match_operand:VI1_AVX512 1 "register_operand")
    (match_operand:VI1_AVX512 2 "register_operand")
@@ -30328,9 +30337,9 @@ (define_expand "usdot_prod"
       rtx sum = gen_reg_rtx (mode);
       emit_move_insn (sum, CONST0_RTX (mode));
 
-      emit_insn (gen_sdot_prod (res1, op1_lo,
+      emit_insn (gen_sdot_prod (res1, op1_lo,
                                 op2_lo, sum));
-      emit_insn (gen_sdot_prod (res2, op1_hi,
+      emit_insn (gen_sdot_prod (res2, op1_hi,
                                 op2_hi, operands[3]));
       emit_insn (gen_add3 (operands[0], res1, res2));
     }
@@ -31149,7 +31158,7 @@ (define_int_attr vpdotprodtype
    (UNSPEC_VPDPBSUD "bsud") (UNSPEC_VPDPBSUDS "bsuds")
    (UNSPEC_VPDPBUUD "buud") (UNSPEC_VPDPBUUDS "buuds")])
 
-(define_expand "sdot_prod"
+(define_expand "sdot_prod"
   [(match_operand: 0 "register_operand")
    (match_operand:VI1_AVX2 1 "register_operand")
    (match_operand:VI1_AVX2 2 "register_operand")
@@ -31185,9 +31194,9 @@ (define_expand "sdot_prod"
       rtx sum = gen_reg_rtx (mode);
       emit_move_insn (sum, CONST0_RTX (mode));
 
-      emit_insn (gen_sdot_prod (res1, op1_lo,
+      emit_insn (gen_sdot_prod (res1, op1_lo,
                                 op2_lo, sum));
-      emit_insn (gen_sdot_prod (res2, op1_hi,
+      emit_insn (gen_sdot_prod (res2, op1_hi,
                                 op2_hi, operands[3]));
       emit_insn (gen_add3 (operands[0], res1, res2));
     }
@@ -31195,7 +31204,7 @@ (define_expand "sdot_prod"
   DONE;
 })
 
-(define_expand "sdot_prodv64qi"
+(define_expand "sdot_prodv16siv64qi"
   [(match_operand:V16SI 0 "register_operand")
    (match_operand:V64QI 1 "register_operand")
    (match_operand:V64QI 2 "register_operand")
@@ -31218,14 +31227,14 @@ (define_expand "sdot_prodv64qi"
   rtx sum = gen_reg_rtx (V16SImode);
   emit_move_insn (sum, CONST0_RTX (V16SImode));
 
-  emit_insn (gen_sdot_prodv32hi (res1, op1_lo, op2_lo, sum));
-  emit_insn (gen_sdot_prodv32hi (res2, op1_hi, op2_hi, operands[3]));
+  emit_insn (gen_sdot_prodv16siv32hi (res1, op1_lo, op2_lo, sum));
+  emit_insn (gen_sdot_prodv16siv32hi (res2, op1_hi, op2_hi, operands[3]));
   emit_insn (gen_addv16si3 (operands[0], res1, res2));
 
   DONE;
 })
 
-(define_expand "udot_prod"
+(define_expand "udot_prod"
   [(match_operand: 0 "register_operand")
    (match_operand:VI1_AVX2 1 "register_operand")
    (match_operand:VI1_AVX2 2 "register_operand")
@@ -31261,9 +31270,9 @@ (define_expand "udot_prod"
       rtx sum = gen_reg_rtx (mode);
       emit_move_insn (sum, CONST0_RTX (mode));
 
-      emit_insn (gen_sdot_prod (res1, op1_lo,
+      emit_insn (gen_sdot_prod (res1, op1_lo,
                                 op2_lo, sum));
-      emit_insn (gen_sdot_prod (res2, op1_hi,
+      emit_insn (gen_sdot_prod (res2, op1_hi,
                                 op2_hi, operands[3]));
       emit_insn (gen_add3 (operands[0], res1, res2));
     }
@@ -31271,7 +31280,7 @@ (define_expand "udot_prod"
   DONE;
 })
 
-(define_expand "udot_prodv64qi"
+(define_expand "udot_prodv16siv64qi"
   [(match_operand:V16SI 0 "register_operand")
    (match_operand:V64QI 1 "register_operand")
    (match_operand:V64QI 2 "register_operand")
@@ -31294,8 +31303,8 @@ (define_expand "udot_prodv64qi"
   rtx sum = gen_reg_rtx (V16SImode);
   emit_move_insn (sum, CONST0_RTX (V16SImode));
 
-  emit_insn (gen_sdot_prodv32hi (res1, op1_lo, op2_lo, sum));
-  emit_insn (gen_sdot_prodv32hi (res2, op1_hi, op2_hi, operands[3]));
+  emit_insn (gen_sdot_prodv16siv32hi (res1, op1_lo, op2_lo, sum));
+  emit_insn (gen_sdot_prodv16siv32hi (res2, op1_hi, op2_hi, operands[3]));
   emit_insn (gen_addv16si3 (operands[0], res1, res2));
 
   DONE;
@@ -31401,7 +31410,7 @@ (define_int_attr vpdpwprodtype
    (UNSPEC_VPDPWSUD "wsud") (UNSPEC_VPDPWSUDS "wsuds")
    (UNSPEC_VPDPWUUD "wuud") (UNSPEC_VPDPWUUDS "wuuds")])
 
-(define_expand "usdot_prod"
+(define_expand "usdot_prod"
   [(match_operand: 0 "register_operand")
    (match_operand:VI2_AVX2 1 "register_operand")
    (match_operand:VI2_AVX2 2 "register_operand")
@@ -31419,7 +31428,7 @@ (define_expand "usdot_prod"
   DONE;
 })
 
-(define_expand "udot_prod"
+(define_expand "udot_prod"
   [(match_operand: 0 "register_operand")
    (match_operand:VI2_AVX2 1 "register_operand")
    (match_operand:VI2_AVX2 2 "register_operand")