From patchwork Wed Sep 4 13:26:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980792 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=QFjRYciU; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNbQ31dZz1yfv for ; Wed, 4 Sep 2024 23:27:59 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7C883384F4BD for ; Wed, 4 Sep 2024 13:27:57 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc35.google.com (mail-oo1-xc35.google.com [IPv6:2607:f8b0:4864:20::c35]) by sourceware.org (Postfix) with ESMTPS id 6D1823858C98 for ; Wed, 4 Sep 2024 13:27:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6D1823858C98 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6D1823858C98 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c35 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456443; cv=none; b=eUgt7dS6pyiRl2BP/Jv+PI8epsteqhKgmI+jCsOqwaoarflsY0GRyC6VDQqtv4xI7Yq5TeHOXXI/stZrSgcOsZK3YELFF7CBZiVikE3fGCacohe3FnvFcDpOodEdubNZ+H4ZvZ0UdUtm/L8XpT8AXb3vrk3jDhLzwbO9VRbvb+0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456443; c=relaxed/simple; bh=hlpEEH6lrEWHukZMDhiuil0/jtDO6QZxpAgzxLPmgDU=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Hy+aTCcwsnnovKPaHxEhFGB0Vow1p07XxXbwjd6gtjz5QETrc//6R0OPgh3bTrPoZEkPmfTjUJfhUuDJaOAzhhqjX9XMjuuWaMtzVbueQbe39UK8RcmtzuAahEUFieMKyC4NFI1yZKnztjCnhBzl/EGE36L61WSmEqKiuzDr4ro= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc35.google.com with SMTP id 006d021491bc7-5dca990cf58so4098368eaf.1 for ; Wed, 04 Sep 2024 06:27:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456440; x=1726061240; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=yKQLt03cdjLeH58Bs4MWNkZpAmLIj4IuzaYsSSvKvAg=; b=QFjRYciU6AGo0ICIhzzNFuE6GSEIEb+272kMydKMrNf6x8ywMYat5cUhNiD1La3ru2 MVLR4rlxfZxDbtyFXDvIn3Xy63NO+/1Dk7xcjQrOdwZiypW11c9C3dWUUtykifoBYuQW GEiaDreloIID7Q8dK9sMiQRshqMGZmUh6rtlUMcN9lV/SKC2dJS1y+qCMB57D99x2adW 2nNzjbKGytjuWd7ePUbPiIInZqyHPt4m34ke6pKZJTqWu8kjTitdAOkicdnThvrk6Khh xIye56lxITVe7zy975Ozh0TK3+/tRflYqHp1WzxO59Bu/typLxgAaN5KrfZfNrlpdkrq GxJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456440; x=1726061240; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yKQLt03cdjLeH58Bs4MWNkZpAmLIj4IuzaYsSSvKvAg=; b=foS3otYeEsBAxbYrfJezUNshBQdIxIBtqvLAdsEvLlgS50xT+KUWkKQD0Qj4ukWM7h Ga9Rt+Kcd/OIfLogEmKZrff3xZsTCsc8hnD44RQ27kXb+sdpPljO1NRZuX4NyN32+3zB U8780kmsW2Gp4PVwZklmKQAJNALCiOvg4lLEmUHtdvm/HQvuL0UBCM2ZUpMoPXVJktZk L1P6DnpevGyS4C49uUphyJUm1c1jlqJdITli1tdTAH1fMYvWscyH/YPskLGtzenH5Ngn SQEpaCOBCKhkjh4c7SIy5A50HL7oSJ18FMDHpxgOo2uovyYWv8SMV6RwmW6XHgIyQ7m9 aw3A== X-Gm-Message-State: AOJu0YzIR2nDJuqd78wiahd31PxaoHVAScxJSv0F/HWNEhrT6EGx2Hpc 6kOhCrdd14KRDVJKdOqH08t62cOO8qpB64DyLQE//3SYTXaLIJmVW6UrhqyPCdmIcOJEmQYHk7+ 4bKY= X-Google-Smtp-Source: AGHT+IH4ImlL3L+lk5sFT+KJ4RH3cy2hkn2S7TThqxx0unICYMSqpPOWnLt1LcpUt/aDGOklXngcHg== X-Received: by 2002:a05:6820:168d:b0:5da:9bde:1bfa with SMTP id 006d021491bc7-5dfacc0452emr19431699eaf.0.1725456440382; Wed, 04 Sep 2024 06:27:20 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:20 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 01/36] arm: [MVE intrinsics] improve comment for orrq shape Date: Wed, 4 Sep 2024 13:26:15 +0000 Message-Id: <20240904132650.2720446-2-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Add a comment about the lack of "n" forms for floating-point nor 8-bit integers, to make it clearer why we use build_16_32 for MODE_n. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (binary_orrq_def): Improve comment. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index ba20c6a8f73..e01939469e3 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -865,7 +865,12 @@ SHAPE (binary_opt_n) int16x8_t [__arm_]vorrq_m[_s16](int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p) int16x8_t [__arm_]vorrq_x[_s16](int16x8_t a, int16x8_t b, mve_pred16_t p) int16x8_t [__arm_]vorrq[_n_s16](int16x8_t a, const int16_t imm) - int16x8_t [__arm_]vorrq_m_n[_s16](int16x8_t a, const int16_t imm, mve_pred16_t p) */ + int16x8_t [__arm_]vorrq_m_n[_s16](int16x8_t a, const int16_t imm, mve_pred16_t p) + + No "_n" forms for floating-point, nor 8-bit integers: + float16x8_t [__arm_]vorrq[_f16](float16x8_t a, float16x8_t b) + float16x8_t [__arm_]vorrq_m[_f16](float16x8_t inactive, float16x8_t a, float16x8_t b, mve_pred16_t p) + float16x8_t [__arm_]vorrq_x[_f16](float16x8_t a, float16x8_t b, mve_pred16_t p) */ struct binary_orrq_def : public overloaded_base<0> { bool From patchwork Wed Sep 4 13:26:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980798 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=QGrY3lMb; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNdr0vpTz1yXY for ; Wed, 4 Sep 2024 23:30:08 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EE7F33861838 for ; Wed, 4 Sep 2024 13:30:04 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ot1-x329.google.com (mail-ot1-x329.google.com [IPv6:2607:f8b0:4864:20::329]) by sourceware.org (Postfix) with ESMTPS id B65DC3858C50 for ; Wed, 4 Sep 2024 13:27:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B65DC3858C50 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B65DC3858C50 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::329 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456446; cv=none; b=sOEVSUtzkAsPKwYTO6J/yG94u5I/34CngroudLpdevAQ/CEzaU8FwxyAlb4i7ja9XoJY5XHW113iZmzXrLtVjhzT8yDBe9N6KDiNv5UbxNl6wvhvUBzro5rrE86OL1Jd85fWByiPVWcLM0lceznnMy/RmkGi98IoxVSFB7RqKAg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456446; c=relaxed/simple; bh=FpEw6N+DxnhaFmYOzcpNA4+B/x/Hy2Mw7PF0ows3kNE=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=hO45b9onIwhNj6mcZ67ewvlOpGfXJg/zd3kPIR4CrQdgmS/4qZKQliLPHp5BFLrWW2oQqbsp3fJhXj6OzvFcWKJKFvUW4hojvk3xJyLz5g9Ka9IYtKh4JyMx7J5/GuZOf3/yW/XfiGeVX1/Jau9AlIkLh2B+Jpodg83CMlO8N4U= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ot1-x329.google.com with SMTP id 46e09a7af769-710a73518d9so403013a34.1 for ; Wed, 04 Sep 2024 06:27:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456441; x=1726061241; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=l5mjEkziHbn9EkdRH1yxO7aiRl8+CbanGxIv3R9K800=; b=QGrY3lMb++tV8i63y30xdw3UdUh+kBOmE40Ifi9W9lFrSHyukbIz9KsgBq77ot6UNx a9Ec17yXz5/sKQxrBTNW91CpSVJc/7Zdvfo/8Lf6cmduiHXN3S+90yatEvKSv/Kyu6pp bJna9gNPYj+Pn0NOt/hmOU9tRbmFQ8lOHDLLWr/RwyLS4fnMPJPrqmqBJ/tKdn/U8YOA nuwsuDZwQPj8jycEDvXILIkHlM4Xyu1ry85/VVvZuXJosc4oUkPNGEAKxoa4I/wvod4p VuQFbu0VyOpkGZKe0MHyvup+9UA7QTjOLyT31zZlLMPGDCPbPOaF4qzt/ngUnZDy+lvO iOpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456441; x=1726061241; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=l5mjEkziHbn9EkdRH1yxO7aiRl8+CbanGxIv3R9K800=; b=uHYb+CtbW/LSuUTD1wdQWFhyu0qJYeEE497hHiClcXWxmHz0KlEV+Ks51NYWWKisk3 kDCP+ueumZOXkLRFPcJpdVyBw2qt5CzfZE7JJO3vA16yJX8BdVrqn6dj0aT46NfegKpS p/GjZHMSugKJgb6R0iwL2pSeayLfzKF/jlT/lcH54f+Ot8/7JH4u/WYYbADm3+47E1ea SaAI7oeWwtSrIW51YZuVAoBodkixg5czr+LUc0PuahO9ffulNuYpB0E4yWpQp4B5kSmw rTI/6jDl37Vr1BJbbnKyYLBguJi1JpJYNBDTQcH/7h7/5I47/raW/75XYJt19NqwtkKm g7Hg== X-Gm-Message-State: AOJu0YwtoyP2yD4OiJzxh6YI5F4bOc+WtBGobl6+5fQt6xRXWRRHjhAb v0KmkhoOZDLUYIMoMEWl3yqFMdRQE4BejPVcBx/MyZUG+8bJ2rAJHAWezaZwc89s0Ly/ATasQR3 sayg= X-Google-Smtp-Source: AGHT+IEQkWpz/ws4MWHVdhIjXggjxDd01w3oLTtzEPWRW8jr8WzYf4erVLIFv37UEZtSzxK9LqPp5Q== X-Received: by 2002:a05:6830:6781:b0:70a:9885:f9d6 with SMTP id 46e09a7af769-710b65e0bdbmr1116959a34.15.1725456441463; Wed, 04 Sep 2024 06:27:21 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:20 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 02/36] arm: [MVE intrinsics] remove useless resolve from create shape Date: Wed, 4 Sep 2024 13:26:16 +0000 Message-Id: <20240904132650.2720446-3-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org vcreateq have no overloaded forms, so there's no need for resolve (). 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (create_def): Remove resolve. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 6 ------ 1 file changed, 6 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index e01939469e3..0520a8331db 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -1408,12 +1408,6 @@ struct create_def : public nonoverloaded_base { build_all (b, "v0,su64,su64", group, MODE_none, preserve_user_namespace); } - - tree - resolve (function_resolver &r) const override - { - return r.resolve_uniform (0, 2); - } }; SHAPE (create) From patchwork Wed Sep 4 13:26:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980807 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=H/BzsAoB; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNhF1hW4z1yg3 for ; Wed, 4 Sep 2024 23:32:13 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 696C13864808 for ; Wed, 4 Sep 2024 13:32:11 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2a.google.com (mail-oo1-xc2a.google.com [IPv6:2607:f8b0:4864:20::c2a]) by sourceware.org (Postfix) with ESMTPS id 018393858406 for ; Wed, 4 Sep 2024 13:27:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 018393858406 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 018393858406 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456449; cv=none; b=bPQ7UqkR6AaX/0A5bGo0AAP5b3uS+J3a9WWiLt5tRLslRZpoVZQAblL4/OR8dbkLXl5JpUBy5ba33l3czJrlregDTgk7YLDtBKohupaFpyi0Lhc1g8L5hJ9dUvOJahoOlZhBE8j5LZh6tiQakiw4tEbZ56TgYNH3UGDegQWxSIc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456449; c=relaxed/simple; bh=fg2+8PkUC6nWYF7HMKH5a57FPEqtm2Gg4vAYkWL3EY4=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=m+s2APJCxm2B/gDXcixhEmEZiFB806T5OeCajKeYFATDUYKCLOJHiQmvICjmGR1AIu29ZefIKZbBAmMUGHqVogSZ8zDoIxtgDpUhrQ59Sj3IBJHpBj9O700cTCMGkwMh1GX6fbIU294CxW50jCDqnF1ku3Bb5K/vcJCNJj8nYZg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2a.google.com with SMTP id 006d021491bc7-5e172cc6d66so1772553eaf.2 for ; Wed, 04 Sep 2024 06:27:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456443; x=1726061243; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=sbdzpdpusfLYlT9whqqawHe0lKH3MzvoxvnKSgq15Qs=; b=H/BzsAoBE69ye07jPeXHnxChQGb8lpnkjT79ZM6cTAat3pNLCqxKUAIB4JDcIzIfr1 sQW1kQjW04Uf5dNhPGM92GK8xcFtTpXzY9n3p+h9ubZqWfMTQvgAWGznZbB/vuD3HRBo toSLBT7X6erLGJeb89lxXyWodFybrlGqS5L4cBz78sQ6gdTVUEwU+WgsQckEctbL7bKS vXVubbT4sCvH/NYSxpk9Pspw9EqdDSd7XDqGv8dcoE83kM8B2PO1YX4DfuCIa6YUTKSF vGDdIbWWe2rBGHhqMwbUjDTNjOmpZ0XpAiBUu34O0f1Qz/DupdNX7Ix+8p9POesLM9jY jbCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456443; x=1726061243; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sbdzpdpusfLYlT9whqqawHe0lKH3MzvoxvnKSgq15Qs=; b=tuc97YC116enDLVJU+qcL6efuvHtN0F60o6KZqjEUURfi9zD/Em3TreIE7JKJMghfd eI3+duLVz2vHyKOKWDb5vIC3/IZQzWqP5TmWROrBFE4aR3UHE/hT633567pKp30hZ/Gi 92GkNxRuGnlXY/tmo3SeGlwBABvAJTHUY3kigOafU0L2SFOylp9koOJGucgEB2Kwwc9p Pv0lrsDvuOkKJcv0RIxDxkyXEErjyjrzyHFYugGvNYnO5CL5vxA3q0QDzw2UavBV8kcy RPxS9PczbntlDWi+gCZuyrGT7z/K5WXjFjYRaEIYOfBWF5IDi0+zzTncx96AjpzK9HUo 35Ig== X-Gm-Message-State: AOJu0YxTAvK4XZmSVZ8HdXWnN/I0+TErHEuBq+vAkMGqVUAPJMrMp/Rf xT9ijS6bTOgvS5Qljx5brK8SlemOnLEbD8X+/UyXbD6YHgxFHbpnaCPyaHNBiQ/1FKN8sZGnWdH D42/NJA== X-Google-Smtp-Source: AGHT+IEvuFa9euIJVOCDfDMZKuaU2Hut15+Z+XbTv3T7kV1BuBUqx5+8wRk2J9ThIQOxDXdNT9rhFw== X-Received: by 2002:a05:6820:1ad5:b0:5da:9b98:e208 with SMTP id 006d021491bc7-5dfad0203e6mr19648195eaf.5.1725456442552; Wed, 04 Sep 2024 06:27:22 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:21 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 03/36] arm: [MVE intrinsics] Cleanup arm-mve-builtins-functions.h Date: Wed, 4 Sep 2024 13:26:17 +0000 Message-Id: <20240904132650.2720446-4-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch brings no functional change but removes some code duplication in arm-mve-builtins-functions.h and makes it easier to read and maintain. It introduces a new expand_unspec () member of unspec_based_mve_function_base and makes a few classes inherit from it instead of function_base. This adds 3 new members containing the unspec codes for signed-int, unsigned-int and floating-point intrinsics (no mode, no predicate). Depending on the derived class, these will be used instead of the 3 similar RTX codes. The new expand_unspec () handles all the possible unspecs, some of which maybe not be supported by a given intrinsics family: such code paths won't be used in that case. Similarly, codes specific to a family (RTX, or PRED_p for instance) should be handled by the caller of expand_unspec (). Thanks to this, expand () for unspec_based_mve_function_exact_insn, unspec_mve_function_exact_insn, unspec_mve_function_exact_insn_pred_p, unspec_mve_function_exact_insn_vshl no longer duplicate a lot of code. The patch also makes most of PRED_m and PRED_x handling use the same code, and uses conditional operators when computing which RTX code/unspec to use when calling code_for_mve_q_XXX. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-functions.h (unspec_based_mve_function_base): Add m_unspec_for_sint, m_unspec_for_uint, m_unspec_for_fp and expand_unspec members. (unspec_based_mve_function_exact_insn): Inherit from unspec_based_mve_function_base and use expand_unspec. (unspec_mve_function_exact_insn): Likewise. (unspec_mve_function_exact_insn_pred_p): Likewise. Use conditionals. (unspec_mve_function_exact_insn_vshl): Likewise. (unspec_based_mve_function_exact_insn_vcmp): Initialize new inherited members. Use conditionals. (unspec_mve_function_exact_insn_rot): Merge PRED_m and PRED_x handling. Use conditionals. (unspec_mve_function_exact_insn_vmull): Likewise. (unspec_mve_function_exact_insn_vmull_poly): Likewise. --- gcc/config/arm/arm-mve-builtins-functions.h | 726 ++++++++------------ 1 file changed, 286 insertions(+), 440 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h index ac2a731bff4..35cb5242b77 100644 --- a/gcc/config/arm/arm-mve-builtins-functions.h +++ b/gcc/config/arm/arm-mve-builtins-functions.h @@ -40,17 +40,23 @@ public: }; /* An incomplete function_base for functions that have an associated - rtx_code for signed integers, unsigned integers and floating-point - values for the non-predicated, non-suffixed intrinsic, and unspec - codes, with separate codes for signed integers, unsigned integers - and floating-point values. The class simply records information - about the mapping for derived classes to use. */ + rtx_code or an unspec for signed integers, unsigned integers and + floating-point values for the non-predicated, non-suffixed + intrinsics, and unspec codes, with separate codes for signed + integers, unsigned integers and floating-point values for + predicated and/or suffixed intrinsics. The class simply records + information about the mapping for derived classes to use and + provides a generic expand_unspec () to avoid duplicating expansion + code in derived classes. */ class unspec_based_mve_function_base : public function_base { public: CONSTEXPR unspec_based_mve_function_base (rtx_code code_for_sint, rtx_code code_for_uint, rtx_code code_for_fp, + int unspec_for_sint, + int unspec_for_uint, + int unspec_for_fp, int unspec_for_n_sint, int unspec_for_n_uint, int unspec_for_n_fp, @@ -63,6 +69,9 @@ public: : m_code_for_sint (code_for_sint), m_code_for_uint (code_for_uint), m_code_for_fp (code_for_fp), + m_unspec_for_sint (unspec_for_sint), + m_unspec_for_uint (unspec_for_uint), + m_unspec_for_fp (unspec_for_fp), m_unspec_for_n_sint (unspec_for_n_sint), m_unspec_for_n_uint (unspec_for_n_uint), m_unspec_for_n_fp (unspec_for_n_fp), @@ -83,6 +92,9 @@ public: /* The unspec code associated with signed-integer, unsigned-integer and floating-point operations respectively. It covers the cases with the _n suffix, and/or the _m predicate. */ + int m_unspec_for_sint; + int m_unspec_for_uint; + int m_unspec_for_fp; int m_unspec_for_n_sint; int m_unspec_for_n_uint; int m_unspec_for_n_fp; @@ -92,8 +104,101 @@ public: int m_unspec_for_m_n_sint; int m_unspec_for_m_n_uint; int m_unspec_for_m_n_fp; + + rtx expand_unspec (function_expander &e) const; }; +/* Expand the unspecs, which is common to all intrinsics using + unspec_based_mve_function_base. If some combinations are not + supported for an intrinsics family, they should be handled by the + caller (and not crash here). */ +rtx +unspec_based_mve_function_base::expand_unspec (function_expander &e) const +{ + machine_mode mode = e.vector_mode (0); + insn_code code; + + switch (e.pred) + { + case PRED_none: + switch (e.mode_suffix_id) + { + case MODE_none: + /* No predicate, no suffix. */ + if (e.type_suffix (0).integer_p) + { + int unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_uint + : m_unspec_for_sint); + code = code_for_mve_q (unspec, unspec, mode); + } + else + code = code_for_mve_q_f (m_unspec_for_fp, mode); + break; + + case MODE_n: + /* No predicate, _n suffix. */ + if (e.type_suffix (0).integer_p) + { + int unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_n_uint + : m_unspec_for_n_sint); + code = code_for_mve_q_n (unspec, unspec, mode); + } + else + code = code_for_mve_q_n_f (m_unspec_for_n_fp, mode); + break; + + default: + gcc_unreachable (); + } + return e.use_exact_insn (code); + + case PRED_m: + case PRED_x: + switch (e.mode_suffix_id) + { + case MODE_none: + /* No suffix, "m" or "x" predicate. */ + if (e.type_suffix (0).integer_p) + { + int unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_m_uint + : m_unspec_for_m_sint); + code = code_for_mve_q_m (unspec, unspec, mode); + } + else + code = code_for_mve_q_m_f (m_unspec_for_m_fp, mode); + break; + + case MODE_n: + /* _n suffix, "m" or "x" predicate. */ + if (e.type_suffix (0).integer_p) + { + int unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_m_n_uint + : m_unspec_for_m_n_sint); + code = code_for_mve_q_m_n (unspec, unspec, mode); + } + else + code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, mode); + break; + + default: + gcc_unreachable (); + } + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); + break; + + default: + gcc_unreachable (); + } +} + /* Map the function directly to CODE (UNSPEC, M) where M is the vector mode associated with type suffix 0, except when there is no predicate and no _n suffix, in which case we use the appropriate @@ -117,6 +222,9 @@ public: : unspec_based_mve_function_base (code_for_sint, code_for_uint, code_for_fp, + -1, + -1, + -1, unspec_for_n_sint, unspec_for_n_uint, unspec_for_n_fp, @@ -137,97 +245,13 @@ public: return e.map_to_rtx_codes (m_code_for_sint, m_code_for_uint, m_code_for_fp); - insn_code code; - switch (e.pred) - { - case PRED_none: - if (e.mode_suffix_id == MODE_n) - /* No predicate, _n suffix. */ - { - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_n (m_unspec_for_n_uint, m_unspec_for_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_n (m_unspec_for_n_sint, m_unspec_for_n_sint, e.vector_mode (0)); - else - code = code_for_mve_q_n_f (m_unspec_for_n_fp, e.vector_mode (0)); - - return e.use_exact_insn (code); - } - gcc_unreachable (); - break; - - case PRED_m: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "m" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0)); - break; - - case MODE_n: - /* _n suffix, "m" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_cond_insn (code, 0); - - case PRED_x: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "x" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0)); - break; - - case MODE_n: - /* _n suffix, "x" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_pred_x_insn (code); - - default: - gcc_unreachable (); - } - - gcc_unreachable (); + return expand_unspec (e); } }; /* Map the function directly to CODE (UNSPEC, M) where M is the vector mode associated with type suffix 0. */ -class unspec_mve_function_exact_insn : public function_base +class unspec_mve_function_exact_insn : public unspec_based_mve_function_base { public: CONSTEXPR unspec_mve_function_exact_insn (int unspec_for_sint, @@ -242,143 +266,33 @@ public: int unspec_for_m_n_sint, int unspec_for_m_n_uint, int unspec_for_m_n_fp) - : m_unspec_for_sint (unspec_for_sint), - m_unspec_for_uint (unspec_for_uint), - m_unspec_for_fp (unspec_for_fp), - m_unspec_for_n_sint (unspec_for_n_sint), - m_unspec_for_n_uint (unspec_for_n_uint), - m_unspec_for_n_fp (unspec_for_n_fp), - m_unspec_for_m_sint (unspec_for_m_sint), - m_unspec_for_m_uint (unspec_for_m_uint), - m_unspec_for_m_fp (unspec_for_m_fp), - m_unspec_for_m_n_sint (unspec_for_m_n_sint), - m_unspec_for_m_n_uint (unspec_for_m_n_uint), - m_unspec_for_m_n_fp (unspec_for_m_n_fp) + : unspec_based_mve_function_base (UNKNOWN, + UNKNOWN, + UNKNOWN, + unspec_for_sint, + unspec_for_uint, + unspec_for_fp, + unspec_for_n_sint, + unspec_for_n_uint, + unspec_for_n_fp, + unspec_for_m_sint, + unspec_for_m_uint, + unspec_for_m_fp, + unspec_for_m_n_sint, + unspec_for_m_n_uint, + unspec_for_m_n_fp) {} - /* The unspec code associated with signed-integer, unsigned-integer - and floating-point operations respectively. It covers the cases - with the _n suffix, and/or the _m predicate. */ - int m_unspec_for_sint; - int m_unspec_for_uint; - int m_unspec_for_fp; - int m_unspec_for_n_sint; - int m_unspec_for_n_uint; - int m_unspec_for_n_fp; - int m_unspec_for_m_sint; - int m_unspec_for_m_uint; - int m_unspec_for_m_fp; - int m_unspec_for_m_n_sint; - int m_unspec_for_m_n_uint; - int m_unspec_for_m_n_fp; - rtx expand (function_expander &e) const override { - insn_code code; - switch (e.pred) - { - case PRED_none: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No predicate, no suffix. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); - else - code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); - else - code = code_for_mve_q_f (m_unspec_for_fp, e.vector_mode (0)); - break; - - case MODE_n: - /* No predicate, _n suffix. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_n (m_unspec_for_n_uint, m_unspec_for_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_n (m_unspec_for_n_sint, m_unspec_for_n_sint, e.vector_mode (0)); - else - code = code_for_mve_q_n_f (m_unspec_for_n_fp, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_exact_insn (code); - - case PRED_m: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "m" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0)); - break; - - case MODE_n: - /* _n suffix, "m" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_cond_insn (code, 0); - - case PRED_x: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "x" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0)); - break; - - case MODE_n: - /* _n suffix, "x" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_pred_x_insn (code); - - default: - gcc_unreachable (); - } - - gcc_unreachable (); + return expand_unspec (e); } }; /* Map the function directly to CODE (UNSPEC), when there is a non-predicated version and one with the "_p" predicate. */ -class unspec_mve_function_exact_insn_pred_p : public function_base +class unspec_mve_function_exact_insn_pred_p : public unspec_based_mve_function_base { public: CONSTEXPR unspec_mve_function_exact_insn_pred_p (int unspec_for_sint, @@ -387,19 +301,23 @@ public: int unspec_for_p_sint, int unspec_for_p_uint, int unspec_for_p_fp) - : m_unspec_for_sint (unspec_for_sint), - m_unspec_for_uint (unspec_for_uint), - m_unspec_for_fp (unspec_for_fp), + : unspec_based_mve_function_base (UNKNOWN, /* No RTX code. */ + UNKNOWN, + UNKNOWN, + unspec_for_sint, + unspec_for_uint, + unspec_for_fp, + -1, -1, -1, /* No _n intrinsics. */ + -1, -1, -1, /* No _m intrinsics. */ + -1, -1, -1), /* No _m_n intrinsics. */ m_unspec_for_p_sint (unspec_for_p_sint), m_unspec_for_p_uint (unspec_for_p_uint), m_unspec_for_p_fp (unspec_for_p_fp) {} - /* The unspec code associated with signed-integer and unsigned-integer - operations, with no predicate, or with "_p" predicate. */ - int m_unspec_for_sint; - int m_unspec_for_uint; - int m_unspec_for_fp; + /* The unspec code associated with signed-integer and + unsigned-integer or floating-point operations with "_p" + predicate. */ int m_unspec_for_p_sint; int m_unspec_for_p_uint; int m_unspec_for_p_fp; @@ -408,6 +326,7 @@ public: expand (function_expander &e) const override { insn_code code; + int unspec; if (m_unspec_for_sint == VADDLVQ_S || m_unspec_for_sint == VADDLVAQ_S @@ -423,62 +342,49 @@ public: switch (e.pred) { case PRED_none: - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_v4si (m_unspec_for_uint, m_unspec_for_uint); - else - code = code_for_mve_q_v4si (m_unspec_for_sint, m_unspec_for_sint); + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_uint + : m_unspec_for_sint); + code = code_for_mve_q_v4si (unspec, unspec); return e.use_exact_insn (code); case PRED_p: - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_p_v4si (m_unspec_for_p_uint, m_unspec_for_p_uint); - else - code = code_for_mve_q_p_v4si (m_unspec_for_p_sint, m_unspec_for_p_sint); + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_p_uint + : m_unspec_for_p_sint); + code = code_for_mve_q_p_v4si (unspec, unspec); return e.use_exact_insn (code); default: gcc_unreachable (); } } - else - { - switch (e.pred) - { - case PRED_none: - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); - else - code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); - else - code = code_for_mve_q_f (m_unspec_for_fp, e.vector_mode (0)); - - return e.use_exact_insn (code); - case PRED_p: - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_p (m_unspec_for_p_uint, m_unspec_for_p_uint, e.vector_mode (0)); - else - code = code_for_mve_q_p (m_unspec_for_p_sint, m_unspec_for_p_sint, e.vector_mode (0)); - else - code = code_for_mve_q_p_f (m_unspec_for_p_fp, e.vector_mode (0)); - - return e.use_exact_insn (code); + if (e.pred == PRED_p) + { + machine_mode mode = e.vector_mode (0); - default: - gcc_unreachable (); + if (e.type_suffix (0).integer_p) + { + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_p_uint + : m_unspec_for_p_sint); + code = code_for_mve_q_p (unspec, unspec, mode); } + else + code = code_for_mve_q_p_f (m_unspec_for_p_fp, mode); + + return e.use_exact_insn (code); } - gcc_unreachable (); + return expand_unspec (e); } }; /* Map the function directly to CODE (UNSPEC, M) for vshl-like builtins. The difference with unspec_mve_function_exact_insn is that this function handles MODE_r and the related unspecs.. */ -class unspec_mve_function_exact_insn_vshl : public function_base +class unspec_mve_function_exact_insn_vshl : public unspec_based_mve_function_base { public: CONSTEXPR unspec_mve_function_exact_insn_vshl (int unspec_for_sint, @@ -493,31 +399,29 @@ public: int unspec_for_m_r_uint, int unspec_for_r_sint, int unspec_for_r_uint) - : m_unspec_for_sint (unspec_for_sint), - m_unspec_for_uint (unspec_for_uint), - m_unspec_for_n_sint (unspec_for_n_sint), - m_unspec_for_n_uint (unspec_for_n_uint), - m_unspec_for_m_sint (unspec_for_m_sint), - m_unspec_for_m_uint (unspec_for_m_uint), - m_unspec_for_m_n_sint (unspec_for_m_n_sint), - m_unspec_for_m_n_uint (unspec_for_m_n_uint), + : unspec_based_mve_function_base (UNKNOWN, + UNKNOWN, + UNKNOWN, + unspec_for_sint, + unspec_for_uint, + -1, + unspec_for_n_sint, + unspec_for_n_uint, + -1, + unspec_for_m_sint, + unspec_for_m_uint, + -1, + unspec_for_m_n_sint, + unspec_for_m_n_uint, + -1), m_unspec_for_m_r_sint (unspec_for_m_r_sint), m_unspec_for_m_r_uint (unspec_for_m_r_uint), m_unspec_for_r_sint (unspec_for_r_sint), m_unspec_for_r_uint (unspec_for_r_uint) {} - /* The unspec code associated with signed-integer, unsigned-integer - and floating-point operations respectively. It covers the cases - with the _n suffix, and/or the _m predicate. */ - int m_unspec_for_sint; - int m_unspec_for_uint; - int m_unspec_for_n_sint; - int m_unspec_for_n_uint; - int m_unspec_for_m_sint; - int m_unspec_for_m_uint; - int m_unspec_for_m_n_sint; - int m_unspec_for_m_n_uint; + /* The unspec code associated with signed-integer and unsigned-integer + operations with MODE_r with or without PRED_m. */ int m_unspec_for_m_r_sint; int m_unspec_for_m_r_uint; int m_unspec_for_r_sint; @@ -527,101 +431,40 @@ public: expand (function_expander &e) const override { insn_code code; - switch (e.pred) + int unspec; + + if (e.mode_suffix_id == MODE_r) { - case PRED_none: - switch (e.mode_suffix_id) + machine_mode mode = e.vector_mode (0); + switch (e.pred) { - case MODE_none: - /* No predicate, no suffix. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); - else - code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); - break; - - case MODE_n: - /* No predicate, _n suffix. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_n (m_unspec_for_n_uint, m_unspec_for_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_n (m_unspec_for_n_sint, m_unspec_for_n_sint, e.vector_mode (0)); - break; - - case MODE_r: + case PRED_none: /* No predicate, _r suffix. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_r (m_unspec_for_r_uint, m_unspec_for_r_uint, e.vector_mode (0)); - else - code = code_for_mve_q_r (m_unspec_for_r_sint, m_unspec_for_r_sint, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_exact_insn (code); - - case PRED_m: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "m" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - break; - - case MODE_n: - /* _n suffix, "m" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - break; - - case MODE_r: - /* _r suffix, "m" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_r (m_unspec_for_m_r_uint, m_unspec_for_m_r_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m_r (m_unspec_for_m_r_sint, m_unspec_for_m_r_sint, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_cond_insn (code, 0); - - case PRED_x: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "x" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - break; + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_r_uint + : m_unspec_for_r_sint); + code = code_for_mve_q_r (unspec, unspec, mode); + return e.use_exact_insn (code); - case MODE_n: - /* _n suffix, "x" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); + case PRED_m: + case PRED_x: + /* _r suffix, "m" or "x" predicate. */ + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_m_r_uint + : m_unspec_for_m_r_sint); + code = code_for_mve_q_m_r (unspec, unspec, mode); + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - break; + return e.use_pred_x_insn (code); default: gcc_unreachable (); } - return e.use_pred_x_insn (code); - - default: - gcc_unreachable (); } - gcc_unreachable (); + return expand_unspec (e); } }; @@ -641,9 +484,8 @@ public: : unspec_based_mve_function_base (code_for_sint, code_for_uint, code_for_fp, - -1, - -1, - -1, + -1, -1, -1, /* No non-predicated, no mode intrinsics. */ + -1, -1, -1, /* No _n intrinsics. */ unspec_for_m_sint, unspec_for_m_uint, unspec_for_m_fp, @@ -662,24 +504,30 @@ public: /* No suffix, no predicate, use the right RTX code. */ if (e.pred == PRED_none) { + rtx_code r_code; + switch (e.mode_suffix_id) { case MODE_none: if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_vcmpq (m_code_for_uint, mode); - else - code = code_for_mve_vcmpq (m_code_for_sint, mode); + { + r_code = (e.type_suffix (0).unsigned_p + ? m_code_for_uint + : m_code_for_sint); + code = code_for_mve_vcmpq (r_code, mode); + } else code = code_for_mve_vcmpq_f (m_code_for_fp, mode); break; case MODE_n: if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_vcmpq_n (m_code_for_uint, mode); - else - code = code_for_mve_vcmpq_n (m_code_for_sint, mode); + { + r_code = (e.type_suffix (0).unsigned_p + ? m_code_for_uint + : m_code_for_sint); + code = code_for_mve_vcmpq_n (r_code, mode); + } else code = code_for_mve_vcmpq_n_f (m_code_for_fp, mode); break; @@ -691,6 +539,8 @@ public: } else { + int unspec; + switch (e.pred) { case PRED_m: @@ -699,10 +549,12 @@ public: case MODE_none: /* No suffix, "m" predicate. */ if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_vcmpq_m (m_unspec_for_m_uint, m_unspec_for_m_uint, mode); - else - code = code_for_mve_vcmpq_m (m_unspec_for_m_sint, m_unspec_for_m_sint, mode); + { + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_m_uint + : m_unspec_for_m_sint); + code = code_for_mve_vcmpq_m (unspec, unspec, mode); + } else code = code_for_mve_vcmpq_m_f (m_unspec_for_m_fp, mode); break; @@ -710,10 +562,12 @@ public: case MODE_n: /* _n suffix, "m" predicate. */ if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_vcmpq_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, mode); - else - code = code_for_mve_vcmpq_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, mode); + { + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_m_n_uint + : m_unspec_for_m_n_sint); + code = code_for_mve_vcmpq_m_n (unspec, unspec, mode); + } else code = code_for_mve_vcmpq_m_n_f (m_unspec_for_m_n_fp, mode); break; @@ -738,7 +592,9 @@ public: /* Map the function directly to CODE (UNSPEC, UNSPEC, UNSPEC, M) where M is the vector mode associated with type suffix 0. USed for the operations where there is a "rot90" or "rot270" suffix, depending - on the UNSPEC. */ + on the UNSPEC. We cannot use + unspec_based_mve_function_base::expand_unspec () because we call + code_for_mve_q with one more parameter. */ class unspec_mve_function_exact_insn_rot : public function_base { public: @@ -769,7 +625,9 @@ public: rtx expand (function_expander &e) const override { + machine_mode mode = e.vector_mode (0); insn_code code; + int unspec; switch (e.pred) { @@ -779,12 +637,14 @@ public: case MODE_none: /* No predicate, no suffix. */ if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); - else - code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); + { + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_uint + : m_unspec_for_sint); + code = code_for_mve_q (unspec, unspec, unspec, mode); + } else - code = code_for_mve_q_f (m_unspec_for_fp, m_unspec_for_fp, e.vector_mode (0)); + code = code_for_mve_q_f (m_unspec_for_fp, m_unspec_for_fp, mode); break; default: @@ -793,42 +653,30 @@ public: return e.use_exact_insn (code); case PRED_m: + case PRED_x: switch (e.mode_suffix_id) { case MODE_none: - /* No suffix, "m" predicate. */ + /* No suffix, "m" or "x" predicate. */ if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); + { + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_m_uint + : m_unspec_for_m_sint); + code = code_for_mve_q_m (unspec, unspec, unspec, mode); + } else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_cond_insn (code, 0); + code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, mode); - case PRED_x: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "x" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, e.vector_mode (0)); + return e.use_pred_x_insn (code); break; default: gcc_unreachable (); } - return e.use_pred_x_insn (code); default: gcc_unreachable (); @@ -866,7 +714,9 @@ public: rtx expand (function_expander &e) const override { + machine_mode mode = e.vector_mode (0); insn_code code; + int unspec; if (! e.type_suffix (0).integer_p) gcc_unreachable (); @@ -878,30 +728,25 @@ public: { case PRED_none: /* No predicate, no suffix. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_int (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); - else - code = code_for_mve_q_int (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_uint + : m_unspec_for_sint); + code = code_for_mve_q_int (unspec, unspec, mode); return e.use_exact_insn (code); case PRED_m: - /* No suffix, "m" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_int_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_int_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - - return e.use_cond_insn (code, 0); - case PRED_x: - /* No suffix, "x" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_int_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); + /* No suffix, "m" or "x" predicate. */ + unspec = (e.type_suffix (0).unsigned_p + ? m_unspec_for_m_uint + : m_unspec_for_m_sint); + code = code_for_mve_q_int_m (unspec, unspec, mode); + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); else - code = code_for_mve_q_int_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - - return e.use_pred_x_insn (code); + return e.use_pred_x_insn (code); default: gcc_unreachable (); @@ -933,6 +778,7 @@ public: rtx expand (function_expander &e) const override { + machine_mode mode = e.vector_mode (0); insn_code code; if (e.mode_suffix_id != MODE_none) @@ -945,18 +791,18 @@ public: { case PRED_none: /* No predicate, no suffix. */ - code = code_for_mve_q_poly (m_unspec_for_poly, m_unspec_for_poly, e.vector_mode (0)); + code = code_for_mve_q_poly (m_unspec_for_poly, m_unspec_for_poly, mode); return e.use_exact_insn (code); case PRED_m: - /* No suffix, "m" predicate. */ - code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, e.vector_mode (0)); - return e.use_cond_insn (code, 0); - case PRED_x: - /* No suffix, "x" predicate. */ - code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, e.vector_mode (0)); - return e.use_pred_x_insn (code); + /* No suffix, "m" or "x" predicate. */ + code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, mode); + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); default: gcc_unreachable (); From patchwork Wed Sep 4 13:26:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980802 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=DWPAs/2r; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNg24WV1z1yg3 for ; Wed, 4 Sep 2024 23:31:10 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A6450386482C for ; Wed, 4 Sep 2024 13:31:08 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2e.google.com (mail-oo1-xc2e.google.com [IPv6:2607:f8b0:4864:20::c2e]) by sourceware.org (Postfix) with ESMTPS id 5A984385841C for ; Wed, 4 Sep 2024 13:27:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5A984385841C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 5A984385841C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456449; cv=none; b=IjYJuOwfvbV/grQU0zGmIqB5IHbUU8ITUEnwQY07uYN9pcFDGyALsIfj8BcGQWBkgcFCfe6QBvNozDJU4pSZr0qeaaqJJecVo6o73yLIR0FEXyjcC2eSDFt99Lp9LzFAySHcadIgjP9c3J0uPigxAJ6z9YfhoF1MtprAmfXNE1E= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456449; c=relaxed/simple; bh=GSUmtEFh5DleYuiDOMHilNX8ilYxVwGu/DmvqInkka4=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=mNj1R4S1jSgAFV1EUSvtEJDRs7RGDZ0Uwcrlj2RH3sqzaE+7Igbus8JVdTtbePtU4xmpXDaI8nxFncpK8/2c6eQvJsA7J8s4Ln9G6WY38Sjx4ugAMT8WTkbbAXR+O4aaQa7a0x2OndLZPSLQyI6xoMbV8Su2SF8o8Y5nSughobE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2e.google.com with SMTP id 006d021491bc7-5dfa315ffbdso3815526eaf.3 for ; Wed, 04 Sep 2024 06:27:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456443; x=1726061243; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GSgOrcsmlrnGXzPoglW5v0sK7DHH7K2tBTYcnt3+ia8=; b=DWPAs/2rUzH+uNF/9F9HVfKhrrflihWjvPDK/KkuPBDtsp9QoZDLunjcLJWqfKs3X6 lmmXc8DNv022HA0VhjeQBqMdm9AaEqG34/y5RERJhvk2kUFpR43RmZqFp6qupWZ9nF30 6hVbfw2kAp64LFJpKjgQctbYGhipvvPzrm+IJroIqUR6QInkz0telxddZSRGsA8Uyicw h53Szkl5jdlJkT33GWPxD5gAIQRbQgs7bXP0YmPqW43NOR++jzm77bEeKN4Ep9R4Fz4G OSZQ6s7Iwa7RnZSgF2qXA+73+kefYg+xTTBbkvVkATv0C6OZ0hMoFYNOdHO/CXlFy14a CDxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456443; x=1726061243; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GSgOrcsmlrnGXzPoglW5v0sK7DHH7K2tBTYcnt3+ia8=; b=fPK9QO0TlpzFPUMu2Msu+4Tj32vbU1oiHsGuPx2COu/Mgv0mlhXiXUUghV1LDrGsMi 4iwOZeZWtHHEI6Y8aTjjN5VhQU8Lqgo2ra6db80cRrfp8Va2dbbSCdimwHmzgZFd31ct GCANMWuepkCWzeZMMRYFuXvLSSZqBhH+suSHWX+a9hmCxkWl9sva2A9kVIsZn4zeSP5i KYkBsrRpvNx9kuq2TJkE2HfquBi6p/yLDB8AYZON61V32AgUMdVzB0JOQVP26bAJOTt+ Q3QL1SCZqI+UB8KieQrnYpTq2YjIlZZxxABiaXkTgpxqbWhB2pfLMvCaIEMjR/dxjDKu 0WlA== X-Gm-Message-State: AOJu0YyEhgQRS45uCPE6EJ2eL1SF63PXKqreitJA9Yk8tvsB827loqFr 4ka0ntgwKolInQFqn79VbMhhuvVbhIJi8byUO7irpiqr2BirQOnUjVK+kszki+moEmbRm0Qfe7q /ldmm8g== X-Google-Smtp-Source: AGHT+IGI1RY4pZvvPXo9ZCr8LMw7yNAvdhT8N21bP7dRTDfE6FN6J1rZCWAf66eXeUWcw1dLwaFjDQ== X-Received: by 2002:a05:6820:1e13:b0:5da:b29a:3c84 with SMTP id 006d021491bc7-5dfaced6f2fmr15959043eaf.5.1725456443228; Wed, 04 Sep 2024 06:27:23 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:22 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 04/36] arm: [MVE intrinsics] factorize vcvtq Date: Wed, 4 Sep 2024 13:26:18 +0000 Message-Id: <20240904132650.2720446-5-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Factorize vcvtq so that they use parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VCVTQ_FROM_F_S, VCVTQ_FROM_F_U, VCVTQ_M_FROM_F_S, VCVTQ_M_FROM_F_U, VCVTQ_M_N_FROM_F_S, VCVTQ_M_N_FROM_F_U, VCVTQ_M_N_TO_F_S, VCVTQ_M_N_TO_F_U, VCVTQ_M_TO_F_S, VCVTQ_M_TO_F_U, VCVTQ_N_FROM_F_S, VCVTQ_N_FROM_F_U, VCVTQ_N_TO_F_S, VCVTQ_N_TO_F_U, VCVTQ_TO_F_S, VCVTQ_TO_F_U. * config/arm/mve.md (mve_vcvtq_to_f_): Rename into @mve_q_to_f_. (mve_vcvtq_from_f_): Rename into @mve_q_from_f_. (mve_vcvtq_n_to_f_): Rename into @mve_q_n_to_f_. (mve_vcvtq_n_from_f_): Rename into @mve_q_n_from_f_. (mve_vcvtq_m_to_f_): Rename into @mve_q_m_to_f_. (mve_vcvtq_m_n_from_f_): Rename into @mve_q_m_n_from_f_. (mve_vcvtq_m_from_f_): Rename into @mve_q_m_from_f_. (mve_vcvtq_m_n_to_f_): Rename into @mve_q_m_n_to_f_. --- gcc/config/arm/iterators.md | 8 +++++ gcc/config/arm/mve.md | 64 ++++++++++++++++++------------------- 2 files changed, 40 insertions(+), 32 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index b9ff01cb104..bf800625fac 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -964,6 +964,14 @@ (define_int_attr mve_insn [ (VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla") (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul") (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate") + (VCVTQ_FROM_F_S "vcvt") (VCVTQ_FROM_F_U "vcvt") + (VCVTQ_M_FROM_F_S "vcvt") (VCVTQ_M_FROM_F_U "vcvt") + (VCVTQ_M_N_FROM_F_S "vcvt") (VCVTQ_M_N_FROM_F_U "vcvt") + (VCVTQ_M_N_TO_F_S "vcvt") (VCVTQ_M_N_TO_F_U "vcvt") + (VCVTQ_M_TO_F_S "vcvt") (VCVTQ_M_TO_F_U "vcvt") + (VCVTQ_N_FROM_F_S "vcvt") (VCVTQ_N_FROM_F_U "vcvt") + (VCVTQ_N_TO_F_S "vcvt") (VCVTQ_N_TO_F_U "vcvt") + (VCVTQ_TO_F_S "vcvt") (VCVTQ_TO_F_U "vcvt") (VDUPQ_M_N_S "vdup") (VDUPQ_M_N_U "vdup") (VDUPQ_M_N_F "vdup") (VDUPQ_N_S "vdup") (VDUPQ_N_U "vdup") (VDUPQ_N_F "vdup") (VEORQ_M_S "veor") (VEORQ_M_U "veor") (VEORQ_M_F "veor") diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 706a45c7d66..95c615c1534 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -248,17 +248,17 @@ (define_insn "mve_vcvtbq_f32_f16v4sf" ]) ;; -;; [vcvtq_to_f_s, vcvtq_to_f_u]) +;; [vcvtq_to_f_s, vcvtq_to_f_u] ;; -(define_insn "mve_vcvtq_to_f_" +(define_insn "@mve_q_to_f_" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand: 1 "s_register_operand" "w")] VCVTQ_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt.f%#.%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_to_f_")) + ".f%#.%#\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_to_f_")) (set_attr "type" "mve_move") ]) @@ -278,17 +278,17 @@ (define_insn "@mve_q_" ]) ;; -;; [vcvtq_from_f_s, vcvtq_from_f_u]) +;; [vcvtq_from_f_s, vcvtq_from_f_u] ;; -(define_insn "mve_vcvtq_from_f_" +(define_insn "@mve_q_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] VCVTQ_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_from_f_")) + ".%#.f%#\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_from_f_")) (set_attr "type" "mve_move") ]) @@ -581,9 +581,9 @@ (define_insn "@mve_q_n_f" ]) ;; -;; [vcvtq_n_to_f_s, vcvtq_n_to_f_u]) +;; [vcvtq_n_to_f_s, vcvtq_n_to_f_u] ;; -(define_insn "mve_vcvtq_n_to_f_" +(define_insn "@mve_q_n_to_f_" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand: 1 "s_register_operand" "w") @@ -591,8 +591,8 @@ (define_insn "mve_vcvtq_n_to_f_" VCVTQ_N_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt.f.\t%q0, %q1, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_to_f_")) + ".f.\t%q0, %q1, %2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_to_f_")) (set_attr "type" "mve_move") ]) @@ -679,9 +679,9 @@ (define_insn "mve_vshrq_n_u_imm" ]) ;; -;; [vcvtq_n_from_f_s, vcvtq_n_from_f_u]) +;; [vcvtq_n_from_f_s, vcvtq_n_from_f_u] ;; -(define_insn "mve_vcvtq_n_from_f_" +(define_insn "@mve_q_n_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w") @@ -689,8 +689,8 @@ (define_insn "mve_vcvtq_n_from_f_" VCVTQ_N_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt..f\t%q0, %q1, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_from_f_")) + "..f\t%q0, %q1, %2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_from_f_")) (set_attr "type" "mve_move") ]) @@ -1672,9 +1672,9 @@ (define_insn "mve_vcvtaq_m_" (set_attr "length""8")]) ;; -;; [vcvtq_m_to_f_s, vcvtq_m_to_f_u]) +;; [vcvtq_m_to_f_s, vcvtq_m_to_f_u] ;; -(define_insn "mve_vcvtq_m_to_f_" +(define_insn "@mve_q_m_to_f_" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") @@ -1683,8 +1683,8 @@ (define_insn "mve_vcvtq_m_to_f_" VCVTQ_M_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.f%#.%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_to_f_")) + "vpst\;t.f%#.%#\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_to_f_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2651,9 +2651,9 @@ (define_insn "mve_vcvtnq_m_" (set_attr "length""8")]) ;; -;; [vcvtq_m_n_from_f_s, vcvtq_m_n_from_f_u]) +;; [vcvtq_m_n_from_f_s, vcvtq_m_n_from_f_u] ;; -(define_insn "mve_vcvtq_m_n_from_f_" +(define_insn "@mve_q_m_n_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") @@ -2663,8 +2663,8 @@ (define_insn "mve_vcvtq_m_n_from_f_" VCVTQ_M_N_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.%#.f%#\t%q0, %q2, %3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_from_f_")) + "vpst\;t.%#.f%#\t%q0, %q2, %3" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_from_f_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2686,9 +2686,9 @@ (define_insn "@mve_q_m_" (set_attr "length""8")]) ;; -;; [vcvtq_m_from_f_u, vcvtq_m_from_f_s]) +;; [vcvtq_m_from_f_u, vcvtq_m_from_f_s] ;; -(define_insn "mve_vcvtq_m_from_f_" +(define_insn "@mve_q_m_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") @@ -2697,8 +2697,8 @@ (define_insn "mve_vcvtq_m_from_f_" VCVTQ_M_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_from_f_")) + "vpst\;t.%#.f%#\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_from_f_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2757,9 +2757,9 @@ (define_insn "@mve_q_m_n_" (set_attr "length" "8")]) ;; -;; [vcvtq_m_n_to_f_u, vcvtq_m_n_to_f_s]) +;; [vcvtq_m_n_to_f_u, vcvtq_m_n_to_f_s] ;; -(define_insn "mve_vcvtq_m_n_to_f_" +(define_insn "@mve_q_m_n_to_f_" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") @@ -2769,8 +2769,8 @@ (define_insn "mve_vcvtq_m_n_to_f_" VCVTQ_M_N_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.f%#.%#\t%q0, %q2, %3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_to_f_")) + "vpst\;t.f%#.%#\t%q0, %q2, %3" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_to_f_")) (set_attr "type" "mve_move") (set_attr "length""8")]) From patchwork Wed Sep 4 13:26:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980797 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=r2S9AB56; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNdg4JX7z1yfv for ; Wed, 4 Sep 2024 23:29:59 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CF129385DDCF for ; Wed, 4 Sep 2024 13:29:57 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc36.google.com (mail-oo1-xc36.google.com [IPv6:2607:f8b0:4864:20::c36]) by sourceware.org (Postfix) with ESMTPS id CDF2A385DC1E for ; Wed, 4 Sep 2024 13:27:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org CDF2A385DC1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org CDF2A385DC1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c36 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456449; cv=none; b=C42Z8PqkYg3U1jssQUdXixkV6S47FPeYxpvm0wuXcu3CsDjqgZEP24Z0zZJ05fUc0fmIdDpHjfyLmb3tL9svZ1S3udvrLY23/aeFcf2QIdP0I5BYlrjXbvxDq/7YeiaICF+licBLRMQeIDzD1TPQe9TPBN41pVCXH0v5AvHeV9I= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456449; c=relaxed/simple; bh=706o++dQQIX9pMj2689NC1AXZeWyTy2gNdo+fwSfb4g=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=d70kLggOM5KldyZvA5ApopAwxYYsyWeL26Mqv2dW8g1xvkb+KuNzm/tnuX83u2rrLJ00jwiMRe8qmNg+q5IwGCMzISOeVdDLyCpYb3Z+nCWUJSXGG1gFjpknGRkyV9eK7TzmCNHm1RlI0Ev84acPDcwqqEamrDP8hz7Kerc8Etc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc36.google.com with SMTP id 006d021491bc7-5de8ca99d15so3923394eaf.0 for ; Wed, 04 Sep 2024 06:27:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456445; x=1726061245; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uzb9243F9Occ2dJ/zYB9Tl/HaddiuF7kjVTQ0SiZsJI=; b=r2S9AB56guZx8PKbg0VoeM3ostBuK9pvpo0BJciKmqxutOsR+ZinHMcRdRwS/hZtsn pKm/OUKW3n6dU1709sYhgDHtETreO0k12BTUnQXjO1irg2Lv703S6DXYZg7e/RgMuByP JzFxBibDYrLklVpyQPKwhTHrLwGb6YJ1ymCGqjk+OLseQkWq//hewyZ/ns6+ah4WqYS3 9STdUsqNzuRx43ral7EAzMCLnkeC1Y5tUiSKQTc1JTEcTH154vYhn/y+OpRg0bLYdmY2 XztMCFvxkiHky+Cdlo9Neq8VIBCt3v0TsyShczg3SFZkDokqdFjMrtDzfMWHFSYycnki lIAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456445; x=1726061245; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uzb9243F9Occ2dJ/zYB9Tl/HaddiuF7kjVTQ0SiZsJI=; b=FsWcPdOHhGiZ6cgIj6yceL6ISASpBuq2J5zVrNE4v26/MFKHbgMmGEJynxy58Qz2T5 Ex61Q2RlJmzIKLLELoTY30PZgVOda5hoJcNW28aRZoox+tRqIfeA6zH5Jnd04w5hD3Lf SFId/wXqLIpkUeuKhxWdhVnDDp1djeOopBfo8i+A19dKOLshk+Ftco9GhGsYmMAjvmZh JSOktesOu+3JXXAC3+ABHy0RC8IO9IFKyGpb1hKtEDhPttDqyu2qHzjtV4UzTQt9csYa MPnAV+M5H8/C7waObBHefyViXbZihvYOz+teyZaoE6+sMvTFSntaSovDxlV3nZOnDGlL lXgw== X-Gm-Message-State: AOJu0YzylMvnMQ6IuM6t8n+2Fr2R6NnYKWHDtyLvBMpbKb2HFEpagxel W6Kb8qDM8AcFK98xNLUk7ufh6mCMDy3Ebc1+FIKtYb8438zPfKPKmWVAPr9gddqydbOG53PNXOp alp9heA== X-Google-Smtp-Source: AGHT+IG01LKRayuMPTiUOI4B/cp/9JfHKCa3yZoy6yLFo1HmDNRmQQLqcOa0Eig+Q4eYLTFWkEdwNw== X-Received: by 2002:a05:6820:545:b0:5c6:8eb6:91b2 with SMTP id 006d021491bc7-5dfacdde2c6mr20856094eaf.1.1725456444609; Wed, 04 Sep 2024 06:27:24 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:23 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 05/36] arm: [MVE intrinsics] add vcvt shape Date: Wed, 4 Sep 2024 13:26:19 +0000 Message-Id: <20240904132650.2720446-6-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the vcvt shape description. It needs to add a new type_suffix_info parameter to explicit_type_suffix_p (), because vcvt uses overloads for type suffixes for integer to floating-point conversions, but not for floating-point to integer. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (nonoverloaded_base::explicit_type_suffix_p): Add unused type_suffix_info parameter. (overloaded_base::explicit_type_suffix_p): Likewise. (unary_n_def::explicit_type_suffix_p): Likewise. (vcvt): New. * config/arm/arm-mve-builtins-shapes.h (vcvt): New. * config/arm/arm-mve-builtins.cc (function_builder::get_name): Add new type_suffix parameter. (function_builder::add_overloaded_functions): Likewise. * config/arm/arm-mve-builtins.h (function_shape::explicit_type_suffix_p): Likewise. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 108 +++++++++++++++++++++- gcc/config/arm/arm-mve-builtins-shapes.h | 1 + gcc/config/arm/arm-mve-builtins.cc | 9 +- gcc/config/arm/arm-mve-builtins.h | 10 +- 4 files changed, 119 insertions(+), 9 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index 0520a8331db..bc99a6a7c43 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -330,7 +330,8 @@ build_16_32 (function_builder &b, const char *signature, struct nonoverloaded_base : public function_shape { bool - explicit_type_suffix_p (unsigned int, enum predication_index, enum mode_suffix_index) const override + explicit_type_suffix_p (unsigned int, enum predication_index, + enum mode_suffix_index, type_suffix_info) const override { return true; } @@ -360,7 +361,8 @@ template struct overloaded_base : public function_shape { bool - explicit_type_suffix_p (unsigned int i, enum predication_index, enum mode_suffix_index) const override + explicit_type_suffix_p (unsigned int i, enum predication_index, + enum mode_suffix_index, type_suffix_info) const override { return (EXPLICIT_MASK >> i) & 1; } @@ -1856,7 +1858,7 @@ struct unary_n_def : public overloaded_base<0> { bool explicit_type_suffix_p (unsigned int, enum predication_index pred, - enum mode_suffix_index) const override + enum mode_suffix_index, type_suffix_info) const override { return pred != PRED_m; } @@ -1979,6 +1981,106 @@ struct unary_widen_acc_def : public overloaded_base<0> }; SHAPE (unary_widen_acc) +/* _t foo_t0[_t1](_t) + _t foo_t0_n[_t1](_t, const int) + + Example: vcvtq. + float32x4_t [__arm_]vcvtq[_f32_s32](int32x4_t a) + float32x4_t [__arm_]vcvtq_m[_f32_s32](float32x4_t inactive, int32x4_t a, mve_pred16_t p) + float32x4_t [__arm_]vcvtq_x[_f32_s32](int32x4_t a, mve_pred16_t p) + float32x4_t [__arm_]vcvtq_n[_f32_s32](int32x4_t a, const int imm6) + float32x4_t [__arm_]vcvtq_m_n[_f32_s32](float32x4_t inactive, int32x4_t a, const int imm6, mve_pred16_t p) + float32x4_t [__arm_]vcvtq_x_n[_f32_s32](int32x4_t a, const int imm6, mve_pred16_t p) + int32x4_t [__arm_]vcvtq_s32_f32(float32x4_t a) + int32x4_t [__arm_]vcvtq_m[_s32_f32](int32x4_t inactive, float32x4_t a, mve_pred16_t p) + int32x4_t [__arm_]vcvtq_x_s32_f32(float32x4_t a, mve_pred16_t p) + int32x4_t [__arm_]vcvtq_n_s32_f32(float32x4_t a, const int imm6) + int32x4_t [__arm_]vcvtq_m_n[_s32_f32](int32x4_t inactive, float32x4_t a, const int imm6, mve_pred16_t p) + int32x4_t [__arm_]vcvtq_x_n_s32_f32(float32x4_t a, const int imm6, mve_pred16_t p) */ +struct vcvt_def : public overloaded_base<0> +{ + bool + explicit_type_suffix_p (unsigned int i, enum predication_index pred, + enum mode_suffix_index, + type_suffix_info type_info) const override + { + if (pred != PRED_m + && ((i == 0 && type_info.integer_p) + || (i == 1 && type_info.float_p))) + return true; + return false; + } + + bool + explicit_mode_suffix_p (enum predication_index, + enum mode_suffix_index) const override + { + return true; + } + + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + b.add_overloaded_functions (group, MODE_n, preserve_user_namespace); + build_all (b, "v0,v1", group, MODE_none, preserve_user_namespace); + build_all (b, "v0,v1,su64", group, MODE_n, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index from_type; + tree res; + unsigned int nimm = (r.mode_suffix_id == MODE_none) ? 0 : 1; + + if (!r.check_gp_argument (1 + nimm, i, nargs) + || (from_type + = r.infer_vector_type (i - nimm)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + if (nimm > 0 + && !r.require_integer_immediate (i)) + return error_mark_node; + + type_suffix_index to_type; + + if (type_suffixes[from_type].integer_p) + { + to_type = find_type_suffix (TYPE_float, + type_suffixes[from_type].element_bits); + } + else + { + /* This should not happen: when 'from_type' is float, the type + suffixes are not overloaded (except for "m" predication, + handled above). */ + gcc_assert (r.pred == PRED_m); + + /* Get the return type from the 'inactive' argument. */ + to_type = r.infer_vector_type (0); + } + + if ((res = r.lookup_form (r.mode_suffix_id, to_type, from_type))) + return res; + + return r.report_no_such_form (from_type); + } + + bool + check (function_checker &c) const override + { + if (c.mode_suffix_id == MODE_none) + return true; + + unsigned int bits = c.type_suffix (0).element_bits; + return c.require_immediate_range (1, 1, bits); + } +}; +SHAPE (vcvt) + /* _t vfoo[_t0](_t, _t, mve_pred16_t) i.e. a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 61aa4fa73b3..9a112ceeb29 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -77,6 +77,7 @@ namespace arm_mve extern const function_shape *const unary_n; extern const function_shape *const unary_widen; extern const function_shape *const unary_widen_acc; + extern const function_shape *const vcvt; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */ diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 7e8217666fe..ea44f463dd8 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -823,7 +823,8 @@ function_builder::get_name (const function_instance &instance, for (unsigned int i = 0; i < 2; ++i) if (!overloaded_p || instance.shape->explicit_type_suffix_p (i, instance.pred, - instance.mode_suffix_id)) + instance.mode_suffix_id, + instance.type_suffix (i))) append_name (instance.type_suffix (i).string); return finish_name (); } @@ -1001,9 +1002,11 @@ function_builder::add_overloaded_functions (const function_group_info &group, for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) { unsigned int explicit_type0 - = (*group.shape)->explicit_type_suffix_p (0, group.preds[pi], mode); + = (*group.shape)->explicit_type_suffix_p (0, group.preds[pi], mode, + type_suffixes[NUM_TYPE_SUFFIXES]); unsigned int explicit_type1 - = (*group.shape)->explicit_type_suffix_p (1, group.preds[pi], mode); + = (*group.shape)->explicit_type_suffix_p (1, group.preds[pi], mode, + type_suffixes[NUM_TYPE_SUFFIXES]); if ((*group.shape)->skip_overload_p (group.preds[pi], mode)) continue; diff --git a/gcc/config/arm/arm-mve-builtins.h b/gcc/config/arm/arm-mve-builtins.h index f282236a843..3306736bff0 100644 --- a/gcc/config/arm/arm-mve-builtins.h +++ b/gcc/config/arm/arm-mve-builtins.h @@ -571,9 +571,13 @@ public: class function_shape { public: - virtual bool explicit_type_suffix_p (unsigned int, enum predication_index, enum mode_suffix_index) const = 0; - virtual bool explicit_mode_suffix_p (enum predication_index, enum mode_suffix_index) const = 0; - virtual bool skip_overload_p (enum predication_index, enum mode_suffix_index) const = 0; + virtual bool explicit_type_suffix_p (unsigned int, enum predication_index, + enum mode_suffix_index, + type_suffix_info) const = 0; + virtual bool explicit_mode_suffix_p (enum predication_index, + enum mode_suffix_index) const = 0; + virtual bool skip_overload_p (enum predication_index, + enum mode_suffix_index) const = 0; /* Define all functions associated with the given group. */ virtual void build (function_builder &, From patchwork Wed Sep 4 13:26:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980801 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=M/qvIXNc; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNfH6jzQz1yg3 for ; Wed, 4 Sep 2024 23:30:31 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 85F493865C2C for ; Wed, 4 Sep 2024 13:30:29 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by sourceware.org (Postfix) with ESMTPS id 626BB385E82A for ; Wed, 4 Sep 2024 13:27:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 626BB385E82A Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 626BB385E82A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456456; cv=none; b=PnyPpwuOZ/biKVEP90j/fe6aIYfIf5T75yrs3zoza9HNpzZwb6xqghumXLoPY4391qNHuTO22nYiAbuX/tcR4In3lGkPcIrRUe2yN0RPDmiVN6Nf6AKuVuwrYEYTLa+mz4DBpPF9paUyZfpDp3P/4FeQLK8HzR5RqpK7zwSnI+A= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456456; c=relaxed/simple; bh=92WQpp35zeLdzJuAFI3DYQpXbTJ3yfo0GPe7nx5xYSM=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=ID9xhlCJG2PO8egnAkSNfcI7SNtGgZdrcs3w3Iw6e6HFpd0FDfZeWImG7nOloOy/2OOAXfWaaHJfgRvYHd8PITWmOdCaWQ9IJ1evvRSWLrM9TaKNtbQQNyodgHxsembaKBUFcuHZ5SFU/yc4VHTJB5erOvPh/qQQKyznXGZLl4I= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2c.google.com with SMTP id 006d021491bc7-5df9c34c890so550396eaf.1 for ; Wed, 04 Sep 2024 06:27:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456446; x=1726061246; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=I8Y6v8NlvUrfs0mHzhKbMHRzK4zZZjB5be+ujcq+neQ=; b=M/qvIXNcm6BNhEOmlLRmf+R6h6P03eIYHX+TQ9tTX0Apn+EL/+M40ZaLA0rhTFLaGq Z5SwO81NLh7cEMKrqYvPsqdifThiRA1VHukBPjPLit1NP/jowhJgLYRvqJ7WXQbddICW dbG3w+NTLvgR50bj1Z0uW6N6MkflQdvUN/2nSIaUtvS/w9dRvyTm/nDypd/TskzK9pjy 9aelS1QqY7oKSq0UoYlrMowmXTFfcdVobv78qSgaq67vbntquZjECBe7r8qXq3R3f004 nRrjpmWaY7AEkahRd9SsErMK1OOBAzpIkfabxHaSM+h+MiX24iGlgGAQ2eETUMSpA1L7 ta4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456446; x=1726061246; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=I8Y6v8NlvUrfs0mHzhKbMHRzK4zZZjB5be+ujcq+neQ=; b=NA7Lgj3czPN4y5hhAXeD6Qe6O64CrlKy/jH7oBagpM5ybSVyVmfMXp4t9Qk75aj1Fv tqg/NBXYsiUc/ECUdR0KmKfuTDjECOfpndERH3yp8+eTJ4L+rNbcc8vIAcsBcgM1bE5t cVghzOc4KS2/ZtWc1lHH2gdQfs+UzvD3g1W8eVdtlAFIaH1yG2LB5ULg1HgumKiqcB9u TNbVykr3OltEotwRRWRbFggU4B9sI8PNMyaD3YWgw69/+xRN3Piaz7pFAwiO69IYYNqY JXzQiXotrfygyNjRBhbrthJDGbo2vQATTOhiubhIq13cyWUt/YkA5txr5ZPt/4FP5Kd7 nADg== X-Gm-Message-State: AOJu0Ywb3hncAPYbbSROAV1in6vZ6y8ejzM429K3LLgG9lUQoiT3KN6k /0hdoRra7kKDvhgwIXKEzW4l2pSAF4poY3cOPKk74q/a20mQ/XiFL0Tw/EMUJCZXzOIDfhSgxCd kPzGYtQ== X-Google-Smtp-Source: AGHT+IFFUGjn49Vq6iDYmit5muCEfEbLZdxPTcAREzRZJ98uHGOTGIIcSz5EsaQHSzYZohU34k5A/Q== X-Received: by 2002:a05:6820:22a9:b0:5d5:b49c:b6f7 with SMTP id 006d021491bc7-5e18c217529mr3247806eaf.7.1725456445738; Wed, 04 Sep 2024 06:27:25 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:25 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 06/36] arm: [MVE intrinsics] rework vcvtq Date: Wed, 4 Sep 2024 13:26:20 +0000 Message-Id: <20240904132650.2720446-7-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vcvtq using the new MVE builtins framework. In config/arm/arm-mve-builtins-base.def, the patch also restores the alphabetical order. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vcvtq_impl): New. (vcvtq): New. * config/arm/arm-mve-builtins-base.def (vcvtq): New. * config/arm/arm-mve-builtins-base.h (vcvtq): New. * config/arm/arm-mve-builtins.cc (cvt): New type. * config/arm/arm_mve.h (vcvtq): Delete. (vcvtq_n): Delete. (vcvtq_m): Delete. (vcvtq_m_n): Delete. (vcvtq_x): Delete. (vcvtq_x_n): Delete. (vcvtq_f16_s16): Delete. (vcvtq_f32_s32): Delete. (vcvtq_f16_u16): Delete. (vcvtq_f32_u32): Delete. (vcvtq_s16_f16): Delete. (vcvtq_s32_f32): Delete. (vcvtq_u16_f16): Delete. (vcvtq_u32_f32): Delete. (vcvtq_n_f16_s16): Delete. (vcvtq_n_f32_s32): Delete. (vcvtq_n_f16_u16): Delete. (vcvtq_n_f32_u32): Delete. (vcvtq_n_s16_f16): Delete. (vcvtq_n_s32_f32): Delete. (vcvtq_n_u16_f16): Delete. (vcvtq_n_u32_f32): Delete. (vcvtq_m_f16_s16): Delete. (vcvtq_m_f16_u16): Delete. (vcvtq_m_f32_s32): Delete. (vcvtq_m_f32_u32): Delete. (vcvtq_m_s16_f16): Delete. (vcvtq_m_u16_f16): Delete. (vcvtq_m_s32_f32): Delete. (vcvtq_m_u32_f32): Delete. (vcvtq_m_n_f16_u16): Delete. (vcvtq_m_n_f16_s16): Delete. (vcvtq_m_n_f32_u32): Delete. (vcvtq_m_n_f32_s32): Delete. (vcvtq_m_n_s32_f32): Delete. (vcvtq_m_n_s16_f16): Delete. (vcvtq_m_n_u32_f32): Delete. (vcvtq_m_n_u16_f16): Delete. (vcvtq_x_f16_u16): Delete. (vcvtq_x_f16_s16): Delete. (vcvtq_x_f32_s32): Delete. (vcvtq_x_f32_u32): Delete. (vcvtq_x_n_f16_s16): Delete. (vcvtq_x_n_f16_u16): Delete. (vcvtq_x_n_f32_s32): Delete. (vcvtq_x_n_f32_u32): Delete. (vcvtq_x_s16_f16): Delete. (vcvtq_x_s32_f32): Delete. (vcvtq_x_u16_f16): Delete. (vcvtq_x_u32_f32): Delete. (vcvtq_x_n_s16_f16): Delete. (vcvtq_x_n_s32_f32): Delete. (vcvtq_x_n_u16_f16): Delete. (vcvtq_x_n_u32_f32): Delete. (__arm_vcvtq_f16_s16): Delete. (__arm_vcvtq_f32_s32): Delete. (__arm_vcvtq_f16_u16): Delete. (__arm_vcvtq_f32_u32): Delete. (__arm_vcvtq_s16_f16): Delete. (__arm_vcvtq_s32_f32): Delete. (__arm_vcvtq_u16_f16): Delete. (__arm_vcvtq_u32_f32): Delete. (__arm_vcvtq_n_f16_s16): Delete. (__arm_vcvtq_n_f32_s32): Delete. (__arm_vcvtq_n_f16_u16): Delete. (__arm_vcvtq_n_f32_u32): Delete. (__arm_vcvtq_n_s16_f16): Delete. (__arm_vcvtq_n_s32_f32): Delete. (__arm_vcvtq_n_u16_f16): Delete. (__arm_vcvtq_n_u32_f32): Delete. (__arm_vcvtq_m_f16_s16): Delete. (__arm_vcvtq_m_f16_u16): Delete. (__arm_vcvtq_m_f32_s32): Delete. (__arm_vcvtq_m_f32_u32): Delete. (__arm_vcvtq_m_s16_f16): Delete. (__arm_vcvtq_m_u16_f16): Delete. (__arm_vcvtq_m_s32_f32): Delete. (__arm_vcvtq_m_u32_f32): Delete. (__arm_vcvtq_m_n_f16_u16): Delete. (__arm_vcvtq_m_n_f16_s16): Delete. (__arm_vcvtq_m_n_f32_u32): Delete. (__arm_vcvtq_m_n_f32_s32): Delete. (__arm_vcvtq_m_n_s32_f32): Delete. (__arm_vcvtq_m_n_s16_f16): Delete. (__arm_vcvtq_m_n_u32_f32): Delete. (__arm_vcvtq_m_n_u16_f16): Delete. (__arm_vcvtq_x_f16_u16): Delete. (__arm_vcvtq_x_f16_s16): Delete. (__arm_vcvtq_x_f32_s32): Delete. (__arm_vcvtq_x_f32_u32): Delete. (__arm_vcvtq_x_n_f16_s16): Delete. (__arm_vcvtq_x_n_f16_u16): Delete. (__arm_vcvtq_x_n_f32_s32): Delete. (__arm_vcvtq_x_n_f32_u32): Delete. (__arm_vcvtq_x_s16_f16): Delete. (__arm_vcvtq_x_s32_f32): Delete. (__arm_vcvtq_x_u16_f16): Delete. (__arm_vcvtq_x_u32_f32): Delete. (__arm_vcvtq_x_n_s16_f16): Delete. (__arm_vcvtq_x_n_s32_f32): Delete. (__arm_vcvtq_x_n_u16_f16): Delete. (__arm_vcvtq_x_n_u32_f32): Delete. (__arm_vcvtq): Delete. (__arm_vcvtq_n): Delete. (__arm_vcvtq_m): Delete. (__arm_vcvtq_m_n): Delete. (__arm_vcvtq_x): Delete. (__arm_vcvtq_x_n): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 113 ++++ gcc/config/arm/arm-mve-builtins-base.def | 19 +- gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm-mve-builtins.cc | 15 + gcc/config/arm/arm_mve.h | 666 ----------------------- 5 files changed, 139 insertions(+), 675 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index e0ae593a6c0..a780d686eb1 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -139,6 +139,118 @@ public: } }; + /* Implements vcvtq intrinsics. */ +class vcvtq_impl : public function_base +{ +public: + rtx + expand (function_expander &e) const override + { + insn_code code; + machine_mode target_mode = e.vector_mode (0); + int unspec; + switch (e.pred) + { + case PRED_none: + switch (e.mode_suffix_id) + { + case MODE_none: + /* No predicate, no suffix. */ + if (e.type_suffix (0).integer_p) + { + unspec = e.type_suffix (0).unsigned_p + ? VCVTQ_FROM_F_U + : VCVTQ_FROM_F_S; + code = code_for_mve_q_from_f (unspec, unspec, target_mode); + } + else + { + unspec = e.type_suffix (1).unsigned_p + ? VCVTQ_TO_F_U + : VCVTQ_TO_F_S; + code = code_for_mve_q_to_f (unspec, unspec, target_mode); + } + break; + + case MODE_n: + /* No predicate, _n suffix. */ + if (e.type_suffix (0).integer_p) + { + unspec = e.type_suffix (0).unsigned_p + ? VCVTQ_N_FROM_F_U + : VCVTQ_N_FROM_F_S; + code = code_for_mve_q_n_from_f (unspec, unspec, target_mode); + } + else + { + unspec = e.type_suffix (1).unsigned_p + ? VCVTQ_N_TO_F_U + : VCVTQ_N_TO_F_S; + code = code_for_mve_q_n_to_f (unspec, unspec, target_mode); + } + break; + + default: + gcc_unreachable (); + } + return e.use_exact_insn (code); + + case PRED_m: + case PRED_x: + switch (e.mode_suffix_id) + { + case MODE_none: + /* No suffix, "m" or "x" predicate. */ + if (e.type_suffix (0).integer_p) + { + unspec = e.type_suffix (0).unsigned_p + ? VCVTQ_M_FROM_F_U + : VCVTQ_M_FROM_F_S; + code = code_for_mve_q_m_from_f (unspec, unspec, target_mode); + } + else + { + unspec = e.type_suffix (1).unsigned_p + ? VCVTQ_M_TO_F_U + : VCVTQ_M_TO_F_S; + code = code_for_mve_q_m_to_f (unspec, unspec, target_mode); + } + break; + + case MODE_n: + /* _n suffix, "m" or "x" predicate. */ + if (e.type_suffix (0).integer_p) + { + unspec = e.type_suffix (0).unsigned_p + ? VCVTQ_M_N_FROM_F_U + : VCVTQ_M_N_FROM_F_S; + code = code_for_mve_q_m_n_from_f (unspec, unspec, target_mode); + } + else + { + unspec = e.type_suffix (1).unsigned_p + ? VCVTQ_M_N_TO_F_U + : VCVTQ_M_N_TO_F_S; + code = code_for_mve_q_m_n_to_f (unspec, unspec, target_mode); + } + break; + + default: + gcc_unreachable (); + } + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); + + default: + gcc_unreachable (); + } + + gcc_unreachable (); + } +}; + } /* end anonymous namespace */ namespace arm_mve { @@ -339,6 +451,7 @@ FUNCTION (vcmpltq, unspec_based_mve_function_exact_insn_vcmp, (LT, UNKNOWN, LT, FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNKNOWN, UNKNOWN, VCMPCSQ_M_U, UNKNOWN, UNKNOWN, VCMPCSQ_M_N_U, UNKNOWN)) FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN)) FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ) +FUNCTION (vcvtq, vcvtq_impl,) FUNCTION_ONLY_N (vdupq, VDUPQ) FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ) FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1, VFMAQ_M_F, -1, -1, VFMAQ_M_N_F)) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 90d031eebec..671f86b5096 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -28,8 +28,8 @@ DEF_MVE_FUNCTION (vaddvaq, unary_int32_acc, all_integer, p_or_none) DEF_MVE_FUNCTION (vaddvq, unary_int32, all_integer, p_or_none) DEF_MVE_FUNCTION (vandq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_integer, mx_or_none) -DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vclsq, unary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vclzq, unary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vcmpcsq, cmp, all_unsigned, m_or_none) @@ -44,8 +44,8 @@ DEF_MVE_FUNCTION (vcreateq, create, all_integer_with_64, none) DEF_MVE_FUNCTION (vdupq, unary_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (veorq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vhaddq, binary_opt_n, all_integer, mx_or_none) -DEF_MVE_FUNCTION (vhcaddq_rot90, binary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vhcaddq_rot270, binary, all_signed, mx_or_none) +DEF_MVE_FUNCTION (vhcaddq_rot90, binary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vhsubq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vld1q, load, all_integer, none) DEF_MVE_FUNCTION (vmaxaq, binary_maxamina, all_signed, m_or_none) @@ -80,8 +80,8 @@ DEF_MVE_FUNCTION (vmovnbq, binary_move_narrow, integer_16_32, m_or_none) DEF_MVE_FUNCTION (vmovntq, binary_move_narrow, integer_16_32, m_or_none) DEF_MVE_FUNCTION (vmulhq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vmullbq_int, binary_widen, all_integer, mx_or_none) -DEF_MVE_FUNCTION (vmulltq_int, binary_widen, all_integer, mx_or_none) DEF_MVE_FUNCTION (vmullbq_poly, binary_widen_poly, poly_8_16, mx_or_none) +DEF_MVE_FUNCTION (vmulltq_int, binary_widen, all_integer, mx_or_none) DEF_MVE_FUNCTION (vmulltq_poly, binary_widen_poly, poly_8_16, mx_or_none) DEF_MVE_FUNCTION (vmulq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vmvnq, mvn, all_integer, mx_or_none) @@ -162,23 +162,24 @@ DEF_MVE_FUNCTION (vabsq, unary, all_float, mx_or_none) DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_float, mx_or_none) DEF_MVE_FUNCTION (vandq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_float, mx_or_none) -DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmlaq, ternary, all_float, m_or_none) -DEF_MVE_FUNCTION (vcmlaq_rot90, ternary, all_float, m_or_none) DEF_MVE_FUNCTION (vcmlaq_rot180, ternary, all_float, m_or_none) DEF_MVE_FUNCTION (vcmlaq_rot270, ternary, all_float, m_or_none) -DEF_MVE_FUNCTION (vcmulq, binary, all_float, mx_or_none) -DEF_MVE_FUNCTION (vcmulq_rot90, binary, all_float, mx_or_none) -DEF_MVE_FUNCTION (vcmulq_rot180, binary, all_float, mx_or_none) -DEF_MVE_FUNCTION (vcmulq_rot270, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmlaq_rot90, ternary, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpeqq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpgeq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpgtq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpleq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpltq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpneq, cmp, all_float, m_or_none) +DEF_MVE_FUNCTION (vcmulq, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmulq_rot180, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmulq_rot270, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmulq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcreateq, create, all_float, none) +DEF_MVE_FUNCTION (vcvtq, vcvt, cvt, mx_or_none) DEF_MVE_FUNCTION (vdupq, unary_n, all_float, mx_or_none) DEF_MVE_FUNCTION (veorq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vfmaq, ternary_opt_n, all_float, m_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index c9b52a81c5e..dee73d9c457 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -54,6 +54,7 @@ extern const function_base *const vcmulq_rot180; extern const function_base *const vcmulq_rot270; extern const function_base *const vcmulq_rot90; extern const function_base *const vcreateq; +extern const function_base *const vcvtq; extern const function_base *const vdupq; extern const function_base *const veorq; extern const function_base *const vfmaq; diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index ea44f463dd8..3c5b54dade1 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -205,6 +205,20 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { #define TYPES_signed_32(S, D) \ S (s32) +/* All the type combinations allowed by vcvtq. */ +#define TYPES_cvt(S, D) \ + D (f16, s16), \ + D (f16, u16), \ + \ + D (f32, s32), \ + D (f32, u32), \ + \ + D (s16, f16), \ + D (s32, f32), \ + \ + D (u16, f16), \ + D (u32, f32) + #define TYPES_reinterpret_signed1(D, A) \ D (A, s8), D (A, s16), D (A, s32), D (A, s64) @@ -284,6 +298,7 @@ DEF_MVE_TYPES_ARRAY (integer_32); DEF_MVE_TYPES_ARRAY (poly_8_16); DEF_MVE_TYPES_ARRAY (signed_16_32); DEF_MVE_TYPES_ARRAY (signed_32); +DEF_MVE_TYPES_ARRAY (cvt); DEF_MVE_TYPES_ARRAY (reinterpret_integer); DEF_MVE_TYPES_ARRAY (reinterpret_float); diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index ae1b5438797..07897f510f5 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -139,18 +139,12 @@ #define vshlcq_m(__a, __b, __imm, __p) __arm_vshlcq_m(__a, __b, __imm, __p) #define vcvttq_f32(__a) __arm_vcvttq_f32(__a) #define vcvtbq_f32(__a) __arm_vcvtbq_f32(__a) -#define vcvtq(__a) __arm_vcvtq(__a) -#define vcvtq_n(__a, __imm6) __arm_vcvtq_n(__a, __imm6) #define vcvtaq_m(__inactive, __a, __p) __arm_vcvtaq_m(__inactive, __a, __p) -#define vcvtq_m(__inactive, __a, __p) __arm_vcvtq_m(__inactive, __a, __p) #define vcvtbq_m(__a, __b, __p) __arm_vcvtbq_m(__a, __b, __p) #define vcvttq_m(__a, __b, __p) __arm_vcvttq_m(__a, __b, __p) #define vcvtmq_m(__inactive, __a, __p) __arm_vcvtmq_m(__inactive, __a, __p) #define vcvtnq_m(__inactive, __a, __p) __arm_vcvtnq_m(__inactive, __a, __p) #define vcvtpq_m(__inactive, __a, __p) __arm_vcvtpq_m(__inactive, __a, __p) -#define vcvtq_m_n(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n(__inactive, __a, __imm6, __p) -#define vcvtq_x(__a, __p) __arm_vcvtq_x(__a, __p) -#define vcvtq_x_n(__a, __imm6, __p) __arm_vcvtq_x_n(__a, __imm6, __p) #define vst4q_s8( __addr, __value) __arm_vst4q_s8( __addr, __value) @@ -163,10 +157,6 @@ #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value) #define vcvttq_f32_f16(__a) __arm_vcvttq_f32_f16(__a) #define vcvtbq_f32_f16(__a) __arm_vcvtbq_f32_f16(__a) -#define vcvtq_f16_s16(__a) __arm_vcvtq_f16_s16(__a) -#define vcvtq_f32_s32(__a) __arm_vcvtq_f32_s32(__a) -#define vcvtq_f16_u16(__a) __arm_vcvtq_f16_u16(__a) -#define vcvtq_f32_u32(__a) __arm_vcvtq_f32_u32(__a) #define vcvtaq_s16_f16(__a) __arm_vcvtaq_s16_f16(__a) #define vcvtaq_s32_f32(__a) __arm_vcvtaq_s32_f32(__a) #define vcvtnq_s16_f16(__a) __arm_vcvtnq_s16_f16(__a) @@ -175,10 +165,6 @@ #define vcvtpq_s32_f32(__a) __arm_vcvtpq_s32_f32(__a) #define vcvtmq_s16_f16(__a) __arm_vcvtmq_s16_f16(__a) #define vcvtmq_s32_f32(__a) __arm_vcvtmq_s32_f32(__a) -#define vcvtq_s16_f16(__a) __arm_vcvtq_s16_f16(__a) -#define vcvtq_s32_f32(__a) __arm_vcvtq_s32_f32(__a) -#define vcvtq_u16_f16(__a) __arm_vcvtq_u16_f16(__a) -#define vcvtq_u32_f32(__a) __arm_vcvtq_u32_f32(__a) #define vcvtpq_u16_f16(__a) __arm_vcvtpq_u16_f16(__a) #define vcvtpq_u32_f32(__a) __arm_vcvtpq_u32_f32(__a) #define vcvtnq_u16_f16(__a) __arm_vcvtnq_u16_f16(__a) @@ -192,14 +178,6 @@ #define vctp64q(__a) __arm_vctp64q(__a) #define vctp8q(__a) __arm_vctp8q(__a) #define vpnot(__a) __arm_vpnot(__a) -#define vcvtq_n_f16_s16(__a, __imm6) __arm_vcvtq_n_f16_s16(__a, __imm6) -#define vcvtq_n_f32_s32(__a, __imm6) __arm_vcvtq_n_f32_s32(__a, __imm6) -#define vcvtq_n_f16_u16(__a, __imm6) __arm_vcvtq_n_f16_u16(__a, __imm6) -#define vcvtq_n_f32_u32(__a, __imm6) __arm_vcvtq_n_f32_u32(__a, __imm6) -#define vcvtq_n_s16_f16(__a, __imm6) __arm_vcvtq_n_s16_f16(__a, __imm6) -#define vcvtq_n_s32_f32(__a, __imm6) __arm_vcvtq_n_s32_f32(__a, __imm6) -#define vcvtq_n_u16_f16(__a, __imm6) __arm_vcvtq_n_u16_f16(__a, __imm6) -#define vcvtq_n_u32_f32(__a, __imm6) __arm_vcvtq_n_u32_f32(__a, __imm6) #define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b) #define vbicq_u8(__a, __b) __arm_vbicq_u8(__a, __b) #define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b) @@ -234,10 +212,6 @@ #define vcvtaq_m_u16_f16(__inactive, __a, __p) __arm_vcvtaq_m_u16_f16(__inactive, __a, __p) #define vcvtaq_m_s32_f32(__inactive, __a, __p) __arm_vcvtaq_m_s32_f32(__inactive, __a, __p) #define vcvtaq_m_u32_f32(__inactive, __a, __p) __arm_vcvtaq_m_u32_f32(__inactive, __a, __p) -#define vcvtq_m_f16_s16(__inactive, __a, __p) __arm_vcvtq_m_f16_s16(__inactive, __a, __p) -#define vcvtq_m_f16_u16(__inactive, __a, __p) __arm_vcvtq_m_f16_u16(__inactive, __a, __p) -#define vcvtq_m_f32_s32(__inactive, __a, __p) __arm_vcvtq_m_f32_s32(__inactive, __a, __p) -#define vcvtq_m_f32_u32(__inactive, __a, __p) __arm_vcvtq_m_f32_u32(__inactive, __a, __p) #define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm) #define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm) #define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm) @@ -251,23 +225,15 @@ #define vcvtmq_m_s16_f16(__inactive, __a, __p) __arm_vcvtmq_m_s16_f16(__inactive, __a, __p) #define vcvtnq_m_s16_f16(__inactive, __a, __p) __arm_vcvtnq_m_s16_f16(__inactive, __a, __p) #define vcvtpq_m_s16_f16(__inactive, __a, __p) __arm_vcvtpq_m_s16_f16(__inactive, __a, __p) -#define vcvtq_m_s16_f16(__inactive, __a, __p) __arm_vcvtq_m_s16_f16(__inactive, __a, __p) #define vcvtmq_m_u16_f16(__inactive, __a, __p) __arm_vcvtmq_m_u16_f16(__inactive, __a, __p) #define vcvtnq_m_u16_f16(__inactive, __a, __p) __arm_vcvtnq_m_u16_f16(__inactive, __a, __p) #define vcvtpq_m_u16_f16(__inactive, __a, __p) __arm_vcvtpq_m_u16_f16(__inactive, __a, __p) -#define vcvtq_m_u16_f16(__inactive, __a, __p) __arm_vcvtq_m_u16_f16(__inactive, __a, __p) #define vcvtmq_m_s32_f32(__inactive, __a, __p) __arm_vcvtmq_m_s32_f32(__inactive, __a, __p) #define vcvtnq_m_s32_f32(__inactive, __a, __p) __arm_vcvtnq_m_s32_f32(__inactive, __a, __p) #define vcvtpq_m_s32_f32(__inactive, __a, __p) __arm_vcvtpq_m_s32_f32(__inactive, __a, __p) -#define vcvtq_m_s32_f32(__inactive, __a, __p) __arm_vcvtq_m_s32_f32(__inactive, __a, __p) #define vcvtmq_m_u32_f32(__inactive, __a, __p) __arm_vcvtmq_m_u32_f32(__inactive, __a, __p) #define vcvtnq_m_u32_f32(__inactive, __a, __p) __arm_vcvtnq_m_u32_f32(__inactive, __a, __p) #define vcvtpq_m_u32_f32(__inactive, __a, __p) __arm_vcvtpq_m_u32_f32(__inactive, __a, __p) -#define vcvtq_m_u32_f32(__inactive, __a, __p) __arm_vcvtq_m_u32_f32(__inactive, __a, __p) -#define vcvtq_m_n_f16_u16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f16_u16(__inactive, __a, __imm6, __p) -#define vcvtq_m_n_f16_s16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f16_s16(__inactive, __a, __imm6, __p) -#define vcvtq_m_n_f32_u32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f32_u32(__inactive, __a, __imm6, __p) -#define vcvtq_m_n_f32_s32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f32_s32(__inactive, __a, __imm6, __p) #define vbicq_m_s8(__inactive, __a, __b, __p) __arm_vbicq_m_s8(__inactive, __a, __b, __p) #define vbicq_m_s32(__inactive, __a, __b, __p) __arm_vbicq_m_s32(__inactive, __a, __b, __p) #define vbicq_m_s16(__inactive, __a, __b, __p) __arm_vbicq_m_s16(__inactive, __a, __b, __p) @@ -282,10 +248,6 @@ #define vornq_m_u16(__inactive, __a, __b, __p) __arm_vornq_m_u16(__inactive, __a, __b, __p) #define vbicq_m_f32(__inactive, __a, __b, __p) __arm_vbicq_m_f32(__inactive, __a, __b, __p) #define vbicq_m_f16(__inactive, __a, __b, __p) __arm_vbicq_m_f16(__inactive, __a, __b, __p) -#define vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p) -#define vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p) -#define vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p) -#define vcvtq_m_n_u16_f16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_u16_f16(__inactive, __a, __imm6, __p) #define vornq_m_f32(__inactive, __a, __b, __p) __arm_vornq_m_f32(__inactive, __a, __b, __p) #define vornq_m_f16(__inactive, __a, __b, __p) __arm_vornq_m_f16(__inactive, __a, __b, __p) #define vstrbq_s8( __addr, __value) __arm_vstrbq_s8( __addr, __value) @@ -600,22 +562,6 @@ #define vcvtmq_x_u32_f32(__a, __p) __arm_vcvtmq_x_u32_f32(__a, __p) #define vcvtbq_x_f32_f16(__a, __p) __arm_vcvtbq_x_f32_f16(__a, __p) #define vcvttq_x_f32_f16(__a, __p) __arm_vcvttq_x_f32_f16(__a, __p) -#define vcvtq_x_f16_u16(__a, __p) __arm_vcvtq_x_f16_u16(__a, __p) -#define vcvtq_x_f16_s16(__a, __p) __arm_vcvtq_x_f16_s16(__a, __p) -#define vcvtq_x_f32_s32(__a, __p) __arm_vcvtq_x_f32_s32(__a, __p) -#define vcvtq_x_f32_u32(__a, __p) __arm_vcvtq_x_f32_u32(__a, __p) -#define vcvtq_x_n_f16_s16(__a, __imm6, __p) __arm_vcvtq_x_n_f16_s16(__a, __imm6, __p) -#define vcvtq_x_n_f16_u16(__a, __imm6, __p) __arm_vcvtq_x_n_f16_u16(__a, __imm6, __p) -#define vcvtq_x_n_f32_s32(__a, __imm6, __p) __arm_vcvtq_x_n_f32_s32(__a, __imm6, __p) -#define vcvtq_x_n_f32_u32(__a, __imm6, __p) __arm_vcvtq_x_n_f32_u32(__a, __imm6, __p) -#define vcvtq_x_s16_f16(__a, __p) __arm_vcvtq_x_s16_f16(__a, __p) -#define vcvtq_x_s32_f32(__a, __p) __arm_vcvtq_x_s32_f32(__a, __p) -#define vcvtq_x_u16_f16(__a, __p) __arm_vcvtq_x_u16_f16(__a, __p) -#define vcvtq_x_u32_f32(__a, __p) __arm_vcvtq_x_u32_f32(__a, __p) -#define vcvtq_x_n_s16_f16(__a, __imm6, __p) __arm_vcvtq_x_n_s16_f16(__a, __imm6, __p) -#define vcvtq_x_n_s32_f32(__a, __imm6, __p) __arm_vcvtq_x_n_s32_f32(__a, __imm6, __p) -#define vcvtq_x_n_u16_f16(__a, __imm6, __p) __arm_vcvtq_x_n_u16_f16(__a, __imm6, __p) -#define vcvtq_x_n_u32_f32(__a, __imm6, __p) __arm_vcvtq_x_n_u32_f32(__a, __imm6, __p) #define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p) #define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p) #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) @@ -3772,62 +3718,6 @@ __arm_vcvtbq_f32_f16 (float16x8_t __a) return __builtin_mve_vcvtbq_f32_f16v4sf (__a); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_f16_s16 (int16x8_t __a) -{ - return __builtin_mve_vcvtq_to_f_sv8hf (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_f32_s32 (int32x4_t __a) -{ - return __builtin_mve_vcvtq_to_f_sv4sf (__a); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_f16_u16 (uint16x8_t __a) -{ - return __builtin_mve_vcvtq_to_f_uv8hf (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_f32_u32 (uint32x4_t __a) -{ - return __builtin_mve_vcvtq_to_f_uv4sf (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtq_from_f_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtq_from_f_sv4si (__a); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtq_from_f_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtq_from_f_uv4si (__a); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtpq_u16_f16 (float16x8_t __a) @@ -3940,62 +3830,6 @@ __arm_vcvtmq_s32_f32 (float32x4_t __a) return __builtin_mve_vcvtmq_sv4si (__a); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_f16_s16 (int16x8_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_to_f_sv8hf (__a, __imm6); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_f32_s32 (int32x4_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_to_f_sv4sf (__a, __imm6); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_f16_u16 (uint16x8_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_to_f_uv8hf (__a, __imm6); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_f32_u32 (uint32x4_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_to_f_uv4sf (__a, __imm6); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_s16_f16 (float16x8_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_sv8hi (__a, __imm6); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_s32_f32 (float32x4_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_sv4si (__a, __imm6); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_u16_f16 (float16x8_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_uv8hi (__a, __imm6); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_u32_f32 (float32x4_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_uv4si (__a, __imm6); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_f16 (float16x8_t __a, float16x8_t __b) @@ -4066,34 +3900,6 @@ __arm_vcvtaq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p return __builtin_mve_vcvtaq_m_uv4si (__inactive, __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f16_s16 (float16x8_t __inactive, int16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_sv8hf (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f16_u16 (float16x8_t __inactive, uint16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv8hf (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f32_s32 (float32x4_t __inactive, int32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_sv4sf (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f32_u32 (float32x4_t __inactive, uint32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv4sf (__inactive, __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) @@ -4144,13 +3950,6 @@ __arm_vcvtpq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) return __builtin_mve_vcvtpq_m_sv8hi (__inactive, __a, __p); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv8hi (__inactive, __a, __p); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -4172,13 +3971,6 @@ __arm_vcvtpq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p return __builtin_mve_vcvtpq_m_uv8hi (__inactive, __a, __p); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_uv8hi (__inactive, __a, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -4200,13 +3992,6 @@ __arm_vcvtpq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __builtin_mve_vcvtpq_m_sv4si (__inactive, __a, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv4si (__inactive, __a, __p); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -4228,41 +4013,6 @@ __arm_vcvtpq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p return __builtin_mve_vcvtpq_m_uv4si (__inactive, __a, __p); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_uv4si (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f16_u16 (float16x8_t __inactive, uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_uv8hf (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f16_s16 (float16x8_t __inactive, int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv8hf (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f32_u32 (float32x4_t __inactive, uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_uv4sf (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f32_s32 (float32x4_t __inactive, int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv4sf (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -4277,34 +4027,6 @@ __arm_vbicq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve return __builtin_mve_vbicq_m_fv8hf (__inactive, __a, __b, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_s32_f32 (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv4si (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_s16_f16 (int16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv8hi (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_u32_f32 (uint32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_uv4si (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_u16_f16 (uint16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_uv8hi (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -4675,118 +4397,6 @@ __arm_vcvttq_x_f32_f16 (float16x8_t __a, mve_pred16_t __p) return __builtin_mve_vcvttq_m_f32_f16v4sf (__arm_vuninitializedq_f32 (), __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f16_u16 (uint16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv8hf (__arm_vuninitializedq_f16 (), __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f16_s16 (int16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_sv8hf (__arm_vuninitializedq_f16 (), __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f32_s32 (int32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_sv4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f32_u32 (uint32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f16_s16 (int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv8hf (__arm_vuninitializedq_f16 (), __a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f16_u16 (uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_uv8hf (__arm_vuninitializedq_f16 (), __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f32_s32 (int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv4sf (__arm_vuninitializedq_f32 (), __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f32_u32 (uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_uv4sf (__arm_vuninitializedq_f32 (), __a, __imm6, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_s16_f16 (float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv8hi (__arm_vuninitializedq_s16 (), __a, __imm6, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_s32_f32 (float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv4si (__arm_vuninitializedq_s32 (), __a, __imm6, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_u16_f16 (float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_uv8hi (__arm_vuninitializedq_u16 (), __a, __imm6, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_u32_f32 (float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_uv4si (__arm_vuninitializedq_u32 (), __a, __imm6, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -7231,62 +6841,6 @@ __arm_vcvtbq_f32 (float16x8_t __a) return __arm_vcvtbq_f32_f16 (__a); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (int16x8_t __a) -{ - return __arm_vcvtq_f16_s16 (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (int32x4_t __a) -{ - return __arm_vcvtq_f32_s32 (__a); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (uint16x8_t __a) -{ - return __arm_vcvtq_f16_u16 (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (uint32x4_t __a) -{ - return __arm_vcvtq_f32_u32 (__a); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (int16x8_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f16_s16 (__a, __imm6); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (int32x4_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f32_s32 (__a, __imm6); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (uint16x8_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f16_u16 (__a, __imm6); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (uint32x4_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f32_u32 (__a, __imm6); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (float16x8_t __a, float16x8_t __b) @@ -7343,34 +6897,6 @@ __arm_vcvtaq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtaq_m_u32_f32 (__inactive, __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float16x8_t __inactive, int16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f16_s16 (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float16x8_t __inactive, uint16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f16_u16 (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float32x4_t __inactive, int32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f32_s32 (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float32x4_t __inactive, uint32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f32_u32 (__inactive, __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtbq_m (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7420,13 +6946,6 @@ __arm_vcvtpq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_s16_f16 (__inactive, __a, __p); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_s16_f16 (__inactive, __a, __p); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -7448,13 +6967,6 @@ __arm_vcvtpq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_u16_f16 (__inactive, __a, __p); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_u16_f16 (__inactive, __a, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -7476,13 +6988,6 @@ __arm_vcvtpq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_s32_f32 (__inactive, __a, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_s32_f32 (__inactive, __a, __p); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -7504,41 +7009,6 @@ __arm_vcvtpq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_u32_f32 (__inactive, __a, __p); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float16x8_t __inactive, uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f16_u16 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float16x8_t __inactive, int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f16_s16 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float32x4_t __inactive, uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f32_u32 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float32x4_t __inactive, int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f32_s32 (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7553,34 +7023,6 @@ __arm_vbicq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pre return __arm_vbicq_m_f16 (__inactive, __a, __b, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_s32_f32 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (int16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_s16_f16 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (uint32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_u32_f32 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (uint16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_u16_f16 (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7763,62 +7205,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32 __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (uint16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_x_f16_u16 (__a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (int16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_x_f16_s16 (__a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (int32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_x_f32_s32 (__a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (uint32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_x_f32_u32 (__a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f16_s16 (__a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f16_u16 (__a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f32_s32 (__a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f32_u32 (__a, __imm6, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -8276,20 +7662,6 @@ extern void *__ARM_undef; _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_float16x8_t]: __arm_vcvttq_f32_f16 (__ARM_mve_coerce(__p0, float16x8_t)));}) -#define __arm_vcvtq(p0) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_f16_s16 (__ARM_mve_coerce(__p0, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_f32_s32 (__ARM_mve_coerce(__p0, int32x4_t)), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_f16_u16 (__ARM_mve_coerce(__p0, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_f32_u32 (__ARM_mve_coerce(__p0, uint32x4_t)));}) - -#define __arm_vcvtq_n(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_n_f16_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_n_f32_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_n_f16_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_n_f32_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1));}) - #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -8342,30 +7714,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) -#define __arm_vcvtq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcvtq_m_f16_s16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, int16x8_t), p2), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcvtq_m_f32_s32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcvtq_m_f16_u16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), p2), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcvtq_m_f32_u32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtq_m_n(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_n_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_n_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_n_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_n_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcvtq_m_n_f16_s16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, int16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcvtq_m_n_f32_s32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcvtq_m_n_f16_u16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcvtq_m_n_f32_u32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2, p3));}) - #define __arm_vcvtbq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -8730,20 +8078,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vcvtq_x(p1,p2) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_x_f16_s16 (__ARM_mve_coerce(__p1, int16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_x_f32_s32 (__ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_x_f16_u16 (__ARM_mve_coerce(__p1, uint16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_x_f32_u32 (__ARM_mve_coerce(__p1, uint32x4_t), p2));}) - -#define __arm_vcvtq_x_n(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_x_n_f16_s16 (__ARM_mve_coerce(__p1, int16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_x_n_f32_s32 (__ARM_mve_coerce(__p1, int32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_x_n_f16_u16 (__ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_x_n_f32_u32 (__ARM_mve_coerce(__p1, uint32x4_t), p2, p3));}) - #define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ From patchwork Wed Sep 4 13:26:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980795 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=mLvDSRji; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNcF3TlFz1yfv for ; Wed, 4 Sep 2024 23:28:45 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EBFC13858433 for ; Wed, 4 Sep 2024 13:28:42 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc35.google.com (mail-oo1-xc35.google.com [IPv6:2607:f8b0:4864:20::c35]) by sourceware.org (Postfix) with ESMTPS id ED1BD385B508 for ; Wed, 4 Sep 2024 13:27:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ED1BD385B508 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org ED1BD385B508 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c35 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456452; cv=none; b=rrDM8ztM7Q7tsvfTgiWv3/UJkFtvnaVkKldfJRz1+JctEp2PQvdk09Sah7eLPl8urhHQvfq9r5PfeFZnwF7aFYQE3ZNWIXWmh/URVJ9LRNLz3k421varvK+y+ScPFAUTIpluZ8fD2F4h2vAjEUyOSZkOa1RVdcmHG28ILgLYeZc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456452; c=relaxed/simple; bh=0yJIxR0gHgOLt3ensXeXVIiT8FQst1yCf+KfEf/mZho=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=cM4nQ5+iD9WlS1dTuyC3TZXsUyGwhv2wB6BqMjonIayGVKXNtKZ/qfQgrTJYiYeKNAwQzU/7CZNFQ/SVJNhyBrIyzVsivLkWFnShK+PdIjsbPJUkGTLyjUhYsOFxhvOzBo/JibYA0nuhJfZH4KY9Gn+5Q4JT5El+fylRL470bks= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc35.google.com with SMTP id 006d021491bc7-5d5eec95a74so3789456eaf.1 for ; Wed, 04 Sep 2024 06:27:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456447; x=1726061247; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=FV0SAHIQHKh35oUaJrcRLLTJOo00vvijmVRuFov1eBs=; b=mLvDSRjio6tGuGb5dJgpbV+FiMRtBLMzU9IdGpP2ibfLDnK/Mjj+wh9ovQGAOUZs4a Lj5vaz9dsn6UzMtJ/Wo3GoqaZpRS1TWT04uN3AZ0ZEp6lI7eBSWO1g9AF9tvaJGaL0Rh oxUj4ZGB+NHP5j8FX2v3hDBHaNdg2uvzkhBZ5zJ5LBqW92L6T4QvoFddRAu4DeSsJ+or 9fPqLK/Kpp5CqDOYF1bRjpQgsn+29omxa9RiBZoH3hd7hXI0kwU3KDj8x+/6DZiXRj18 h+5bcyKUiW9Q4KZFGhnIvz51IEf6tpVhO406buB84SarZpCOq5UZ4Gnrn6f7kReXH9SH 0zpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456447; x=1726061247; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FV0SAHIQHKh35oUaJrcRLLTJOo00vvijmVRuFov1eBs=; b=mjhFdNThMzLZq6C4Pyzq1H2LFswrmOzHowqL+/2u7YXOop6WbToKddF+M6KpXtvRYZ YWTF2+QzsYDCLQDwg4qb8eZUYhrXO3zSDOqLL+4vhsshQ6Ot7uPaKSDIb6Gu5XOCm2PC xa+rC57EbSs1RKpgBye/mlMiLrR/LUMvyfbSSzrio2rVnNhOR+q/BLX9t36O9MhoOl4B MDIKfXwb4C3lgGqjJ1C7YHQcawew0bwn+pvoJJi0tXqqGzz6OimBCo9HU9oAYhpSYPYh 74TF+EfhrkHUXNSHmhOlHdD/8mWygbe9QUKkkTmthqeXiWIT1JWvPWmf0m8AyYaRmGA2 el1w== X-Gm-Message-State: AOJu0Yw6MsRjaGJfDAHvjLTIGPhayfmqbOMtI0hUK/3aXBKTHBnsvKXb IkagPU4nwuyBiMzTM5w6J2KSPBGLCQqEdaIgnOOSm2xTVo7L2RG0eEXlM4n5HW8UX0uoyWR48Ab aFwtv6w== X-Google-Smtp-Source: AGHT+IHG+cG8BmCuvzXywyxyZDMk2slqVN7ljKHoSwDcNB0Zf/y9tYW/OvSvFk6NrMYg6IQybfAs8w== X-Received: by 2002:a05:6820:2297:b0:5df:8577:e7b with SMTP id 006d021491bc7-5dfacc20388mr19431026eaf.0.1725456446711; Wed, 04 Sep 2024 06:27:26 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:26 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 07/36] arm: [MVE intrinsics] factorize vcvtbq vcvttq Date: Wed, 4 Sep 2024 13:26:21 +0000 Message-Id: <20240904132650.2720446-8-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Factorize vcvtbq, vcvttq so that they use the same parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VCVTBQ_F16_F32, VCVTTQ_F16_F32, VCVTBQ_F32_F16, VCVTTQ_F32_F16, VCVTBQ_M_F16_F32, VCVTTQ_M_F16_F32, VCVTBQ_M_F32_F16, VCVTTQ_M_F32_F16. (VCVTxQ_F16_F32): New iterator. (VCVTxQ_F32_F16): Likewise. (VCVTxQ_M_F16_F32): Likewise. (VCVTxQ_M_F32_F16): Likewise. * config/arm/mve.md (mve_vcvttq_f32_f16v4sf) (mve_vcvtbq_f32_f16v4sf): Merge into ... (@mve_q_f32_f16v4sf): ... this. (mve_vcvtbq_f16_f32v8hf, mve_vcvttq_f16_f32v8hf): Merge into ... (@mve_q_f16_f32v8hf): ... this. (mve_vcvtbq_m_f16_f32v8hf, mve_vcvttq_m_f16_f32v8hf): Merge into ... (@mve_q_m_f16_f32v8hf): ... this. (mve_vcvtbq_m_f32_f16v4sf, mve_vcvttq_m_f32_f16v4sf): Merge into ... (@mve_q_m_f32_f16v4sf): ... this. --- gcc/config/arm/iterators.md | 8 +++ gcc/config/arm/mve.md | 112 +++++++++--------------------------- 2 files changed, 34 insertions(+), 86 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index bf800625fac..b9c39a98ca2 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -964,6 +964,10 @@ (define_int_attr mve_insn [ (VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla") (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul") (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate") + (VCVTBQ_F16_F32 "vcvtb") (VCVTTQ_F16_F32 "vcvtt") + (VCVTBQ_F32_F16 "vcvtb") (VCVTTQ_F32_F16 "vcvtt") + (VCVTBQ_M_F16_F32 "vcvtb") (VCVTTQ_M_F16_F32 "vcvtt") + (VCVTBQ_M_F32_F16 "vcvtb") (VCVTTQ_M_F32_F16 "vcvtt") (VCVTQ_FROM_F_S "vcvt") (VCVTQ_FROM_F_U "vcvt") (VCVTQ_M_FROM_F_S "vcvt") (VCVTQ_M_FROM_F_U "vcvt") (VCVTQ_M_N_FROM_F_S "vcvt") (VCVTQ_M_N_FROM_F_U "vcvt") @@ -2948,6 +2952,10 @@ (define_int_iterator SQRSHRLQ [SQRSHRL_64 SQRSHRL_48]) (define_int_iterator VSHLCQ_M [VSHLCQ_M_S VSHLCQ_M_U]) (define_int_iterator VQSHLUQ_M_N [VQSHLUQ_M_N_S]) (define_int_iterator VQSHLUQ_N [VQSHLUQ_N_S]) +(define_int_iterator VCVTxQ_F16_F32 [VCVTBQ_F16_F32 VCVTTQ_F16_F32]) +(define_int_iterator VCVTxQ_F32_F16 [VCVTBQ_F32_F16 VCVTTQ_F32_F16]) +(define_int_iterator VCVTxQ_M_F16_F32 [VCVTBQ_M_F16_F32 VCVTTQ_M_F16_F32]) +(define_int_iterator VCVTxQ_M_F32_F16 [VCVTBQ_M_F32_F16 VCVTTQ_M_F32_F16]) (define_int_iterator DLSTP [DLSTP8 DLSTP16 DLSTP32 DLSTP64]) (define_int_iterator LETP [LETP8 LETP16 LETP32 diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 95c615c1534..6e2f542cdae 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -217,33 +217,20 @@ (define_insn "@mve_q_f" [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f")) (set_attr "type" "mve_move") ]) -;; -;; [vcvttq_f32_f16]) -;; -(define_insn "mve_vcvttq_f32_f16v4sf" - [ - (set (match_operand:V4SF 0 "s_register_operand" "=w") - (unspec:V4SF [(match_operand:V8HF 1 "s_register_operand" "w")] - VCVTTQ_F32_F16)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtt.f32.f16\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f32_f16v4sf")) - (set_attr "type" "mve_move") -]) ;; -;; [vcvtbq_f32_f16]) +;; [vcvtbq_f32_f16] +;; [vcvttq_f32_f16] ;; -(define_insn "mve_vcvtbq_f32_f16v4sf" +(define_insn "@mve_q_f32_f16v4sf" [ (set (match_operand:V4SF 0 "s_register_operand" "=w") (unspec:V4SF [(match_operand:V8HF 1 "s_register_operand" "w")] - VCVTBQ_F32_F16)) + VCVTxQ_F32_F16)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtb.f32.f16\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f32_f16v4sf")) + ".f32.f16\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f32_f16v4sf")) (set_attr "type" "mve_move") ]) @@ -1342,34 +1329,19 @@ (define_insn "mve_vctpq_m" ]) ;; -;; [vcvtbq_f16_f32]) -;; -(define_insn "mve_vcvtbq_f16_f32v8hf" - [ - (set (match_operand:V8HF 0 "s_register_operand" "=w") - (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "0") - (match_operand:V4SF 2 "s_register_operand" "w")] - VCVTBQ_F16_F32)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtb.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f16_f32v8hf")) - (set_attr "type" "mve_move") -]) - -;; -;; [vcvttq_f16_f32]) +;; [vcvtbq_f16_f32] +;; [vcvttq_f16_f32] ;; -(define_insn "mve_vcvttq_f16_f32v8hf" +(define_insn "@mve_q_f16_f32v8hf" [ (set (match_operand:V8HF 0 "s_register_operand" "=w") (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "0") (match_operand:V4SF 2 "s_register_operand" "w")] - VCVTTQ_F16_F32)) + VCVTxQ_F16_F32)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtt.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f16_f32v8hf")) + ".f16.f32\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f16_f32v8hf")) (set_attr "type" "mve_move") ]) @@ -2237,73 +2209,41 @@ (define_insn "@mve_vcmpq_m_n_f" (set_attr "length""8")]) ;; -;; [vcvtbq_m_f16_f32]) +;; [vcvtbq_m_f16_f32] +;; [vcvttq_m_f16_f32] ;; -(define_insn "mve_vcvtbq_m_f16_f32v8hf" +(define_insn "@mve_q_m_f16_f32v8hf" [ (set (match_operand:V8HF 0 "s_register_operand" "=w") (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "0") (match_operand:V4SF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTBQ_M_F16_F32)) + (match_operand:V4BI 3 "vpr_register_operand" "Up")] + VCVTxQ_M_F16_F32)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtbt.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f16_f32v8hf")) + "vpst\;t.f16.f32\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f16_f32v8hf")) (set_attr "type" "mve_move") (set_attr "length""8")]) ;; -;; [vcvtbq_m_f32_f16]) +;; [vcvtbq_m_f32_f16] +;; [vcvttq_m_f32_f16] ;; -(define_insn "mve_vcvtbq_m_f32_f16v4sf" +(define_insn "@mve_q_m_f32_f16v4sf" [ (set (match_operand:V4SF 0 "s_register_operand" "=w") (unspec:V4SF [(match_operand:V4SF 1 "s_register_operand" "0") (match_operand:V8HF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTBQ_M_F32_F16)) + (match_operand:V8BI 3 "vpr_register_operand" "Up")] + VCVTxQ_M_F32_F16)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtbt.f32.f16\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f32_f16v4sf")) + "vpst\;t.f32.f16\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f32_f16v4sf")) (set_attr "type" "mve_move") (set_attr "length""8")]) -;; -;; [vcvttq_m_f16_f32]) -;; -(define_insn "mve_vcvttq_m_f16_f32v8hf" - [ - (set (match_operand:V8HF 0 "s_register_operand" "=w") - (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "0") - (match_operand:V4SF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTTQ_M_F16_F32)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvttt.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f16_f32v8hf")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcvttq_m_f32_f16]) -;; -(define_insn "mve_vcvttq_m_f32_f16v4sf" - [ - (set (match_operand:V4SF 0 "s_register_operand" "=w") - (unspec:V4SF [(match_operand:V4SF 1 "s_register_operand" "0") - (match_operand:V8HF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTTQ_M_F32_F16)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvttt.f32.f16\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f32_f16v4sf")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - ;; ;; [vdupq_m_n_f]) ;; From patchwork Wed Sep 4 13:26:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980799 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=yKf6LH2F; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNf14KF7z1yg3 for ; Wed, 4 Sep 2024 23:30:17 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 28212384AB48 for ; Wed, 4 Sep 2024 13:30:15 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by sourceware.org (Postfix) with ESMTPS id 7B7B43858433 for ; Wed, 4 Sep 2024 13:27:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7B7B43858433 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7B7B43858433 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456452; cv=none; b=vsvR9NaHRq0+dJcaCjV6hy2vvmqwK3JOMCXiEJll7uCHB8lx+o1hUGJqwPZnlZCbWAdOzWelQUrCwOCaTC2oTV+fMzB5Bi2jf/gcWDSw/zRoj+Q8yyf1iNvmGFJA7PkYiEbPciaHtqOdybwRNM000W4iElHIUqyqMw5Caj0PrK4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456452; c=relaxed/simple; bh=PRJeLXs7rZ8WGY3LGveQdjcyrFDd3HHmyzZKeqSzRg4=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=SSblvR030qNxNonX24pszH7YEvn8L9AouQK0jY6C8Hs857tUcR/gykd9x+uygxm24MUhT0/rdNrKTNUWFxOlAvPrsSsnxom9/tPYaCgRufVNtkl//vUBhHitjg9sAjTyNj9R7ELxMocfy16t3iZJYhXy9nFsCLqhvPZKS4qSzQk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2c.google.com with SMTP id 006d021491bc7-5df9542f3d8so5536737eaf.0 for ; Wed, 04 Sep 2024 06:27:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456447; x=1726061247; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=X7qg9uIESIT7/o4/S4/lkRhSVjJ7rd8lFlBTMGBL+/s=; b=yKf6LH2FHjAQH3m/ZaJy9YbNxfioFJ/w6Rr1Y/QW3jSHKoeRH4GvOrirWquCNB9Dnp pNIENNUp+VqnDun+xTWHjd7b+rrOOd5jBXowCPDZWE4C7nKuO85MAtvZlW8KvVAzHqrK xP+sZGTIXmBFEwYSj4arnb9GN+WfRY1+gMS1YvrjzacyEyCfP4T3OscpJehrWGr69tuR ap1opH6P9LpHPnApMvH21OJw+VTJgw4W8BcXBVHmDLox1xcLbflRnp/RR/4o5CaGNREp 5nCQoBRMlzWt+1s/HkKSS2rwHMJFXyNvj+WiZc8TzkGGnoZaNUksLAMuzhR3+ydH6Fwy BIZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456447; x=1726061247; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=X7qg9uIESIT7/o4/S4/lkRhSVjJ7rd8lFlBTMGBL+/s=; b=ecmh0v6r1cWkhUiTa5o7ztNCUQkfFXl58P7OaxaXq+zbPsYHauV2nZsy7lz54t/fSv sXqCUQ7Dxkq/0DJZAEH8qlFGUm+lCjZEzpgB3ZF1KWwrJC5tKEnl6PhH8nJ22tElZPCB OUg4wavq+e6OkM79NR92Cbjb+9qGOY1nD83cMHWCj8GHiM/WACrr/gA9DR9DR+f2n6Wq d16AoD92WIy66gAbDC4IeoGBUNsApwanMdGFMomWHiBmSinyy+nTkQnrknl/J0gekm72 X1RgeBOEIVoyQz9kE8AsIrEsoTrspfraaRAXHPqMgFhUuiyMjf8H1KZ4zmUhp5ZLd6fN 8MWw== X-Gm-Message-State: AOJu0YygCMA/YEjQd2U+ttdHB8YKJyorPDnCfwXR18qKVsq1K6wUfs+J 36QKCmDbK4kASgZpzlzDcC6QZaUDubMQY26qUhj3N0JAUrKrPWm8px6bfciLgGI9Y8xFwjPIJUO IUiEJYQ== X-Google-Smtp-Source: AGHT+IEXeDsMnbuFTxcgIfYhuXesR3aW94hOQchPp7qrmgGUIINpuAOpcsRXG/Cpgv8n/+xqc20Yjg== X-Received: by 2002:a05:6820:1516:b0:5da:a2fd:5af9 with SMTP id 006d021491bc7-5dfad0b073amr19881216eaf.8.1725456447420; Wed, 04 Sep 2024 06:27:27 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:26 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 08/36] arm: [MVE intrinsics] add vcvt_f16_f32 and vcvt_f32_f16 shapes Date: Wed, 4 Sep 2024 13:26:22 +0000 Message-Id: <20240904132650.2720446-9-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the vcvt_f16_f32 and vcvt_f32_f16 shapes descriptions. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vcvt_f16_f32) (vcvt_f32_f16): New. * config/arm/arm-mve-builtins-shapes.h (vcvt_f16_f32) (vcvt_f32_f16): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 35 +++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 2 ++ 2 files changed, 37 insertions(+) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index bc99a6a7c43..5ebf666d954 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -2081,6 +2081,41 @@ struct vcvt_def : public overloaded_base<0> }; SHAPE (vcvt) +/* float16x8_t foo_f16_f32(float16x8_t, float32x4_t) + + Example: vcvttq_f16_f32. + float16x8_t [__arm_]vcvttq_f16_f32(float16x8_t a, float32x4_t b) + float16x8_t [__arm_]vcvttq_m_f16_f32(float16x8_t a, float32x4_t b, mve_pred16_t p) +*/ +struct vcvt_f16_f32_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + build_all (b, "v0,v0,v1", group, MODE_none, preserve_user_namespace); + } +}; +SHAPE (vcvt_f16_f32) + +/* float32x4_t foo_f32_f16(float16x8_t) + + Example: vcvttq_f32_f16. + float32x4_t [__arm_]vcvttq_f32_f16(float16x8_t a) + float32x4_t [__arm_]vcvttq_m_f32_f16(float32x4_t inactive, float16x8_t a, mve_pred16_t p) + float32x4_t [__arm_]vcvttq_x_f32_f16(float16x8_t a, mve_pred16_t p) +*/ +struct vcvt_f32_f16_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + build_all (b, "v0,v1", group, MODE_none, preserve_user_namespace); + } +}; +SHAPE (vcvt_f32_f16) + /* _t vfoo[_t0](_t, _t, mve_pred16_t) i.e. a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 9a112ceeb29..50157b57571 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -78,6 +78,8 @@ namespace arm_mve extern const function_shape *const unary_widen; extern const function_shape *const unary_widen_acc; extern const function_shape *const vcvt; + extern const function_shape *const vcvt_f16_f32; + extern const function_shape *const vcvt_f32_f16; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */ From patchwork Wed Sep 4 13:26:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980796 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ah9UEpb7; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNcK72V3z1yfv for ; Wed, 4 Sep 2024 23:28:49 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C6F1E386549E for ; Wed, 4 Sep 2024 13:28:47 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by sourceware.org (Postfix) with ESMTPS id EECC6385EC2C for ; Wed, 4 Sep 2024 13:27:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EECC6385EC2C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EECC6385EC2C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456454; cv=none; b=euQe46KUa8c+dOIlI8U2FxGqbgKwAKpUTWugcKKuJtzhy5vwYvk/iBLrxZkgUwYNvN60Ix2YBxcVKlNA76Et6Db8H34xeSXfgHXjM4JHJwHZKfianr7kOZiYM7S0pvyIRuj2JlMkvRKyoAneOBLBOnwXxfGS/LIfgoV/MxVeRUs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456454; c=relaxed/simple; bh=wWXrVuxdOcqLtSaN7qMbUOVjOxZ3QBQ429ViUM5KT7g=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=EZUXUegEn1UQ67vN0sxo/M+h7oO+pS01kiTtJdcu82oDr+mE6SkLyvZXXAoGUK9aTN7nMoM1dOCgcNg52tXOelIuEcuOZhmaUfptn047s2zm1XCm0sSbjOYiFQXkScf2MALkPJ9otgz8xcdjkfIe7dxFXSWF6G/I6BCvfKSmTlM= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2c.google.com with SMTP id 006d021491bc7-5de8ca99d15so3923460eaf.0 for ; Wed, 04 Sep 2024 06:27:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456449; x=1726061249; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Qhp9ZtW1ilKcnptzUHG7V937VLdAq1LWd6bYwa8KyJ8=; b=ah9UEpb730De9YDz6stG0zrU+IPbBVmXnBL2gRRy6OZX0a1W62s8yXrNnHV08eU8yS KjKOx5YXOgKNm7qKc1LqK0CWcivB3E6IjmVtkJd5gz6Jb55tEqM1UkARExzeB2UU84Iv w6dvTS8iIM5Wy/+Ok99/HxRvRiHDR7ceEiqdoojYyZ4k1NST2ith48QO5wWHC/4tcv8w pY/5VLwYuVEF3pgPhNMDz7hr445YqPdG58C8HulRSCAsq90jQlkGRnBHykKWtM4nlY6L nLuepI1hg8PPCk74wym62XxUaaAyqPYtOwwZ0zDfR7PMUiyI0lwQsHC7/XJ/72qiUFnU A+/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456449; x=1726061249; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Qhp9ZtW1ilKcnptzUHG7V937VLdAq1LWd6bYwa8KyJ8=; b=vo/eex0/b9sYpqmSIHZQj2bmJyfaLwTlb/sAVKZlz8JMAiee/7Uc+v7VbE4TqsA94m wDQrczlUDLpUqQGMZXQyCeUNtZzLBV6iZ+iVcOZ8SiDrdIjchWLQiX73kKSeB4b4WTqA XtR1vzpTkANn9MyVkQY0K5BSs4fF0RdlWUKrRfnE69VSJxGyEGFYoR7mAAKKyNvmnD6U k5BSLP6VEYw7sNe2ofxDK1zfEWSjdYx8kX7CQyvi0EKuLEIWhUIWkKPInO9cBVlYqfQp wNABcOWnYwmXq9HXk9EQ34nDaoY/1egSmSbGzcZZetk977j5pENPKM8XLkOhr87xhE0i prEg== X-Gm-Message-State: AOJu0YwuKG+CW2T0t247YoUfztc6QBzIIPeh4au5GxyS2El/Jq+dVomk I/E4tZ17s4kERC52pqZvY9/VGm4uWoqZepyGlj9/huNWbixKC+Jb5Vyyj4Xx5NKqT80KsHVjY/5 BB2LdJQ== X-Google-Smtp-Source: AGHT+IFMCD4pA+IIOk2nf/qWfPqebzBdYpPiQWBlxOYDQOQedFqdNJGaDCTO7jPSnWjYEVPykrtfNA== X-Received: by 2002:a05:6820:1505:b0:5da:9204:6727 with SMTP id 006d021491bc7-5dfacf53b94mr18520736eaf.6.1725456448659; Wed, 04 Sep 2024 06:27:28 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:27 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 09/36] arm: [MVE intrinsics] rework vcvtbq_f16_f32 vcvttq_f16_f32 vcvtbq_f32_f16 vcvttq_f32_f16 Date: Wed, 4 Sep 2024 13:26:23 +0000 Message-Id: <20240904132650.2720446-10-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vcvtbq_f16_f32, vcvttq_f16_f32, vcvtbq_f32_f16 and vcvttq_f32_f16 using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vcvtxq_impl): New. (vcvtbq, vcvttq): New. * config/arm/arm-mve-builtins-base.def (vcvtbq, vcvttq): New. * config/arm/arm-mve-builtins-base.h (vcvtbq, vcvttq): New. * config/arm/arm-mve-builtins.cc (cvt_f16_f32, cvt_f32_f16): New types. (function_instance::has_inactive_argument): Support vcvtbq and vcvttq. * config/arm/arm_mve.h (vcvttq_f32): Delete. (vcvtbq_f32): Delete. (vcvtbq_m): Delete. (vcvttq_m): Delete. (vcvttq_f32_f16): Delete. (vcvtbq_f32_f16): Delete. (vcvttq_f16_f32): Delete. (vcvtbq_f16_f32): Delete. (vcvtbq_m_f16_f32): Delete. (vcvtbq_m_f32_f16): Delete. (vcvttq_m_f16_f32): Delete. (vcvttq_m_f32_f16): Delete. (vcvtbq_x_f32_f16): Delete. (vcvttq_x_f32_f16): Delete. (__arm_vcvttq_f32_f16): Delete. (__arm_vcvtbq_f32_f16): Delete. (__arm_vcvttq_f16_f32): Delete. (__arm_vcvtbq_f16_f32): Delete. (__arm_vcvtbq_m_f16_f32): Delete. (__arm_vcvtbq_m_f32_f16): Delete. (__arm_vcvttq_m_f16_f32): Delete. (__arm_vcvttq_m_f32_f16): Delete. (__arm_vcvtbq_x_f32_f16): Delete. (__arm_vcvttq_x_f32_f16): Delete. (__arm_vcvttq_f32): Delete. (__arm_vcvtbq_f32): Delete. (__arm_vcvtbq_m): Delete. (__arm_vcvttq_m): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 56 +++++++++ gcc/config/arm/arm-mve-builtins-base.def | 4 + gcc/config/arm/arm-mve-builtins-base.h | 2 + gcc/config/arm/arm-mve-builtins.cc | 12 ++ gcc/config/arm/arm_mve.h | 146 ----------------------- 5 files changed, 74 insertions(+), 146 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index a780d686eb1..760378c91b1 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -251,6 +251,60 @@ public: } }; + /* Implements vcvt[bt]q_f32_f16 and vcvt[bt]q_f16_f32 + intrinsics. */ +class vcvtxq_impl : public function_base +{ +public: + CONSTEXPR vcvtxq_impl (int unspec_f16_f32, int unspec_for_m_f16_f32, + int unspec_f32_f16, int unspec_for_m_f32_f16) + : m_unspec_f16_f32 (unspec_f16_f32), + m_unspec_for_m_f16_f32 (unspec_for_m_f16_f32), + m_unspec_f32_f16 (unspec_f32_f16), + m_unspec_for_m_f32_f16 (unspec_for_m_f32_f16) + {} + + /* The unspec code associated with vcvt[bt]q. */ + int m_unspec_f16_f32; + int m_unspec_for_m_f16_f32; + int m_unspec_f32_f16; + int m_unspec_for_m_f32_f16; + + rtx + expand (function_expander &e) const override + { + insn_code code; + switch (e.pred) + { + case PRED_none: + /* No predicate. */ + if (e.type_suffix (0).element_bits == 16) + code = code_for_mve_q_f16_f32v8hf (m_unspec_f16_f32); + else + code = code_for_mve_q_f32_f16v4sf (m_unspec_f32_f16); + return e.use_exact_insn (code); + + case PRED_m: + case PRED_x: + /* "m" or "x" predicate. */ + if (e.type_suffix (0).element_bits == 16) + code = code_for_mve_q_m_f16_f32v8hf (m_unspec_for_m_f16_f32); + else + code = code_for_mve_q_m_f32_f16v4sf (m_unspec_for_m_f32_f16); + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); + + default: + gcc_unreachable (); + } + + gcc_unreachable (); + } +}; + } /* end anonymous namespace */ namespace arm_mve { @@ -452,6 +506,8 @@ FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNK FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN)) FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ) FUNCTION (vcvtq, vcvtq_impl,) +FUNCTION (vcvtbq, vcvtxq_impl, (VCVTBQ_F16_F32, VCVTBQ_M_F16_F32, VCVTBQ_F32_F16, VCVTBQ_M_F32_F16)) +FUNCTION (vcvttq, vcvtxq_impl, (VCVTTQ_F16_F32, VCVTTQ_M_F16_F32, VCVTTQ_F32_F16, VCVTTQ_M_F32_F16)) FUNCTION_ONLY_N (vdupq, VDUPQ) FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ) FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1, VFMAQ_M_F, -1, -1, VFMAQ_M_N_F)) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 671f86b5096..85211d2adc2 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -179,7 +179,11 @@ DEF_MVE_FUNCTION (vcmulq_rot180, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmulq_rot270, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmulq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcreateq, create, all_float, none) +DEF_MVE_FUNCTION (vcvtbq, vcvt_f16_f32, cvt_f16_f32, mx_or_none) +DEF_MVE_FUNCTION (vcvtbq, vcvt_f32_f16, cvt_f32_f16, mx_or_none) DEF_MVE_FUNCTION (vcvtq, vcvt, cvt, mx_or_none) +DEF_MVE_FUNCTION (vcvttq, vcvt_f16_f32, cvt_f16_f32, mx_or_none) +DEF_MVE_FUNCTION (vcvttq, vcvt_f32_f16, cvt_f32_f16, mx_or_none) DEF_MVE_FUNCTION (vdupq, unary_n, all_float, mx_or_none) DEF_MVE_FUNCTION (veorq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vfmaq, ternary_opt_n, all_float, m_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index dee73d9c457..7b2107d9a0a 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -54,7 +54,9 @@ extern const function_base *const vcmulq_rot180; extern const function_base *const vcmulq_rot270; extern const function_base *const vcmulq_rot90; extern const function_base *const vcreateq; +extern const function_base *const vcvtbq; extern const function_base *const vcvtq; +extern const function_base *const vcvttq; extern const function_base *const vdupq; extern const function_base *const veorq; extern const function_base *const vfmaq; diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 3c5b54dade1..4c554a47d85 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -219,6 +219,14 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { D (u16, f16), \ D (u32, f32) +/* vcvt[bt]q_f16_f132. */ +#define TYPES_cvt_f16_f32(S, D) \ + D (f16, f32) + +/* vcvt[bt]q_f32_f16. */ +#define TYPES_cvt_f32_f16(S, D) \ + D (f32, f16) + #define TYPES_reinterpret_signed1(D, A) \ D (A, s8), D (A, s16), D (A, s32), D (A, s64) @@ -299,6 +307,8 @@ DEF_MVE_TYPES_ARRAY (poly_8_16); DEF_MVE_TYPES_ARRAY (signed_16_32); DEF_MVE_TYPES_ARRAY (signed_32); DEF_MVE_TYPES_ARRAY (cvt); +DEF_MVE_TYPES_ARRAY (cvt_f16_f32); +DEF_MVE_TYPES_ARRAY (cvt_f32_f16); DEF_MVE_TYPES_ARRAY (reinterpret_integer); DEF_MVE_TYPES_ARRAY (reinterpret_float); @@ -730,6 +740,8 @@ function_instance::has_inactive_argument () const || base == functions::vcmpltq || base == functions::vcmpcsq || base == functions::vcmphiq + || (base == functions::vcvtbq && type_suffix (0).element_bits == 16) + || (base == functions::vcvttq && type_suffix (0).element_bits == 16) || base == functions::vfmaq || base == functions::vfmasq || base == functions::vfmsq diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 07897f510f5..5c35e08d754 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -137,11 +137,7 @@ #define vsetq_lane(__a, __b, __idx) __arm_vsetq_lane(__a, __b, __idx) #define vgetq_lane(__a, __idx) __arm_vgetq_lane(__a, __idx) #define vshlcq_m(__a, __b, __imm, __p) __arm_vshlcq_m(__a, __b, __imm, __p) -#define vcvttq_f32(__a) __arm_vcvttq_f32(__a) -#define vcvtbq_f32(__a) __arm_vcvtbq_f32(__a) #define vcvtaq_m(__inactive, __a, __p) __arm_vcvtaq_m(__inactive, __a, __p) -#define vcvtbq_m(__a, __b, __p) __arm_vcvtbq_m(__a, __b, __p) -#define vcvttq_m(__a, __b, __p) __arm_vcvttq_m(__a, __b, __p) #define vcvtmq_m(__inactive, __a, __p) __arm_vcvtmq_m(__inactive, __a, __p) #define vcvtnq_m(__inactive, __a, __p) __arm_vcvtnq_m(__inactive, __a, __p) #define vcvtpq_m(__inactive, __a, __p) __arm_vcvtpq_m(__inactive, __a, __p) @@ -155,8 +151,6 @@ #define vst4q_u32( __addr, __value) __arm_vst4q_u32( __addr, __value) #define vst4q_f16( __addr, __value) __arm_vst4q_f16( __addr, __value) #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value) -#define vcvttq_f32_f16(__a) __arm_vcvttq_f32_f16(__a) -#define vcvtbq_f32_f16(__a) __arm_vcvtbq_f32_f16(__a) #define vcvtaq_s16_f16(__a) __arm_vcvtaq_s16_f16(__a) #define vcvtaq_s32_f32(__a) __arm_vcvtaq_s32_f32(__a) #define vcvtnq_s16_f16(__a) __arm_vcvtnq_s16_f16(__a) @@ -202,8 +196,6 @@ #define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p) #define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p) #define vctp16q_m(__a, __p) __arm_vctp16q_m(__a, __p) -#define vcvttq_f16_f32(__a, __b) __arm_vcvttq_f16_f32(__a, __b) -#define vcvtbq_f16_f32(__a, __b) __arm_vcvtbq_f16_f32(__a, __b) #define vbicq_m_n_s16(__a, __imm, __p) __arm_vbicq_m_n_s16(__a, __imm, __p) #define vbicq_m_n_s32(__a, __imm, __p) __arm_vbicq_m_n_s32(__a, __imm, __p) #define vbicq_m_n_u16(__a, __imm, __p) __arm_vbicq_m_n_u16(__a, __imm, __p) @@ -218,10 +210,6 @@ #define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vcvtbq_m_f16_f32(__a, __b, __p) __arm_vcvtbq_m_f16_f32(__a, __b, __p) -#define vcvtbq_m_f32_f16(__inactive, __a, __p) __arm_vcvtbq_m_f32_f16(__inactive, __a, __p) -#define vcvttq_m_f16_f32(__a, __b, __p) __arm_vcvttq_m_f16_f32(__a, __b, __p) -#define vcvttq_m_f32_f16(__inactive, __a, __p) __arm_vcvttq_m_f32_f16(__inactive, __a, __p) #define vcvtmq_m_s16_f16(__inactive, __a, __p) __arm_vcvtmq_m_s16_f16(__inactive, __a, __p) #define vcvtnq_m_s16_f16(__inactive, __a, __p) __arm_vcvtnq_m_s16_f16(__inactive, __a, __p) #define vcvtpq_m_s16_f16(__inactive, __a, __p) __arm_vcvtpq_m_s16_f16(__inactive, __a, __p) @@ -560,8 +548,6 @@ #define vcvtmq_x_s32_f32(__a, __p) __arm_vcvtmq_x_s32_f32(__a, __p) #define vcvtmq_x_u16_f16(__a, __p) __arm_vcvtmq_x_u16_f16(__a, __p) #define vcvtmq_x_u32_f32(__a, __p) __arm_vcvtmq_x_u32_f32(__a, __p) -#define vcvtbq_x_f32_f16(__a, __p) __arm_vcvtbq_x_f32_f16(__a, __p) -#define vcvttq_x_f32_f16(__a, __p) __arm_vcvttq_x_f32_f16(__a, __p) #define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p) #define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p) #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) @@ -3704,20 +3690,6 @@ __arm_vst4q_f32 (float32_t * __addr, float32x4x4_t __value) __builtin_mve_vst4qv4sf (__addr, __rv.__o); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_f32_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvttq_f32_f16v4sf (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_f32_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtbq_f32_f16v4sf (__a); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtpq_u16_f16 (float16x8_t __a) @@ -3858,20 +3830,6 @@ __arm_vbicq_f32 (float32x4_t __a, float32x4_t __b) return __builtin_mve_vbicq_fv4sf (__a, __b); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_f16_f32 (float16x8_t __a, float32x4_t __b) -{ - return __builtin_mve_vcvttq_f16_f32v8hf (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_f16_f32 (float16x8_t __a, float32x4_t __b) -{ - return __builtin_mve_vcvtbq_f16_f32v8hf (__a, __b); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtaq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -3901,34 +3859,6 @@ __arm_vcvtaq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m_f16_f32 (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcvtbq_m_f16_f32v8hf (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m_f32_f16 (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtbq_m_f32_f16v4sf (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m_f16_f32 (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcvttq_m_f16_f32v8hf (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m_f32_f16 (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvttq_m_f32_f16v4sf (__inactive, __a, __p); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -4383,20 +4313,6 @@ __arm_vcvtmq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) return __builtin_mve_vcvtmq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_x_f32_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtbq_m_f32_f16v4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_x_f32_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvttq_m_f32_f16v4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -6827,20 +6743,6 @@ __arm_vst4q (float32_t * __addr, float32x4x4_t __value) __arm_vst4q_f32 (__addr, __value); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_f32 (float16x8_t __a) -{ - return __arm_vcvttq_f32_f16 (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_f32 (float16x8_t __a) -{ - return __arm_vcvtbq_f32_f16 (__a); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (float16x8_t __a, float16x8_t __b) @@ -6897,34 +6799,6 @@ __arm_vcvtaq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtaq_m_u32_f32 (__inactive, __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcvtbq_m_f16_f32 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtbq_m_f32_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcvttq_m_f16_f32 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvttq_m_f32_f16 (__inactive, __a, __p); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -7654,14 +7528,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16_t_ptr][__ARM_mve_type_float16x8x4_t]: __arm_vst4q_f16 (__ARM_mve_coerce_f16_ptr(__p0, float16_t *), __ARM_mve_coerce(__p1, float16x8x4_t)), \ int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4x4_t]: __arm_vst4q_f32 (__ARM_mve_coerce_f32_ptr(__p0, float32_t *), __ARM_mve_coerce(__p1, float32x4x4_t)));}) -#define __arm_vcvtbq_f32(p0) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_float16x8_t]: __arm_vcvtbq_f32_f16 (__ARM_mve_coerce(__p0, float16x8_t)));}) - -#define __arm_vcvttq_f32(p0) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_float16x8_t]: __arm_vcvttq_f32_f16 (__ARM_mve_coerce(__p0, float16x8_t)));}) - #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -7714,18 +7580,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) -#define __arm_vcvtbq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float16x8_t]: __arm_vcvtbq_m_f32_f16 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float32x4_t]: __arm_vcvtbq_m_f16_f32 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvttq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float16x8_t]: __arm_vcvttq_m_f32_f16 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float32x4_t]: __arm_vcvttq_m_f16_f32 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - #define __arm_vcvtmq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ From patchwork Wed Sep 4 13:26:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980794 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=BOZVlZYt; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNc53NWSz1yfv for ; Wed, 4 Sep 2024 23:28:37 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 83635386182E for ; Wed, 4 Sep 2024 13:28:35 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc32.google.com (mail-oo1-xc32.google.com [IPv6:2607:f8b0:4864:20::c32]) by sourceware.org (Postfix) with ESMTPS id 7D26B385DDD4 for ; Wed, 4 Sep 2024 13:27:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7D26B385DDD4 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7D26B385DDD4 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c32 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456454; cv=none; b=ZxXzKtvFeaHrqQn5xW6vYcuDznIX51/BXU9rjc7JDyKuXA7WQg+rNJMufikW7rBZdsoYmSj82ABxFlKjwxAvAFQgkPxttoLujpo88mbZMGFdzqY/4ckBdaqlmlaOnNDAxSqgLlrXUl0WTcoNevz9GDKYazy1hps4Frr5rqXYgBw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456454; c=relaxed/simple; bh=SPIcsmqTC1mq8i4bFgiZmX9X6Yms+DlNq23RYjogtxQ=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=M/Y3TiAsuPpGXF7pXWpZ1DG0PEaP5zrTY38RWoAQKTEKpJS5d7q3Em5PIiz/bu3RO9cDpCmD8+ld5iqosCBEYe9ocRtuiVaxySidQ5OiEXcF0KHl9VRSL3AY5fh2/P5LyFW3i6UbVqqjzK4CoKzg22sawghIwaNpbyPOv9vWXtk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc32.google.com with SMTP id 006d021491bc7-5daa93677e1so3914699eaf.3 for ; Wed, 04 Sep 2024 06:27:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456449; x=1726061249; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=d/2QnJnj9Vv0ZzsykePkKGF/zPxZxjGlxl13dDjWyNQ=; b=BOZVlZYt4OHXiUf81HCRzE2AWK4c4wrQ3E/ABb/X0gjTMQP+Bkj/J3i8qaFrD1u0XR kt4JqBcFUa18phAL+UOoRooIEkf5apbQiD0VatxRroW+1rNAk1DrYgpzygt+bhM0FO+N 0Y+HiGtorRCzFK2xQuB0L1zYDivfLtBEfSSd30n3O2PaXDn4iCZQDglXrcvrRdNDccqg 3ZCVYPq8ac18ZrVKZFdhwah8CT0CHeCvX4OP6M9v60HaFI4MvYuvGrwtQbn+PO3FCQaq GQB+uOykT4KU+vN4RH1EAvo68la1rRVqzGMJAXeSrbCAerK+JQJivMV8Pxy5iAnbuvWl 2COw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456449; x=1726061249; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=d/2QnJnj9Vv0ZzsykePkKGF/zPxZxjGlxl13dDjWyNQ=; b=ricFA87zJXNj+qpEECnflPekBSKtLfHXBVlsduo0dwUORb51RmT6gxcr+34qiOen5x x12aPRwLm+Skw5Gy6ttCLZGPKMzf+fFnagBJKHy5NpIdx6XzmKUfvD28sWibKXfHpZIc mFVHV+anwRUZz5+rBOpwKYDQMEgm1Mbu7zlo5o7hDyay/k4wd9QZ1bN1a075dSRaVCMb XTUV+sVMhQpeuC4gUQKwXYTKjjFgr2LZoH6EiaVUubGdbOuIPCoetkuUpsYUowXzbJQr A14X5irsgY5OOWyRKcduDtCQ3T2YsZX8d9YPqtyPIl6gpgv7qjVC1ovM5rsdqh6HLOoW uNIw== X-Gm-Message-State: AOJu0YwrtJF09bbeIlVwV4cug8d6qisNp2hfN4cCzKdHmsnx6RVXWwwF RpZmx1MNYNq6q8rrdhBO2iSMYSNxDOCX3qdQmehQD4jE0pYv+QtczAVqtSdyFyKw8xg0L7ty6Hs Ivmz5vw== X-Google-Smtp-Source: AGHT+IHmSL0QIpsMq4HvMFM7YAJ0+5yj0CSjyxqokidna+a5yhtZwqvhs6+h3YM/7qyBfuZgFXHdHA== X-Received: by 2002:a05:6820:168d:b0:5dc:a979:442a with SMTP id 006d021491bc7-5dfacc03655mr18829029eaf.0.1725456449388; Wed, 04 Sep 2024 06:27:29 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:28 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 10/36] arm: [MVE intrinsics] factorize vcvtaq vcvtmq vcvtnq vcvtpq Date: Wed, 4 Sep 2024 13:26:24 +0000 Message-Id: <20240904132650.2720446-11-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Factorize vcvtaq vcvtmq vcvtnq vcvtpq builtins so that they use the same parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VCVTAQ_M_S, VCVTAQ_M_U, VCVTAQ_S, VCVTAQ_U, VCVTMQ_M_S, VCVTMQ_M_U, VCVTMQ_S, VCVTMQ_U, VCVTNQ_M_S, VCVTNQ_M_U, VCVTNQ_S, VCVTNQ_U, VCVTPQ_M_S, VCVTPQ_M_U, VCVTPQ_S, VCVTPQ_U. (VCVTAQ, VCVTPQ, VCVTNQ, VCVTMQ, VCVTAQ_M, VCVTMQ_M, VCVTNQ_M) (VCVTPQ_M): Delete. (VCVTxQ, VCVTxQ_M): New. * config/arm/mve.md (mve_vcvtpq_) (mve_vcvtnq_, mve_vcvtmq_) (mve_vcvtaq_): Merge into ... (@mve_q_): ... this. (mve_vcvtaq_m_, mve_vcvtmq_m_) (mve_vcvtpq_m_, mve_vcvtnq_m_): Merge into ... (@mve_q_m_): ... this. --- gcc/config/arm/iterators.md | 18 +++--- gcc/config/arm/mve.md | 121 +++++------------------------------- 2 files changed, 26 insertions(+), 113 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index b9c39a98ca2..162c0d56bfb 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -964,10 +964,18 @@ (define_int_attr mve_insn [ (VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla") (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul") (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate") + (VCVTAQ_M_S "vcvta") (VCVTAQ_M_U "vcvta") + (VCVTAQ_S "vcvta") (VCVTAQ_U "vcvta") (VCVTBQ_F16_F32 "vcvtb") (VCVTTQ_F16_F32 "vcvtt") (VCVTBQ_F32_F16 "vcvtb") (VCVTTQ_F32_F16 "vcvtt") (VCVTBQ_M_F16_F32 "vcvtb") (VCVTTQ_M_F16_F32 "vcvtt") (VCVTBQ_M_F32_F16 "vcvtb") (VCVTTQ_M_F32_F16 "vcvtt") + (VCVTMQ_M_S "vcvtm") (VCVTMQ_M_U "vcvtm") + (VCVTMQ_S "vcvtm") (VCVTMQ_U "vcvtm") + (VCVTNQ_M_S "vcvtn") (VCVTNQ_M_U "vcvtn") + (VCVTNQ_S "vcvtn") (VCVTNQ_U "vcvtn") + (VCVTPQ_M_S "vcvtp") (VCVTPQ_M_U "vcvtp") + (VCVTPQ_S "vcvtp") (VCVTPQ_U "vcvtp") (VCVTQ_FROM_F_S "vcvt") (VCVTQ_FROM_F_U "vcvt") (VCVTQ_M_FROM_F_S "vcvt") (VCVTQ_M_FROM_F_U "vcvt") (VCVTQ_M_N_FROM_F_S "vcvt") (VCVTQ_M_N_FROM_F_U "vcvt") @@ -2732,14 +2740,10 @@ (define_int_iterator VMVNQ_N [VMVNQ_N_U VMVNQ_N_S]) (define_int_iterator VREV64Q [VREV64Q_S VREV64Q_U]) (define_int_iterator VCVTQ_FROM_F [VCVTQ_FROM_F_S VCVTQ_FROM_F_U]) (define_int_iterator VREV16Q [VREV16Q_U VREV16Q_S]) -(define_int_iterator VCVTAQ [VCVTAQ_U VCVTAQ_S]) (define_int_iterator VDUPQ_N [VDUPQ_N_U VDUPQ_N_S]) (define_int_iterator VADDVQ [VADDVQ_U VADDVQ_S]) (define_int_iterator VREV32Q [VREV32Q_U VREV32Q_S]) (define_int_iterator VMOVLxQ [VMOVLBQ_S VMOVLBQ_U VMOVLTQ_U VMOVLTQ_S]) -(define_int_iterator VCVTPQ [VCVTPQ_S VCVTPQ_U]) -(define_int_iterator VCVTNQ [VCVTNQ_S VCVTNQ_U]) -(define_int_iterator VCVTMQ [VCVTMQ_S VCVTMQ_U]) (define_int_iterator VADDLVQ [VADDLVQ_U VADDLVQ_S]) (define_int_iterator VCVTQ_N_TO_F [VCVTQ_N_TO_F_S VCVTQ_N_TO_F_U]) (define_int_iterator VCREATEQ [VCREATEQ_U VCREATEQ_S]) @@ -2795,7 +2799,6 @@ (define_int_iterator VQMOVNTQ [VQMOVNTQ_U VQMOVNTQ_S]) (define_int_iterator VSHLLxQ_N [VSHLLBQ_N_S VSHLLBQ_N_U VSHLLTQ_N_S VSHLLTQ_N_U]) (define_int_iterator VRMLALDAVHQ [VRMLALDAVHQ_U VRMLALDAVHQ_S]) (define_int_iterator VBICQ_M_N [VBICQ_M_N_S VBICQ_M_N_U]) -(define_int_iterator VCVTAQ_M [VCVTAQ_M_S VCVTAQ_M_U]) (define_int_iterator VCVTQ_M_TO_F [VCVTQ_M_TO_F_S VCVTQ_M_TO_F_U]) (define_int_iterator VQRSHRNBQ_N [VQRSHRNBQ_N_U VQRSHRNBQ_N_S]) (define_int_iterator VABAVQ [VABAVQ_S VABAVQ_U]) @@ -2845,9 +2848,6 @@ (define_int_iterator VQMOVNTQ_M [VQMOVNTQ_M_U VQMOVNTQ_M_S]) (define_int_iterator VMVNQ_M_N [VMVNQ_M_N_U VMVNQ_M_N_S]) (define_int_iterator VQSHRNTQ_N [VQSHRNTQ_N_U VQSHRNTQ_N_S]) (define_int_iterator VSHRNTQ_N [VSHRNTQ_N_S VSHRNTQ_N_U]) -(define_int_iterator VCVTMQ_M [VCVTMQ_M_S VCVTMQ_M_U]) -(define_int_iterator VCVTNQ_M [VCVTNQ_M_S VCVTNQ_M_U]) -(define_int_iterator VCVTPQ_M [VCVTPQ_M_S VCVTPQ_M_U]) (define_int_iterator VCVTQ_M_N_FROM_F [VCVTQ_M_N_FROM_F_S VCVTQ_M_N_FROM_F_U]) (define_int_iterator VCVTQ_M_FROM_F [VCVTQ_M_FROM_F_U VCVTQ_M_FROM_F_S]) (define_int_iterator VRMLALDAVHQ_P [VRMLALDAVHQ_P_S VRMLALDAVHQ_P_U]) @@ -2956,6 +2956,8 @@ (define_int_iterator VCVTxQ_F16_F32 [VCVTBQ_F16_F32 VCVTTQ_F16_F32]) (define_int_iterator VCVTxQ_F32_F16 [VCVTBQ_F32_F16 VCVTTQ_F32_F16]) (define_int_iterator VCVTxQ_M_F16_F32 [VCVTBQ_M_F16_F32 VCVTTQ_M_F16_F32]) (define_int_iterator VCVTxQ_M_F32_F16 [VCVTBQ_M_F32_F16 VCVTTQ_M_F32_F16]) +(define_int_iterator VCVTxQ [VCVTAQ_S VCVTAQ_U VCVTMQ_S VCVTMQ_U VCVTNQ_S VCVTNQ_U VCVTPQ_S VCVTPQ_U]) +(define_int_iterator VCVTxQ_M [VCVTAQ_M_S VCVTAQ_M_U VCVTMQ_M_S VCVTMQ_M_U VCVTNQ_M_S VCVTNQ_M_U VCVTPQ_M_S VCVTPQ_M_U]) (define_int_iterator DLSTP [DLSTP8 DLSTP16 DLSTP32 DLSTP64]) (define_int_iterator LETP [LETP8 LETP16 LETP32 diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 6e2f542cdae..41c7e73a161 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -416,62 +416,20 @@ (define_insn "@mve_q_" ]) ;; -;; [vcvtpq_s, vcvtpq_u]) +;; [vcvtaq_u, vcvtaq_s] +;; [vcvtmq_s, vcvtmq_u] +;; [vcvtnq_s, vcvtnq_u] +;; [vcvtpq_s, vcvtpq_u] ;; -(define_insn "mve_vcvtpq_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTPQ)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtp.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtpq_")) - (set_attr "type" "mve_move") -]) - -;; -;; [vcvtnq_s, vcvtnq_u]) -;; -(define_insn "mve_vcvtnq_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTNQ)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtn.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtnq_")) - (set_attr "type" "mve_move") -]) - -;; -;; [vcvtmq_s, vcvtmq_u]) -;; -(define_insn "mve_vcvtmq_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTMQ)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtm.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtmq_")) - (set_attr "type" "mve_move") -]) - -;; -;; [vcvtaq_u, vcvtaq_s]) -;; -(define_insn "mve_vcvtaq_" +(define_insn "@mve_q_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTAQ)) + VCVTxQ)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvta.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtaq_")) + ".%#.f%#\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_")) (set_attr "type" "mve_move") ]) @@ -1627,19 +1585,22 @@ (define_insn "@mve_vcmpq_m_f" (set_attr "length""8")]) ;; -;; [vcvtaq_m_u, vcvtaq_m_s]) +;; [vcvtaq_m_u, vcvtaq_m_s] +;; [vcvtmq_m_s, vcvtmq_m_u] +;; [vcvtnq_m_s, vcvtnq_m_u] +;; [vcvtpq_m_u, vcvtpq_m_s] ;; -(define_insn "mve_vcvtaq_m_" +(define_insn "@mve_q_m_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") (match_operand: 2 "s_register_operand" "w") (match_operand: 3 "vpr_register_operand" "Up")] - VCVTAQ_M)) + VCVTxQ_M)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtat.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtaq_")) + "vpst\;t.%#.f%#\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2539,56 +2500,6 @@ (define_insn "@mve_q_p_v4si" (set_attr "type" "mve_move") (set_attr "length""8")]) -;; -;; [vcvtmq_m_s, vcvtmq_m_u]) -;; -(define_insn "mve_vcvtmq_m_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") - (match_operand: 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTMQ_M)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtmt.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtmq_")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcvtpq_m_u, vcvtpq_m_s]) -;; -(define_insn "mve_vcvtpq_m_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") - (match_operand: 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTPQ_M)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtpt.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtpq_")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcvtnq_m_s, vcvtnq_m_u]) -;; -(define_insn "mve_vcvtnq_m_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") - (match_operand: 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTNQ_M)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtnt.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtnq_")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) ;; ;; [vcvtq_m_n_from_f_s, vcvtq_m_n_from_f_u] From patchwork Wed Sep 4 13:26:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980804 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=t5fggm5z; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNgF0Xvrz1yg3 for ; Wed, 4 Sep 2024 23:31:21 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DEC993864807 for ; Wed, 4 Sep 2024 13:31:18 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc33.google.com (mail-oo1-xc33.google.com [IPv6:2607:f8b0:4864:20::c33]) by sourceware.org (Postfix) with ESMTPS id AA4C8386101C for ; Wed, 4 Sep 2024 13:27:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AA4C8386101C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AA4C8386101C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c33 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456454; cv=none; b=A663kda9igcNquRxt7u9RHCgzcx5WTQIOLxdLh1ppvKrX49kI0GHqUOWuojVWdNAk2fKnYcg7RfM9EVwigF2wFrDGXhbE0zHZG9ZhHQe9ibLmZ6KrTuyM0lrkpZp/IRnQeH8+q/iCEu4UM3dgTNOGX0osb7+YmySXyJPJDMmpmY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456454; c=relaxed/simple; bh=CZBbxPuC1d15Qr9AO8qMj09b0G+9n/l8+piyAYuz4Mg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=xuvgLFBtfv04hkoqqdN4dl186P47eSsYQijRzNDqNBpm2O2i97/k9K6QGUIb7sTAxhZzPQrXDINMjJOS9gEczjwv5Hs6erhbBCTa8rW/+ateb5h08ltmdfVws34TX0IZuswG5R6u/L1NZOYHb9k0tqiEqjf9rHEFNUmHymcUDFI= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc33.google.com with SMTP id 006d021491bc7-5de8a3f1cc6so360336eaf.1 for ; Wed, 04 Sep 2024 06:27:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456450; x=1726061250; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=c83W76PJu0tZZJu1nb8FhRUx8jMb+R5VzFiZiT7C7bU=; b=t5fggm5zxLpzmypZowrYfFwGDMqq+xY1UxdVDiDc1OcQ1euvOVutRNiGLrWzsRb/rK lesuDRuJJ1cGHHUh5hMRBGWfVnoMMeB1SzqakpDBzZTdyMMzGelOah40d3v8DIycbizl PvTg4t8g6KhUbsacV2zlYs8/MZdw2Ffv4mNMBhDH3ob59nLKsBmc7N7tV/5zMtNhLfQp qMSespGswKRJcrWUrKJ0WptS0r8HFiHcppV2SBcyZ4SDlPcs+8bQXtWlQFSrzS4KiYVj 8BaxWdGP/RDjJuq5GwlqL8GwEdev5eassMkmcfLvZh0KNeQqIRaSwg0OIyxNY5Kd573v JWOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456450; x=1726061250; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=c83W76PJu0tZZJu1nb8FhRUx8jMb+R5VzFiZiT7C7bU=; b=e2yFLCPV7a1/xQjXZEnHtNZbW4+nmiZehs8kiI9BMrV+M8SGu2WR2DSxRtLnYpV+/f jqVAuoKwg4uxaziZqj/ChwuzCqEYDMp8+EE0pdB3fpXJbeMYZ4xnhXyVS25u7MbR+0sk ewxKDV/5xnuARMgb+L/zV7LNPwuXr6JqOhYnmklOpN1n5oPFjcTX1NdjHY3W1vkCNrow x+uRxBqIBZN557arNazgHvsBOLwCDSQ9/7Md1GkeJGe6gOc8DXQXR3jwfkA3Cze7z6Wg eWI8DxmhzmoAaHf3gCM28AGGfUm2Ge+WTA5vidBJw4NMG1FJ5OKcbbtIxN1OEPkWmW7O lJwg== X-Gm-Message-State: AOJu0Yyi82lmTgC9b9my8n2yEiTWiEx769cF2C/QJwfcA3HhbtR/k//F mktlKmxYVvi33hm6OGRr4ZT89ZtBdYWdj2OXpuDEBlvwV/ip665+SQCwG3jMMDtf8+aI5IfQErs csABZUA== X-Google-Smtp-Source: AGHT+IEJQwDmRr4j0IeTuHLjRscPYFO2esb41oXXHcAGQflHaMSZ+yLUYMuzWQClq4XbamMAYnDQkQ== X-Received: by 2002:a05:6820:70a:b0:5d8:a13:f99d with SMTP id 006d021491bc7-5e18eb5dd09mr754474eaf.1.1725456450365; Wed, 04 Sep 2024 06:27:30 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:29 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 11/36] arm: [MVE intrinsics] add vcvtx shape Date: Wed, 4 Sep 2024 13:26:25 +0000 Message-Id: <20240904132650.2720446-12-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the vcvtx shape description for vcvtaq, vcvtmq, vcvtnq, vcvtpq. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vcvtx): New. * config/arm/arm-mve-builtins-shapes.h (vcvtx): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 59 +++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 1 + 2 files changed, 60 insertions(+) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index 5ebf666d954..6632ee49067 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -2116,6 +2116,65 @@ struct vcvt_f32_f16_def : public nonoverloaded_base }; SHAPE (vcvt_f32_f16) +/* _t foo_t0[_t1](_t) + + Example: vcvtaq. + int16x8_t [__arm_]vcvtaq_s16_f16(float16x8_t a) + int16x8_t [__arm_]vcvtaq_m[_s16_f16](int16x8_t inactive, float16x8_t a, mve_pred16_t p) + int16x8_t [__arm_]vcvtaq_x_s16_f16(float16x8_t a, mve_pred16_t p) +*/ +struct vcvtx_def : public overloaded_base<0> +{ + bool + explicit_type_suffix_p (unsigned int, enum predication_index pred, + enum mode_suffix_index, + type_suffix_info) const override + { + return pred != PRED_m; + } + + bool + skip_overload_p (enum predication_index pred, enum mode_suffix_index) + const override + { + return pred != PRED_m; + } + + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + build_all (b, "v0,v1", group, MODE_none, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index from_type; + tree res; + + if (!r.check_gp_argument (1, i, nargs) + || (from_type + = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + type_suffix_index to_type; + + gcc_assert (r.pred == PRED_m); + + /* Get the return type from the 'inactive' argument. */ + to_type = r.infer_vector_type (0); + + if ((res = r.lookup_form (r.mode_suffix_id, to_type, from_type))) + return res; + + return r.report_no_such_form (from_type); + } +}; +SHAPE (vcvtx) + /* _t vfoo[_t0](_t, _t, mve_pred16_t) i.e. a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 50157b57571..ef497b6c97a 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -80,6 +80,7 @@ namespace arm_mve extern const function_shape *const vcvt; extern const function_shape *const vcvt_f16_f32; extern const function_shape *const vcvt_f32_f16; + extern const function_shape *const vcvtx; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */ From patchwork Wed Sep 4 13:26:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980811 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=B4/iTeLt; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNjY532Cz1yg3 for ; Wed, 4 Sep 2024 23:33:21 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 881CD384DEF1 for ; Wed, 4 Sep 2024 13:33:19 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2b.google.com (mail-oo1-xc2b.google.com [IPv6:2607:f8b0:4864:20::c2b]) by sourceware.org (Postfix) with ESMTPS id 24083385DDDA for ; Wed, 4 Sep 2024 13:27:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 24083385DDDA Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 24083385DDDA Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456459; cv=none; b=SP/WFxwNK7UoyubpdtUD9MLCXrrXpFebYa/o7MiiIQqIzfh04SlXmx+IGPp2PdggDBYggW+ugUbRV217FqLOaIPlGmjRTZIRspkx3rt4mOi5ougcvuED2IkXwGLs1PXrwDLrKSY2DEmpjxViy+r0CkqDos53U75nzQiL3mPQ/Jg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456459; c=relaxed/simple; bh=K7NQbx1fqJCxLWUvtoaqjv4f0+7fRDkH8fg70CdI6Vk=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=h+T5cN+W9p/a151dHHiGb54Cass/SPHlaEkni74G6+ONyA5ZBVcDgd/InqSOYSpxzlQL1LAav/LYZRxb/Mt7AqgAc0NTbBVTdlWs9vfL5izthYgXRX/KCE9zrYPTmD0e4oT8+xVB/a2V4a3n877Cu1BlWvlGM/W+sHXApD511BE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2b.google.com with SMTP id 006d021491bc7-5e172cc6d97so1741975eaf.1 for ; Wed, 04 Sep 2024 06:27:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456452; x=1726061252; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+LyRyMGdWZhmPLAgm4AdqDKbBl129QWxuFuaiLfI4FM=; b=B4/iTeLt+wfhYvyGUq67ukpii68srWXpLATfZ0xsv4TeCk/oC7lfF1UZDeHFV03jLJ OJLtrLcVlOpX1sxsWY/qCiYBQUP32I/8uGL1XKsPl6l7ga3gkLptHVU0IYred4eb/rbR C0ZacDH4hjhI7jhqK63bPh8ykxjc0RHB4o1hA6/jrkfG7yaHxAwbLJmI9aIsN2flXNxZ 5jDlxNP2LjD9TQ3GCo3T84lITd57eLkowLK3Q8KDhdXyOsc4oB5PMIg/pL6w7pKVPB3a lgUrezrOhu+uA+CylenKopUEro6HyMlm6uB/Km7pqNOylLq0pDTWYySSIrH81E/C+IlB F6sw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456452; x=1726061252; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+LyRyMGdWZhmPLAgm4AdqDKbBl129QWxuFuaiLfI4FM=; b=GzT5H7Z47A05eczjlCNZnIUQv9HOlySPpt8tlDlXg3HHu6FzSStEn+e7CN73+Uv2sm Vz3nHxhQF5Ei5E53M8SdJmHKjSf8VwiieSqDLMIE4sJlrQ/GzDsfIp3Dpp7PiE0sT2KY GFtHZkjRm9OIzP+yORKf4MmfvE67a+ApealfWORhKi6CZ1BPNgA++2xaqjAjm2AEWv2X EAWVVLXNeTxxDvNXtgKc+hvfAJcnwoCSmmhtPL4e1pLa0ykL/Bum6o146+82Jh4kQd3m 09UBxcbfey/DnMld9wEEyC66kzBCZdjV0T0gFYpEVWiqvCqUEEDbHB4LORQlarvdJ+EM aQWg== X-Gm-Message-State: AOJu0YygKxrofj37twPAwX/u4bpL21xfhB7E2EfuYaw8OTq4zfgSi4pf h2Ccq7Efe40llYCnlXNL+Qz7Crq0s2MMnyTCBEtrN4dmmmgoqDnjfuiw4oIwBbtkRiGhffnYFTl aRdvePw== X-Google-Smtp-Source: AGHT+IEqHk2NVNbGLxzSQmfF92eN4QybFMHE4uFt/+c4Au9Dp150NxN98UUw+bdmrf8ikN48uNomtQ== X-Received: by 2002:a05:6820:545:b0:5df:9274:c07e with SMTP id 006d021491bc7-5dfacddf733mr19570397eaf.2.1725456451697; Wed, 04 Sep 2024 06:27:31 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:30 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 12/36] arm: [MVE intrinsics] rework vcvtaq vcvtmq vcvtnq vcvtpq Date: Wed, 4 Sep 2024 13:26:26 +0000 Message-Id: <20240904132650.2720446-13-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vcvtaq vcvtmq vcvtnq vcvtpq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.def (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.h: (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins.cc (cvtx): New type. * config/arm/arm_mve.h (vcvtaq_m): Delete. (vcvtmq_m): Delete. (vcvtnq_m): Delete. (vcvtpq_m): Delete. (vcvtaq_s16_f16): Delete. (vcvtaq_s32_f32): Delete. (vcvtnq_s16_f16): Delete. (vcvtnq_s32_f32): Delete. (vcvtpq_s16_f16): Delete. (vcvtpq_s32_f32): Delete. (vcvtmq_s16_f16): Delete. (vcvtmq_s32_f32): Delete. (vcvtpq_u16_f16): Delete. (vcvtpq_u32_f32): Delete. (vcvtnq_u16_f16): Delete. (vcvtnq_u32_f32): Delete. (vcvtmq_u16_f16): Delete. (vcvtmq_u32_f32): Delete. (vcvtaq_u16_f16): Delete. (vcvtaq_u32_f32): Delete. (vcvtaq_m_s16_f16): Delete. (vcvtaq_m_u16_f16): Delete. (vcvtaq_m_s32_f32): Delete. (vcvtaq_m_u32_f32): Delete. (vcvtmq_m_s16_f16): Delete. (vcvtnq_m_s16_f16): Delete. (vcvtpq_m_s16_f16): Delete. (vcvtmq_m_u16_f16): Delete. (vcvtnq_m_u16_f16): Delete. (vcvtpq_m_u16_f16): Delete. (vcvtmq_m_s32_f32): Delete. (vcvtnq_m_s32_f32): Delete. (vcvtpq_m_s32_f32): Delete. (vcvtmq_m_u32_f32): Delete. (vcvtnq_m_u32_f32): Delete. (vcvtpq_m_u32_f32): Delete. (vcvtaq_x_s16_f16): Delete. (vcvtaq_x_s32_f32): Delete. (vcvtaq_x_u16_f16): Delete. (vcvtaq_x_u32_f32): Delete. (vcvtnq_x_s16_f16): Delete. (vcvtnq_x_s32_f32): Delete. (vcvtnq_x_u16_f16): Delete. (vcvtnq_x_u32_f32): Delete. (vcvtpq_x_s16_f16): Delete. (vcvtpq_x_s32_f32): Delete. (vcvtpq_x_u16_f16): Delete. (vcvtpq_x_u32_f32): Delete. (vcvtmq_x_s16_f16): Delete. (vcvtmq_x_s32_f32): Delete. (vcvtmq_x_u16_f16): Delete. (vcvtmq_x_u32_f32): Delete. (__arm_vcvtpq_u16_f16): Delete. (__arm_vcvtpq_u32_f32): Delete. (__arm_vcvtnq_u16_f16): Delete. (__arm_vcvtnq_u32_f32): Delete. (__arm_vcvtmq_u16_f16): Delete. (__arm_vcvtmq_u32_f32): Delete. (__arm_vcvtaq_u16_f16): Delete. (__arm_vcvtaq_u32_f32): Delete. (__arm_vcvtaq_s16_f16): Delete. (__arm_vcvtaq_s32_f32): Delete. (__arm_vcvtnq_s16_f16): Delete. (__arm_vcvtnq_s32_f32): Delete. (__arm_vcvtpq_s16_f16): Delete. (__arm_vcvtpq_s32_f32): Delete. (__arm_vcvtmq_s16_f16): Delete. (__arm_vcvtmq_s32_f32): Delete. (__arm_vcvtaq_m_s16_f16): Delete. (__arm_vcvtaq_m_u16_f16): Delete. (__arm_vcvtaq_m_s32_f32): Delete. (__arm_vcvtaq_m_u32_f32): Delete. (__arm_vcvtmq_m_s16_f16): Delete. (__arm_vcvtnq_m_s16_f16): Delete. (__arm_vcvtpq_m_s16_f16): Delete. (__arm_vcvtmq_m_u16_f16): Delete. (__arm_vcvtnq_m_u16_f16): Delete. (__arm_vcvtpq_m_u16_f16): Delete. (__arm_vcvtmq_m_s32_f32): Delete. (__arm_vcvtnq_m_s32_f32): Delete. (__arm_vcvtpq_m_s32_f32): Delete. (__arm_vcvtmq_m_u32_f32): Delete. (__arm_vcvtnq_m_u32_f32): Delete. (__arm_vcvtpq_m_u32_f32): Delete. (__arm_vcvtaq_x_s16_f16): Delete. (__arm_vcvtaq_x_s32_f32): Delete. (__arm_vcvtaq_x_u16_f16): Delete. (__arm_vcvtaq_x_u32_f32): Delete. (__arm_vcvtnq_x_s16_f16): Delete. (__arm_vcvtnq_x_s32_f32): Delete. (__arm_vcvtnq_x_u16_f16): Delete. (__arm_vcvtnq_x_u32_f32): Delete. (__arm_vcvtpq_x_s16_f16): Delete. (__arm_vcvtpq_x_s32_f32): Delete. (__arm_vcvtpq_x_u16_f16): Delete. (__arm_vcvtpq_x_u32_f32): Delete. (__arm_vcvtmq_x_s16_f16): Delete. (__arm_vcvtmq_x_s32_f32): Delete. (__arm_vcvtmq_x_u16_f16): Delete. (__arm_vcvtmq_x_u32_f32): Delete. (__arm_vcvtaq_m): Delete. (__arm_vcvtmq_m): Delete. (__arm_vcvtnq_m): Delete. (__arm_vcvtpq_m): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 4 + gcc/config/arm/arm-mve-builtins-base.def | 4 + gcc/config/arm/arm-mve-builtins-base.h | 4 + gcc/config/arm/arm-mve-builtins.cc | 9 + gcc/config/arm/arm_mve.h | 533 ----------------------- 5 files changed, 21 insertions(+), 533 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 760378c91b1..281f3749bce 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -506,6 +506,10 @@ FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNK FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN)) FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ) FUNCTION (vcvtq, vcvtq_impl,) +FUNCTION_WITHOUT_N_NO_F (vcvtaq, VCVTAQ) +FUNCTION_WITHOUT_N_NO_F (vcvtmq, VCVTMQ) +FUNCTION_WITHOUT_N_NO_F (vcvtnq, VCVTNQ) +FUNCTION_WITHOUT_N_NO_F (vcvtpq, VCVTPQ) FUNCTION (vcvtbq, vcvtxq_impl, (VCVTBQ_F16_F32, VCVTBQ_M_F16_F32, VCVTBQ_F32_F16, VCVTBQ_M_F32_F16)) FUNCTION (vcvttq, vcvtxq_impl, (VCVTTQ_F16_F32, VCVTTQ_M_F16_F32, VCVTTQ_F32_F16, VCVTTQ_M_F32_F16)) FUNCTION_ONLY_N (vdupq, VDUPQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 85211d2adc2..cf733f7627a 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -179,8 +179,12 @@ DEF_MVE_FUNCTION (vcmulq_rot180, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmulq_rot270, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmulq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcreateq, create, all_float, none) +DEF_MVE_FUNCTION (vcvtaq, vcvtx, cvtx, mx_or_none) DEF_MVE_FUNCTION (vcvtbq, vcvt_f16_f32, cvt_f16_f32, mx_or_none) DEF_MVE_FUNCTION (vcvtbq, vcvt_f32_f16, cvt_f32_f16, mx_or_none) +DEF_MVE_FUNCTION (vcvtmq, vcvtx, cvtx, mx_or_none) +DEF_MVE_FUNCTION (vcvtnq, vcvtx, cvtx, mx_or_none) +DEF_MVE_FUNCTION (vcvtpq, vcvtx, cvtx, mx_or_none) DEF_MVE_FUNCTION (vcvtq, vcvt, cvt, mx_or_none) DEF_MVE_FUNCTION (vcvttq, vcvt_f16_f32, cvt_f16_f32, mx_or_none) DEF_MVE_FUNCTION (vcvttq, vcvt_f32_f16, cvt_f32_f16, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 7b2107d9a0a..eb79bae8bf5 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -54,7 +54,11 @@ extern const function_base *const vcmulq_rot180; extern const function_base *const vcmulq_rot270; extern const function_base *const vcmulq_rot90; extern const function_base *const vcreateq; +extern const function_base *const vcvtaq; extern const function_base *const vcvtbq; +extern const function_base *const vcvtmq; +extern const function_base *const vcvtnq; +extern const function_base *const vcvtpq; extern const function_base *const vcvtq; extern const function_base *const vcvttq; extern const function_base *const vdupq; diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 4c554a47d85..07e63df35e4 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -227,6 +227,14 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { #define TYPES_cvt_f32_f16(S, D) \ D (f32, f16) +/* All the type combinations allowed by vcvtXq. */ +#define TYPES_cvtx(S, D) \ + D (s16, f16), \ + D (s32, f32), \ + \ + D (u16, f16), \ + D (u32, f32) + #define TYPES_reinterpret_signed1(D, A) \ D (A, s8), D (A, s16), D (A, s32), D (A, s64) @@ -309,6 +317,7 @@ DEF_MVE_TYPES_ARRAY (signed_32); DEF_MVE_TYPES_ARRAY (cvt); DEF_MVE_TYPES_ARRAY (cvt_f16_f32); DEF_MVE_TYPES_ARRAY (cvt_f32_f16); +DEF_MVE_TYPES_ARRAY (cvtx); DEF_MVE_TYPES_ARRAY (reinterpret_integer); DEF_MVE_TYPES_ARRAY (reinterpret_float); diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 5c35e08d754..448407627e9 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -137,10 +137,6 @@ #define vsetq_lane(__a, __b, __idx) __arm_vsetq_lane(__a, __b, __idx) #define vgetq_lane(__a, __idx) __arm_vgetq_lane(__a, __idx) #define vshlcq_m(__a, __b, __imm, __p) __arm_vshlcq_m(__a, __b, __imm, __p) -#define vcvtaq_m(__inactive, __a, __p) __arm_vcvtaq_m(__inactive, __a, __p) -#define vcvtmq_m(__inactive, __a, __p) __arm_vcvtmq_m(__inactive, __a, __p) -#define vcvtnq_m(__inactive, __a, __p) __arm_vcvtnq_m(__inactive, __a, __p) -#define vcvtpq_m(__inactive, __a, __p) __arm_vcvtpq_m(__inactive, __a, __p) #define vst4q_s8( __addr, __value) __arm_vst4q_s8( __addr, __value) @@ -151,22 +147,6 @@ #define vst4q_u32( __addr, __value) __arm_vst4q_u32( __addr, __value) #define vst4q_f16( __addr, __value) __arm_vst4q_f16( __addr, __value) #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value) -#define vcvtaq_s16_f16(__a) __arm_vcvtaq_s16_f16(__a) -#define vcvtaq_s32_f32(__a) __arm_vcvtaq_s32_f32(__a) -#define vcvtnq_s16_f16(__a) __arm_vcvtnq_s16_f16(__a) -#define vcvtnq_s32_f32(__a) __arm_vcvtnq_s32_f32(__a) -#define vcvtpq_s16_f16(__a) __arm_vcvtpq_s16_f16(__a) -#define vcvtpq_s32_f32(__a) __arm_vcvtpq_s32_f32(__a) -#define vcvtmq_s16_f16(__a) __arm_vcvtmq_s16_f16(__a) -#define vcvtmq_s32_f32(__a) __arm_vcvtmq_s32_f32(__a) -#define vcvtpq_u16_f16(__a) __arm_vcvtpq_u16_f16(__a) -#define vcvtpq_u32_f32(__a) __arm_vcvtpq_u32_f32(__a) -#define vcvtnq_u16_f16(__a) __arm_vcvtnq_u16_f16(__a) -#define vcvtnq_u32_f32(__a) __arm_vcvtnq_u32_f32(__a) -#define vcvtmq_u16_f16(__a) __arm_vcvtmq_u16_f16(__a) -#define vcvtmq_u32_f32(__a) __arm_vcvtmq_u32_f32(__a) -#define vcvtaq_u16_f16(__a) __arm_vcvtaq_u16_f16(__a) -#define vcvtaq_u32_f32(__a) __arm_vcvtaq_u32_f32(__a) #define vctp16q(__a) __arm_vctp16q(__a) #define vctp32q(__a) __arm_vctp32q(__a) #define vctp64q(__a) __arm_vctp64q(__a) @@ -200,28 +180,12 @@ #define vbicq_m_n_s32(__a, __imm, __p) __arm_vbicq_m_n_s32(__a, __imm, __p) #define vbicq_m_n_u16(__a, __imm, __p) __arm_vbicq_m_n_u16(__a, __imm, __p) #define vbicq_m_n_u32(__a, __imm, __p) __arm_vbicq_m_n_u32(__a, __imm, __p) -#define vcvtaq_m_s16_f16(__inactive, __a, __p) __arm_vcvtaq_m_s16_f16(__inactive, __a, __p) -#define vcvtaq_m_u16_f16(__inactive, __a, __p) __arm_vcvtaq_m_u16_f16(__inactive, __a, __p) -#define vcvtaq_m_s32_f32(__inactive, __a, __p) __arm_vcvtaq_m_s32_f32(__inactive, __a, __p) -#define vcvtaq_m_u32_f32(__inactive, __a, __p) __arm_vcvtaq_m_u32_f32(__inactive, __a, __p) #define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm) #define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm) #define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm) #define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vcvtmq_m_s16_f16(__inactive, __a, __p) __arm_vcvtmq_m_s16_f16(__inactive, __a, __p) -#define vcvtnq_m_s16_f16(__inactive, __a, __p) __arm_vcvtnq_m_s16_f16(__inactive, __a, __p) -#define vcvtpq_m_s16_f16(__inactive, __a, __p) __arm_vcvtpq_m_s16_f16(__inactive, __a, __p) -#define vcvtmq_m_u16_f16(__inactive, __a, __p) __arm_vcvtmq_m_u16_f16(__inactive, __a, __p) -#define vcvtnq_m_u16_f16(__inactive, __a, __p) __arm_vcvtnq_m_u16_f16(__inactive, __a, __p) -#define vcvtpq_m_u16_f16(__inactive, __a, __p) __arm_vcvtpq_m_u16_f16(__inactive, __a, __p) -#define vcvtmq_m_s32_f32(__inactive, __a, __p) __arm_vcvtmq_m_s32_f32(__inactive, __a, __p) -#define vcvtnq_m_s32_f32(__inactive, __a, __p) __arm_vcvtnq_m_s32_f32(__inactive, __a, __p) -#define vcvtpq_m_s32_f32(__inactive, __a, __p) __arm_vcvtpq_m_s32_f32(__inactive, __a, __p) -#define vcvtmq_m_u32_f32(__inactive, __a, __p) __arm_vcvtmq_m_u32_f32(__inactive, __a, __p) -#define vcvtnq_m_u32_f32(__inactive, __a, __p) __arm_vcvtnq_m_u32_f32(__inactive, __a, __p) -#define vcvtpq_m_u32_f32(__inactive, __a, __p) __arm_vcvtpq_m_u32_f32(__inactive, __a, __p) #define vbicq_m_s8(__inactive, __a, __b, __p) __arm_vbicq_m_s8(__inactive, __a, __b, __p) #define vbicq_m_s32(__inactive, __a, __b, __p) __arm_vbicq_m_s32(__inactive, __a, __b, __p) #define vbicq_m_s16(__inactive, __a, __b, __p) __arm_vbicq_m_s16(__inactive, __a, __b, __p) @@ -532,22 +496,6 @@ #define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) #define vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) #define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vcvtaq_x_s16_f16(__a, __p) __arm_vcvtaq_x_s16_f16(__a, __p) -#define vcvtaq_x_s32_f32(__a, __p) __arm_vcvtaq_x_s32_f32(__a, __p) -#define vcvtaq_x_u16_f16(__a, __p) __arm_vcvtaq_x_u16_f16(__a, __p) -#define vcvtaq_x_u32_f32(__a, __p) __arm_vcvtaq_x_u32_f32(__a, __p) -#define vcvtnq_x_s16_f16(__a, __p) __arm_vcvtnq_x_s16_f16(__a, __p) -#define vcvtnq_x_s32_f32(__a, __p) __arm_vcvtnq_x_s32_f32(__a, __p) -#define vcvtnq_x_u16_f16(__a, __p) __arm_vcvtnq_x_u16_f16(__a, __p) -#define vcvtnq_x_u32_f32(__a, __p) __arm_vcvtnq_x_u32_f32(__a, __p) -#define vcvtpq_x_s16_f16(__a, __p) __arm_vcvtpq_x_s16_f16(__a, __p) -#define vcvtpq_x_s32_f32(__a, __p) __arm_vcvtpq_x_s32_f32(__a, __p) -#define vcvtpq_x_u16_f16(__a, __p) __arm_vcvtpq_x_u16_f16(__a, __p) -#define vcvtpq_x_u32_f32(__a, __p) __arm_vcvtpq_x_u32_f32(__a, __p) -#define vcvtmq_x_s16_f16(__a, __p) __arm_vcvtmq_x_s16_f16(__a, __p) -#define vcvtmq_x_s32_f32(__a, __p) __arm_vcvtmq_x_s32_f32(__a, __p) -#define vcvtmq_x_u16_f16(__a, __p) __arm_vcvtmq_x_u16_f16(__a, __p) -#define vcvtmq_x_u32_f32(__a, __p) __arm_vcvtmq_x_u32_f32(__a, __p) #define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p) #define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p) #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) @@ -3690,118 +3638,6 @@ __arm_vst4q_f32 (float32_t * __addr, float32x4x4_t __value) __builtin_mve_vst4qv4sf (__addr, __rv.__o); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtpq_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtpq_uv4si (__a); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtnq_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtnq_uv4si (__a); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtmq_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtmq_uv4si (__a); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtaq_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtaq_uv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtaq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtaq_sv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtnq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtnq_sv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtpq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtpq_sv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtmq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtmq_sv4si (__a); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_f16 (float16x8_t __a, float16x8_t __b) @@ -3830,119 +3666,6 @@ __arm_vbicq_f32 (float32x4_t __a, float32x4_t __b) return __builtin_mve_vbicq_fv4sf (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv4si (__inactive, __a, __p); -} - - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_uv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv4si (__inactive, __a, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -4201,118 +3924,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -6771,118 +6382,6 @@ __arm_vbicq (float32x4_t __a, float32x4_t __b) return __arm_vbicq_f32 (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_u32_f32 (__inactive, __a, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7572,38 +7071,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) -#define __arm_vcvtaq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtmq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtmq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtmq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtmq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtmq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtnq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtnq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtnq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtnq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtnq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtpq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtpq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtpq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtpq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtpq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - #define __arm_vbicq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ From patchwork Wed Sep 4 13:26:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980803 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ZjOZl+RP; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNg520Qfz1yg3 for ; Wed, 4 Sep 2024 23:31:13 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6B71D3861036 for ; Wed, 4 Sep 2024 13:31:11 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc32.google.com (mail-oo1-xc32.google.com [IPv6:2607:f8b0:4864:20::c32]) by sourceware.org (Postfix) with ESMTPS id DB4783858C41 for ; Wed, 4 Sep 2024 13:27:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DB4783858C41 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DB4783858C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c32 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; cv=none; b=DskVaElIEKoW+k2PK+xS+NEaGuWf6p3bi1cIlgmhLalCWcgI+39HOl4iMaW49pHjwDrRDqm9/jI3N8vUyyMeY1Ub3XMrOWv7XmIZGceXeOaRHLDM7aCEeaItqOqJzgmJ9dDdpIP75Hbkw/ofBpdqnAN0c/mHlGYOIYSy7r8JEcA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; c=relaxed/simple; bh=g9rXFwc2iX40SaSbUShqfuznckEkUuznx4q+99Xc9KE=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Xp2B4ukqvCd+utiFi/Jf6Nru+Hc9UWNoK2oSleg5dcNRQzlEjzgAbs4SA9ysnF438M65VzOycUgd8QubwLH2qtVyzztGiKdH7/SN9wXGcFA1lY23iFegSbDZN7tCec8L2wvMfcX27M44zjs+mi0ZBngHpX76Wrf7jo7ruFaCMPY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc32.google.com with SMTP id 006d021491bc7-5df9542f3d8so5536845eaf.0 for ; Wed, 04 Sep 2024 06:27:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456453; x=1726061253; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=R/8403PJyhHEVs83vPy3EnbrxxaSCVz/jbzkTW+TAkk=; b=ZjOZl+RP6qRbiSpvFC8eLAVENntvx1sQEUvt3yca0dMjmRPZ+ZjWoqiKCbnFIE/zfC Zp78RG5lb0r1mTwIpUnSSMeStHdTdex0GMX4c/YJEp6htlZHckldLDj7HxMGAFzSRFRL JoUYWihFLVLV/D5jCT7itLTNBdbKw0015w7tx98nj5u2Y7LJOq7wL8Y6RSmStYzmVFMJ ZjAaJzsx7Hlz8pn0Yepe46SIhM3LfHFJZi9XLnIG1wehXByxpaQhn7PO9txz+Lfgnyhn MmCG1X9xBqTopkjRfOlhirxaHi25NsjndXdYUvJzxJeRzR0GYaoQsPWAwanEW+7/hj/b d8Sw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456453; x=1726061253; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=R/8403PJyhHEVs83vPy3EnbrxxaSCVz/jbzkTW+TAkk=; b=pcn7Q/ASHeo+HKR0FBPrg6iogflaYNkUGJ7CJ2U6Opgpx0S7j9jMxLf5RkaA5QsyLy IouBvFeF7GS4KGxIwEp1YdjYf3LRMQs8YFvD1lHjBDB4Uw6vGBgLV2+68e4Vv1FUJUcg M1FcDCQu4oStWPLIwUjKOSLzAE5EBvE9Ykk0sB38zNtKR4ggYsOGkiPHErHQmq84xeJq lpkcAiT5SmtdTm6/WqUiPY6KOsjKg/LDbxkEPDn7bIGyNZ4g61sEWhlSC8+Jw3UODliU QK5qCpJFLREibTFGZoOlweaYyUQCvF1+m/Z76+Ghf8yiKrD15k+hsYm6XYYv0y08iP1i okxg== X-Gm-Message-State: AOJu0YyrNo5rCS3vYhI8vMGoC4iq3DaUQZSSYW70tQpZd6dRfqSh/n1F uy1h8RBTPEMqiPXwsN2dSeJrU9fodGpmKbMe7GKZwLEsEmlTpB5oAgGg63vBcRqn+dROdavqPJ+ g7Dn6MA== X-Google-Smtp-Source: AGHT+IFcErtu5RvbP1xMVi8BjP76bHwxJHF6thrdfTqUZFny78kM4kiwSdF1cXtGPuE9/7Y94i7Rmg== X-Received: by 2002:a05:6820:1ad5:b0:5da:9b98:e208 with SMTP id 006d021491bc7-5dfad0203e6mr19648807eaf.5.1725456452435; Wed, 04 Sep 2024 06:27:32 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:31 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 13/36] arm: [MVE intrinsics] rework vbicq Date: Wed, 4 Sep 2024 13:26:27 +0000 Message-Id: <20240904132650.2720446-14-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vbicq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vbicq): New. * config/arm/arm-mve-builtins-base.def (vbicq): New. * config/arm/arm-mve-builtins-base.h (vbicq): New. * config/arm/arm-mve-builtins-functions.h (class unspec_based_mve_function_exact_insn_vbic): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Add support for vbicq. * config/arm/arm_mve.h (vbicq): Delete. (vbicq_m_n): Delete. (vbicq_m): Delete. (vbicq_x): Delete. (vbicq_u8): Delete. (vbicq_s8): Delete. (vbicq_u16): Delete. (vbicq_s16): Delete. (vbicq_u32): Delete. (vbicq_s32): Delete. (vbicq_n_u16): Delete. (vbicq_f16): Delete. (vbicq_n_s16): Delete. (vbicq_n_u32): Delete. (vbicq_f32): Delete. (vbicq_n_s32): Delete. (vbicq_m_n_s16): Delete. (vbicq_m_n_s32): Delete. (vbicq_m_n_u16): Delete. (vbicq_m_n_u32): Delete. (vbicq_m_s8): Delete. (vbicq_m_s32): Delete. (vbicq_m_s16): Delete. (vbicq_m_u8): Delete. (vbicq_m_u32): Delete. (vbicq_m_u16): Delete. (vbicq_m_f32): Delete. (vbicq_m_f16): Delete. (vbicq_x_s8): Delete. (vbicq_x_s16): Delete. (vbicq_x_s32): Delete. (vbicq_x_u8): Delete. (vbicq_x_u16): Delete. (vbicq_x_u32): Delete. (vbicq_x_f16): Delete. (vbicq_x_f32): Delete. (__arm_vbicq_u8): Delete. (__arm_vbicq_s8): Delete. (__arm_vbicq_u16): Delete. (__arm_vbicq_s16): Delete. (__arm_vbicq_u32): Delete. (__arm_vbicq_s32): Delete. (__arm_vbicq_n_u16): Delete. (__arm_vbicq_n_s16): Delete. (__arm_vbicq_n_u32): Delete. (__arm_vbicq_n_s32): Delete. (__arm_vbicq_m_n_s16): Delete. (__arm_vbicq_m_n_s32): Delete. (__arm_vbicq_m_n_u16): Delete. (__arm_vbicq_m_n_u32): Delete. (__arm_vbicq_m_s8): Delete. (__arm_vbicq_m_s32): Delete. (__arm_vbicq_m_s16): Delete. (__arm_vbicq_m_u8): Delete. (__arm_vbicq_m_u32): Delete. (__arm_vbicq_m_u16): Delete. (__arm_vbicq_x_s8): Delete. (__arm_vbicq_x_s16): Delete. (__arm_vbicq_x_s32): Delete. (__arm_vbicq_x_u8): Delete. (__arm_vbicq_x_u16): Delete. (__arm_vbicq_x_u32): Delete. (__arm_vbicq_f16): Delete. (__arm_vbicq_f32): Delete. (__arm_vbicq_m_f32): Delete. (__arm_vbicq_m_f16): Delete. (__arm_vbicq_x_f16): Delete. (__arm_vbicq_x_f32): Delete. (__arm_vbicq): Delete. (__arm_vbicq_m_n): Delete. (__arm_vbicq_m): Delete. (__arm_vbicq_x): Delete. * config/arm/mve.md (mve_vbicq_u): Rename into ... (@mve_vbicq_u): ... this. (mve_vbicq_s): Rename into ... (@mve_vbicq_s): ... this. (mve_vbicq_f): Rename into ... (@mve_vbicq_f): ... this. --- gcc/config/arm/arm-mve-builtins-base.cc | 1 + gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm-mve-builtins-functions.h | 54 ++ gcc/config/arm/arm-mve-builtins.cc | 1 + gcc/config/arm/arm_mve.h | 574 -------------------- gcc/config/arm/mve.md | 6 +- 7 files changed, 62 insertions(+), 577 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 281f3749bce..e33603ec1f3 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -481,6 +481,7 @@ FUNCTION_PRED_P_S_U (vaddlvq, VADDLVQ) FUNCTION_PRED_P_S_U (vaddvq, VADDVQ) FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ) FUNCTION_WITH_RTX_M (vandq, AND, VANDQ) +FUNCTION (vbicq, unspec_based_mve_function_exact_insn_vbic, (VBICQ_N_S, VBICQ_N_U, VBICQ_M_S, VBICQ_M_U, VBICQ_M_F, VBICQ_M_N_S, VBICQ_M_N_U)) FUNCTION_ONLY_N (vbrsrq, VBRSRQ) FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M, VCADDQ_ROT90_M, VCADDQ_ROT90_M_F)) FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M, VCADDQ_ROT270_M, VCADDQ_ROT270_M_F)) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index cf733f7627a..aa7b71387f9 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -27,6 +27,7 @@ DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vaddvaq, unary_int32_acc, all_integer, p_or_none) DEF_MVE_FUNCTION (vaddvq, unary_int32, all_integer, p_or_none) DEF_MVE_FUNCTION (vandq, binary, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vbicq, binary_orrq, all_integer, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_integer, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_integer, mx_or_none) @@ -161,6 +162,7 @@ DEF_MVE_FUNCTION (vabdq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vabsq, unary, all_float, mx_or_none) DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_float, mx_or_none) DEF_MVE_FUNCTION (vandq, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vbicq, binary_orrq, all_float, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index eb79bae8bf5..e6b828a4e1e 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -32,6 +32,7 @@ extern const function_base *const vaddq; extern const function_base *const vaddvaq; extern const function_base *const vaddvq; extern const function_base *const vandq; +extern const function_base *const vbicq; extern const function_base *const vbrsrq; extern const function_base *const vcaddq_rot270; extern const function_base *const vcaddq_rot90; diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h index 35cb5242b77..0bb91f5ec1f 100644 --- a/gcc/config/arm/arm-mve-builtins-functions.h +++ b/gcc/config/arm/arm-mve-builtins-functions.h @@ -468,6 +468,60 @@ public: } }; +/* Map the function directly to CODE (M) for vbic-like builtins. The difference + with unspec_based_mve_function_exact_insn is that this function has vbic + hardcoded for the PRED_none, MODE_none version, rather than using an + RTX. */ +class unspec_based_mve_function_exact_insn_vbic : public unspec_based_mve_function_base +{ +public: + CONSTEXPR unspec_based_mve_function_exact_insn_vbic (int unspec_for_n_sint, + int unspec_for_n_uint, + int unspec_for_m_sint, + int unspec_for_m_uint, + int unspec_for_m_fp, + int unspec_for_m_n_sint, + int unspec_for_m_n_uint) + : unspec_based_mve_function_base (UNKNOWN, + UNKNOWN, + UNKNOWN, + -1, -1, -1, /* No non-predicated, no mode intrinsics. */ + unspec_for_n_sint, + unspec_for_n_uint, + -1, + unspec_for_m_sint, + unspec_for_m_uint, + unspec_for_m_fp, + unspec_for_m_n_sint, + unspec_for_m_n_uint, + -1) + {} + + rtx + expand (function_expander &e) const override + { + machine_mode mode = e.vector_mode (0); + insn_code code; + + /* No suffix, no predicate, use the right RTX code. */ + if (e.pred == PRED_none + && e.mode_suffix_id == MODE_none) + { + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_vbicq_u (mode); + else + code = code_for_mve_vbicq_s (mode); + else + code = code_for_mve_vbicq_f (mode); + + return e.use_exact_insn (code); + } + + return expand_unspec (e); + } +}; + /* Map the comparison functions. */ class unspec_based_mve_function_exact_insn_vcmp : public unspec_based_mve_function_base { diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 07e63df35e4..13c666b8f6a 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -737,6 +737,7 @@ function_instance::has_inactive_argument () const return false; if (mode_suffix_id == MODE_r + || (base == functions::vbicq && mode_suffix_id == MODE_n) || base == functions::vcmlaq || base == functions::vcmlaq_rot90 || base == functions::vcmlaq_rot180 diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 448407627e9..3fd6980a58d 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -43,10 +43,7 @@ #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE #define vst4q(__addr, __value) __arm_vst4q(__addr, __value) #define vornq(__a, __b) __arm_vornq(__a, __b) -#define vbicq(__a, __b) __arm_vbicq(__a, __b) -#define vbicq_m_n(__a, __imm, __p) __arm_vbicq_m_n(__a, __imm, __p) #define vshlcq(__a, __b, __imm) __arm_vshlcq(__a, __b, __imm) -#define vbicq_m(__inactive, __a, __b, __p) __arm_vbicq_m(__inactive, __a, __b, __p) #define vornq_m(__inactive, __a, __b, __p) __arm_vornq_m(__inactive, __a, __b, __p) #define vstrbq_scatter_offset(__base, __offset, __value) __arm_vstrbq_scatter_offset(__base, __offset, __value) #define vstrbq(__addr, __value) __arm_vstrbq(__addr, __value) @@ -119,7 +116,6 @@ #define viwdupq_x_u8(__a, __b, __imm, __p) __arm_viwdupq_x_u8(__a, __b, __imm, __p) #define viwdupq_x_u16(__a, __b, __imm, __p) __arm_viwdupq_x_u16(__a, __b, __imm, __p) #define viwdupq_x_u32(__a, __b, __imm, __p) __arm_viwdupq_x_u32(__a, __b, __imm, __p) -#define vbicq_x(__a, __b, __p) __arm_vbicq_x(__a, __b, __p) #define vornq_x(__a, __b, __p) __arm_vornq_x(__a, __b, __p) #define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out) #define vadciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m(__inactive, __a, __b, __carry_out, __p) @@ -153,53 +149,29 @@ #define vctp8q(__a) __arm_vctp8q(__a) #define vpnot(__a) __arm_vpnot(__a) #define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b) -#define vbicq_u8(__a, __b) __arm_vbicq_u8(__a, __b) #define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b) -#define vbicq_s8(__a, __b) __arm_vbicq_s8(__a, __b) #define vornq_u16(__a, __b) __arm_vornq_u16(__a, __b) -#define vbicq_u16(__a, __b) __arm_vbicq_u16(__a, __b) #define vornq_s16(__a, __b) __arm_vornq_s16(__a, __b) -#define vbicq_s16(__a, __b) __arm_vbicq_s16(__a, __b) #define vornq_u32(__a, __b) __arm_vornq_u32(__a, __b) -#define vbicq_u32(__a, __b) __arm_vbicq_u32(__a, __b) #define vornq_s32(__a, __b) __arm_vornq_s32(__a, __b) -#define vbicq_s32(__a, __b) __arm_vbicq_s32(__a, __b) -#define vbicq_n_u16(__a, __imm) __arm_vbicq_n_u16(__a, __imm) #define vornq_f16(__a, __b) __arm_vornq_f16(__a, __b) -#define vbicq_f16(__a, __b) __arm_vbicq_f16(__a, __b) -#define vbicq_n_s16(__a, __imm) __arm_vbicq_n_s16(__a, __imm) -#define vbicq_n_u32(__a, __imm) __arm_vbicq_n_u32(__a, __imm) #define vornq_f32(__a, __b) __arm_vornq_f32(__a, __b) -#define vbicq_f32(__a, __b) __arm_vbicq_f32(__a, __b) -#define vbicq_n_s32(__a, __imm) __arm_vbicq_n_s32(__a, __imm) #define vctp8q_m(__a, __p) __arm_vctp8q_m(__a, __p) #define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p) #define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p) #define vctp16q_m(__a, __p) __arm_vctp16q_m(__a, __p) -#define vbicq_m_n_s16(__a, __imm, __p) __arm_vbicq_m_n_s16(__a, __imm, __p) -#define vbicq_m_n_s32(__a, __imm, __p) __arm_vbicq_m_n_s32(__a, __imm, __p) -#define vbicq_m_n_u16(__a, __imm, __p) __arm_vbicq_m_n_u16(__a, __imm, __p) -#define vbicq_m_n_u32(__a, __imm, __p) __arm_vbicq_m_n_u32(__a, __imm, __p) #define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm) #define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm) #define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm) #define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vbicq_m_s8(__inactive, __a, __b, __p) __arm_vbicq_m_s8(__inactive, __a, __b, __p) -#define vbicq_m_s32(__inactive, __a, __b, __p) __arm_vbicq_m_s32(__inactive, __a, __b, __p) -#define vbicq_m_s16(__inactive, __a, __b, __p) __arm_vbicq_m_s16(__inactive, __a, __b, __p) -#define vbicq_m_u8(__inactive, __a, __b, __p) __arm_vbicq_m_u8(__inactive, __a, __b, __p) -#define vbicq_m_u32(__inactive, __a, __b, __p) __arm_vbicq_m_u32(__inactive, __a, __b, __p) -#define vbicq_m_u16(__inactive, __a, __b, __p) __arm_vbicq_m_u16(__inactive, __a, __b, __p) #define vornq_m_s8(__inactive, __a, __b, __p) __arm_vornq_m_s8(__inactive, __a, __b, __p) #define vornq_m_s32(__inactive, __a, __b, __p) __arm_vornq_m_s32(__inactive, __a, __b, __p) #define vornq_m_s16(__inactive, __a, __b, __p) __arm_vornq_m_s16(__inactive, __a, __b, __p) #define vornq_m_u8(__inactive, __a, __b, __p) __arm_vornq_m_u8(__inactive, __a, __b, __p) #define vornq_m_u32(__inactive, __a, __b, __p) __arm_vornq_m_u32(__inactive, __a, __b, __p) #define vornq_m_u16(__inactive, __a, __b, __p) __arm_vornq_m_u16(__inactive, __a, __b, __p) -#define vbicq_m_f32(__inactive, __a, __b, __p) __arm_vbicq_m_f32(__inactive, __a, __b, __p) -#define vbicq_m_f16(__inactive, __a, __b, __p) __arm_vbicq_m_f16(__inactive, __a, __b, __p) #define vornq_m_f32(__inactive, __a, __b, __p) __arm_vornq_m_f32(__inactive, __a, __b, __p) #define vornq_m_f16(__inactive, __a, __b, __p) __arm_vornq_m_f16(__inactive, __a, __b, __p) #define vstrbq_s8( __addr, __value) __arm_vstrbq_s8( __addr, __value) @@ -484,20 +456,12 @@ #define viwdupq_x_wb_u8(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u8(__a, __b, __imm, __p) #define viwdupq_x_wb_u16(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u16(__a, __b, __imm, __p) #define viwdupq_x_wb_u32(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u32(__a, __b, __imm, __p) -#define vbicq_x_s8(__a, __b, __p) __arm_vbicq_x_s8(__a, __b, __p) -#define vbicq_x_s16(__a, __b, __p) __arm_vbicq_x_s16(__a, __b, __p) -#define vbicq_x_s32(__a, __b, __p) __arm_vbicq_x_s32(__a, __b, __p) -#define vbicq_x_u8(__a, __b, __p) __arm_vbicq_x_u8(__a, __b, __p) -#define vbicq_x_u16(__a, __b, __p) __arm_vbicq_x_u16(__a, __b, __p) -#define vbicq_x_u32(__a, __b, __p) __arm_vbicq_x_u32(__a, __b, __p) #define vornq_x_s8(__a, __b, __p) __arm_vornq_x_s8(__a, __b, __p) #define vornq_x_s16(__a, __b, __p) __arm_vornq_x_s16(__a, __b, __p) #define vornq_x_s32(__a, __b, __p) __arm_vornq_x_s32(__a, __b, __p) #define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) #define vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) #define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p) -#define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p) #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) #define vornq_x_f32(__a, __b, __p) __arm_vornq_x_f32(__a, __b, __p) #define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) @@ -708,13 +672,6 @@ __arm_vornq_u8 (uint8x16_t __a, uint8x16_t __b) return __builtin_mve_vornq_uv16qi (__a, __b); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_u8 (uint8x16_t __a, uint8x16_t __b) -{ - return __builtin_mve_vbicq_uv16qi (__a, __b); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_s8 (int8x16_t __a, int8x16_t __b) @@ -722,13 +679,6 @@ __arm_vornq_s8 (int8x16_t __a, int8x16_t __b) return __builtin_mve_vornq_sv16qi (__a, __b); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vbicq_sv16qi (__a, __b); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_u16 (uint16x8_t __a, uint16x8_t __b) @@ -736,13 +686,6 @@ __arm_vornq_u16 (uint16x8_t __a, uint16x8_t __b) return __builtin_mve_vornq_uv8hi (__a, __b); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_u16 (uint16x8_t __a, uint16x8_t __b) -{ - return __builtin_mve_vbicq_uv8hi (__a, __b); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_s16 (int16x8_t __a, int16x8_t __b) @@ -750,13 +693,6 @@ __arm_vornq_s16 (int16x8_t __a, int16x8_t __b) return __builtin_mve_vornq_sv8hi (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_s16 (int16x8_t __a, int16x8_t __b) -{ - return __builtin_mve_vbicq_sv8hi (__a, __b); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_u32 (uint32x4_t __a, uint32x4_t __b) @@ -764,13 +700,6 @@ __arm_vornq_u32 (uint32x4_t __a, uint32x4_t __b) return __builtin_mve_vornq_uv4si (__a, __b); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_u32 (uint32x4_t __a, uint32x4_t __b) -{ - return __builtin_mve_vbicq_uv4si (__a, __b); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_s32 (int32x4_t __a, int32x4_t __b) @@ -778,41 +707,6 @@ __arm_vornq_s32 (int32x4_t __a, int32x4_t __b) return __builtin_mve_vornq_sv4si (__a, __b); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_s32 (int32x4_t __a, int32x4_t __b) -{ - return __builtin_mve_vbicq_sv4si (__a, __b); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_u16 (uint16x8_t __a, const int __imm) -{ - return __builtin_mve_vbicq_n_uv8hi (__a, __imm); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_s16 (int16x8_t __a, const int __imm) -{ - return __builtin_mve_vbicq_n_sv8hi (__a, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_u32 (uint32x4_t __a, const int __imm) -{ - return __builtin_mve_vbicq_n_uv4si (__a, __imm); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_s32 (int32x4_t __a, const int __imm) -{ - return __builtin_mve_vbicq_n_sv4si (__a, __imm); -} - __extension__ extern __inline mve_pred16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vctp8q_m (uint32_t __a, mve_pred16_t __p) @@ -841,34 +735,6 @@ __arm_vctp16q_m (uint32_t __a, mve_pred16_t __p) return __builtin_mve_vctp16q_mv8bi (__a, __p); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n_s16 (int16x8_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_n_sv8hi (__a, __imm, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n_s32 (int32x4_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_n_sv4si (__a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n_u16 (uint16x8_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_n_uv8hi (__a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n_u32 (uint32x4_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_n_uv4si (__a, __imm, __p); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vshlcq_s8 (int8x16_t __a, uint32_t * __b, const int __imm) @@ -923,48 +789,6 @@ __arm_vshlcq_u32 (uint32x4_t __a, uint32_t * __b, const int __imm) return __res; } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_sv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_sv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_sv8hi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_uv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_uv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_uv8hi (__inactive, __a, __b, __p); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) @@ -2834,48 +2658,6 @@ __arm_viwdupq_x_wb_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16 return __res; } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) @@ -3645,13 +3427,6 @@ __arm_vornq_f16 (float16x8_t __a, float16x8_t __b) return __builtin_mve_vornq_fv8hf (__a, __b); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_f16 (float16x8_t __a, float16x8_t __b) -{ - return __builtin_mve_vbicq_fv8hf (__a, __b); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_f32 (float32x4_t __a, float32x4_t __b) @@ -3659,27 +3434,6 @@ __arm_vornq_f32 (float32x4_t __a, float32x4_t __b) return __builtin_mve_vornq_fv4sf (__a, __b); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_f32 (float32x4_t __a, float32x4_t __b) -{ - return __builtin_mve_vbicq_fv4sf (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_fv4sf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_fv8hf (__inactive, __a, __b, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -3924,20 +3678,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vbicq_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -4119,13 +3859,6 @@ __arm_vornq (uint8x16_t __a, uint8x16_t __b) return __arm_vornq_u8 (__a, __b); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint8x16_t __a, uint8x16_t __b) -{ - return __arm_vbicq_u8 (__a, __b); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (int8x16_t __a, int8x16_t __b) @@ -4133,13 +3866,6 @@ __arm_vornq (int8x16_t __a, int8x16_t __b) return __arm_vornq_s8 (__a, __b); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int8x16_t __a, int8x16_t __b) -{ - return __arm_vbicq_s8 (__a, __b); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (uint16x8_t __a, uint16x8_t __b) @@ -4147,13 +3873,6 @@ __arm_vornq (uint16x8_t __a, uint16x8_t __b) return __arm_vornq_u16 (__a, __b); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint16x8_t __a, uint16x8_t __b) -{ - return __arm_vbicq_u16 (__a, __b); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (int16x8_t __a, int16x8_t __b) @@ -4161,13 +3880,6 @@ __arm_vornq (int16x8_t __a, int16x8_t __b) return __arm_vornq_s16 (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int16x8_t __a, int16x8_t __b) -{ - return __arm_vbicq_s16 (__a, __b); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (uint32x4_t __a, uint32x4_t __b) @@ -4175,13 +3887,6 @@ __arm_vornq (uint32x4_t __a, uint32x4_t __b) return __arm_vornq_u32 (__a, __b); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint32x4_t __a, uint32x4_t __b) -{ - return __arm_vbicq_u32 (__a, __b); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (int32x4_t __a, int32x4_t __b) @@ -4189,69 +3894,6 @@ __arm_vornq (int32x4_t __a, int32x4_t __b) return __arm_vornq_s32 (__a, __b); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int32x4_t __a, int32x4_t __b) -{ - return __arm_vbicq_s32 (__a, __b); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint16x8_t __a, const int __imm) -{ - return __arm_vbicq_n_u16 (__a, __imm); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int16x8_t __a, const int __imm) -{ - return __arm_vbicq_n_s16 (__a, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint32x4_t __a, const int __imm) -{ - return __arm_vbicq_n_u32 (__a, __imm); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int32x4_t __a, const int __imm) -{ - return __arm_vbicq_n_s32 (__a, __imm); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n (int16x8_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vbicq_m_n_s16 (__a, __imm, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n (int32x4_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vbicq_m_n_s32 (__a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n (uint16x8_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vbicq_m_n_u16 (__a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m_n (uint32x4_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vbicq_m_n_u32 (__a, __imm, __p); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vshlcq (int8x16_t __a, uint32_t * __b, const int __imm) @@ -4294,48 +3936,6 @@ __arm_vshlcq (uint32x4_t __a, uint32_t * __b, const int __imm) return __arm_vshlcq_u32 (__a, __b, __imm); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_s8 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_s32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_s16 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_u8 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_u32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_u16 (__inactive, __a, __b, __p); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) @@ -5778,48 +5378,6 @@ __arm_viwdupq_x_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t return __arm_viwdupq_x_wb_u32 (__a, __b, __imm, __p); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_s8 (__a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_s16 (__a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_s32 (__a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_u8 (__a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_u16 (__a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_u32 (__a, __b, __p); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) @@ -6361,13 +5919,6 @@ __arm_vornq (float16x8_t __a, float16x8_t __b) return __arm_vornq_f16 (__a, __b); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (float16x8_t __a, float16x8_t __b) -{ - return __arm_vbicq_f16 (__a, __b); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (float32x4_t __a, float32x4_t __b) @@ -6375,27 +5926,6 @@ __arm_vornq (float32x4_t __a, float32x4_t __b) return __arm_vornq_f32 (__a, __b); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (float32x4_t __a, float32x4_t __b) -{ - return __arm_vbicq_f32 (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_f16 (__inactive, __a, __b, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -6578,20 +6108,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32 __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_f32 (__a, __b, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -7027,22 +6543,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16_t_ptr][__ARM_mve_type_float16x8x4_t]: __arm_vst4q_f16 (__ARM_mve_coerce_f16_ptr(__p0, float16_t *), __ARM_mve_coerce(__p1, float16x8x4_t)), \ int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4x4_t]: __arm_vst4q_f32 (__ARM_mve_coerce_f32_ptr(__p0, float32_t *), __ARM_mve_coerce(__p1, float32x4x4_t)));}) -#define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - #define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -7055,13 +6555,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) -#define __arm_vbicq_m_n(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vbicq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vbicq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) - #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ @@ -7071,19 +6564,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) -#define __arm_vbicq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ @@ -7387,18 +6867,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint32x4_t]: __arm_vstrwq_scatter_base_wb_p_u32 (p0, p1, __ARM_mve_coerce(__p2, uint32x4_t), p3), \ int (*)[__ARM_mve_type_float32x4_t]: __arm_vstrwq_scatter_base_wb_p_f32 (p0, p1, __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vbicq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ @@ -7469,20 +6937,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) -#define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) - #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ @@ -7492,24 +6946,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) -#define __arm_vbicq_m_n(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vbicq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vbicq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) - -#define __arm_vbicq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ @@ -7750,16 +7186,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) -#define __arm_vbicq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vld1q_z(p0,p1) ( _Generic( (int (*)[__ARM_mve_typeid(p0)])0, \ int (*)[__ARM_mve_type_int8_t_ptr]: __arm_vld1q_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), p1), \ int (*)[__ARM_mve_type_int16_t_ptr]: __arm_vld1q_z_s16 (__ARM_mve_coerce_s16_ptr(p0, int16_t *), p1), \ diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 41c7e73a161..c0dd4b9019e 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -803,7 +803,7 @@ (define_expand "mve_vandq_s" ;; ;; [vbicq_s, vbicq_u]) ;; -(define_insn "mve_vbicq_u" +(define_insn "@mve_vbicq_u" [ (set (match_operand:MVE_2 0 "s_register_operand" "=w") (and:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand" "w")) @@ -815,7 +815,7 @@ (define_insn "mve_vbicq_u" (set_attr "type" "mve_move") ]) -(define_expand "mve_vbicq_s" +(define_expand "@mve_vbicq_s" [ (set (match_operand:MVE_2 0 "s_register_operand") (and:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand")) @@ -1209,7 +1209,7 @@ (define_insn "mve_vandq_f" ;; ;; [vbicq_f]) ;; -(define_insn "mve_vbicq_f" +(define_insn "@mve_vbicq_f" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (and:MVE_0 (not:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w")) From patchwork Wed Sep 4 13:26:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980800 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=q9sFDuq5; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNf85bYkz1yg3 for ; Wed, 4 Sep 2024 23:30:24 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9D1CC3865C16 for ; Wed, 4 Sep 2024 13:30:22 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc30.google.com (mail-oo1-xc30.google.com [IPv6:2607:f8b0:4864:20::c30]) by sourceware.org (Postfix) with ESMTPS id 44BEE385700F for ; Wed, 4 Sep 2024 13:27:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 44BEE385700F Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 44BEE385700F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c30 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456457; cv=none; b=TBNxUstk/lBQH1lUdCehf6rL73U05cSMqj/OtecJm/IC8jiVAWKvTlOteZ4UbTuOFnAHbOx6KHvXqhOiUmT3Fekvwh7dlH5D6l0yoe5o6qH+PJJerP+DzFNv6RfjrYc6mb2pwQ8xbk8ZOkwh0znqHFSLH5KY31r+Q/Yk3GqtWUI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456457; c=relaxed/simple; bh=tETKiWdAzpu/oSsBolLuXfjaT2YrWFZTgH8ApuMPfJg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=jvdQe9WMENOQFid3DfUj9/DUPiPPTdOeOsH+Qf/P1sfP/DSEqixq9iN2f5NUqEOs0LjjemMNZ2deUYrEadOA4FuwA3TOrZ2Zm+DY7qDrpAKh33tfjGooaBOulEV/fezKtjnsky0f3CoEHXYDqXXjCDG7yozD3dzpLxOPPWqqSDM= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc30.google.com with SMTP id 006d021491bc7-5e16cb56d4aso1926336eaf.1 for ; Wed, 04 Sep 2024 06:27:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456453; x=1726061253; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=gCTFkTEXMFNqv/aGPmm6fvek61Jg8J2LoDXzDXT83+I=; b=q9sFDuq5b2RT2o1p6fUXSo2iIGGXyeW7ZYhMHX1F0T5d1y3M1YGeS0id5oj3Itil2j jt5rWHzLSBrMq1e/DQR9m6M+2+FKFlf2ZefMGMqFGlK/NLeXVRuEch/+/dJpdelzuXjF sLDy0tbsBCzZq7+rJRN5Ayl8W8m+o2pVabWlbw73pz8zoXDjmwse3HIbB4iY5r3U5jwi FzXBbCfR/g6iPxdL+2lHtfjPzR5e2AQVj6M3nKDTz5/nPi4u2dcRT62tmwkP3RWALJn5 SLbSF/xQlfhvXh4pbtQTOuSYseROJ8ZADKpKEMK7bMAWI7c8PFtIDAEMBXGI7/U6rvGD UMOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456453; x=1726061253; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gCTFkTEXMFNqv/aGPmm6fvek61Jg8J2LoDXzDXT83+I=; b=iIVlxjd0gu3kTlfQ3CvObliX1Lvjh3bSKeGDOFz7oq9Hx4hFnJ4W8W7OBPwrpzMNrv i5s80EW2Vl5Y+5WJaEQMQEqp5T8Hr/i/z5MsNA0hU2lvO0Nv/S+h4Q34qdQ7B1ijo/AG PSaktSKyUKwbdrevPAA081vwGCrD7ebrG/xAUjhf/Qbx9sfZqZpEj6EpKnQ7ITNLokEE 37bbGLj2Ntaa4Altt7QXH7MfgCVKKJN768GlMBKV3IIeyfsqCWaGtJZjLTcrIxgwBF+e IOUTMUDUoT4ih3biGBhsfByKLZAILnHaLTbY7SvYuGUYkcJHC0bPbjcmFDbXsvl3nDVu mdWQ== X-Gm-Message-State: AOJu0YzV15d5AjuBfX+kgLGMqVNnL2seZH/HYsOUn6MXHtdnTK/iHt8R hyCf++RYT8CTMqXBnsCmjCKckOreJcpIt4aBRf2ZdXo5MWOHGlo4uPtE/ZPLW+jAAxqfWUXSh9k JCs034w== X-Google-Smtp-Source: AGHT+IGSHk423uAlWLQMSrjPfDSVIXfKo1pH1DVzKkXaQheGHv1Ex5ZkNPJ4FkT+uSWjiZ05uZufdQ== X-Received: by 2002:a05:6820:545:b0:5c4:57d:691e with SMTP id 006d021491bc7-5dfacf23410mr18429217eaf.2.1725456453091; Wed, 04 Sep 2024 06:27:33 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:32 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 14/36] arm: [MVE intrinsics] factorize vorn Date: Wed, 4 Sep 2024 13:26:28 +0000 Message-Id: <20240904132650.2720446-15-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Factorize vorn so that they use parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (MVE_INT_M_BINARY_LOGIC): Add VORNQ_M_S, VORNQ_M_U. (MVE_FP_M_BINARY_LOGIC): Add VORNQ_M_F. (mve_insn): Add VORNQ_M_S, VORNQ_M_U, VORNQ_M_F. * config/arm/mve.md (mve_vornq_s): Rename into ... (@mve_vornq_s): ... this. (mve_vornq_u): Rename into ... (@mve_vornq_u): ... this. (mve_vornq_f): Rename into ... (@mve_vornq_f): ... this. (mve_vornq_m_): Merge into vand/vbic pattern. (mve_vornq_m_f): Likewise. --- gcc/config/arm/iterators.md | 3 +++ gcc/config/arm/mve.md | 48 ++++++------------------------------- 2 files changed, 10 insertions(+), 41 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index 162c0d56bfb..3a1825ebab2 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -444,6 +444,7 @@ (define_int_iterator MVE_INT_M_BINARY_LOGIC [ VANDQ_M_S VANDQ_M_U VBICQ_M_S VBICQ_M_U VEORQ_M_S VEORQ_M_U + VORNQ_M_S VORNQ_M_U VORRQ_M_S VORRQ_M_U ]) @@ -594,6 +595,7 @@ (define_int_iterator MVE_FP_M_BINARY_LOGIC [ VANDQ_M_F VBICQ_M_F VEORQ_M_F + VORNQ_M_F VORRQ_M_F ]) @@ -1094,6 +1096,7 @@ (define_int_attr mve_insn [ (VMVNQ_N_S "vmvn") (VMVNQ_N_U "vmvn") (VNEGQ_M_F "vneg") (VNEGQ_M_S "vneg") + (VORNQ_M_S "vorn") (VORNQ_M_U "vorn") (VORNQ_M_F "vorn") (VORRQ_M_N_S "vorr") (VORRQ_M_N_U "vorr") (VORRQ_M_S "vorr") (VORRQ_M_U "vorr") (VORRQ_M_F "vorr") (VORRQ_N_S "vorr") (VORRQ_N_U "vorr") diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index c0dd4b9019e..3d8b199d9d6 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -1021,9 +1021,9 @@ (define_insn "mve_q" ]) ;; -;; [vornq_u, vornq_s]) +;; [vornq_u, vornq_s] ;; -(define_insn "mve_vornq_s" +(define_insn "@mve_vornq_s" [ (set (match_operand:MVE_2 0 "s_register_operand" "=w") (ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand" "w")) @@ -1035,7 +1035,7 @@ (define_insn "mve_vornq_s" (set_attr "type" "mve_move") ]) -(define_expand "mve_vornq_u" +(define_expand "@mve_vornq_u" [ (set (match_operand:MVE_2 0 "s_register_operand") (ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand")) @@ -1429,9 +1429,9 @@ (define_insn "mve_q_f" ]) ;; -;; [vornq_f]) +;; [vornq_f] ;; -(define_insn "mve_vornq_f" +(define_insn "@mve_vornq_f" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (ior:MVE_0 (not:MVE_0 (match_operand:MVE_0 2 "s_register_operand" "w")) @@ -2710,6 +2710,7 @@ (define_insn "@mve_q_m_" ;; [vandq_m_u, vandq_m_s] ;; [vbicq_m_u, vbicq_m_s] ;; [veorq_m_u, veorq_m_s] +;; [vornq_m_u, vornq_m_s] ;; [vorrq_m_u, vorrq_m_s] ;; (define_insn "@mve_q_m_" @@ -2836,24 +2837,6 @@ (define_insn "@mve_q_int_m_" (set_attr "type" "mve_move") (set_attr "length""8")]) -;; -;; [vornq_m_u, vornq_m_s]) -;; -(define_insn "mve_vornq_m_" - [ - (set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0") - (match_operand:MVE_2 2 "s_register_operand" "w") - (match_operand:MVE_2 3 "s_register_operand" "w") - (match_operand: 4 "vpr_register_operand" "Up")] - VORNQ_M)) - ] - "TARGET_HAVE_MVE" - "vpst\;vornt\t%q0, %q2, %q3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vornq_")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - ;; ;; [vqshlq_m_n_s, vqshlq_m_n_u] ;; [vshlq_m_n_s, vshlq_m_n_u] @@ -3108,6 +3091,7 @@ (define_insn "@mve_q_m_n_f" ;; [vandq_m_f] ;; [vbicq_m_f] ;; [veorq_m_f] +;; [vornq_m_f] ;; [vorrq_m_f] ;; (define_insn "@mve_q_m_f" @@ -3187,24 +3171,6 @@ (define_insn "@mve_q_m_f" (set_attr "type" "mve_move") (set_attr "length""8")]) -;; -;; [vornq_m_f]) -;; -(define_insn "mve_vornq_m_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "=w") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w") - (match_operand: 4 "vpr_register_operand" "Up")] - VORNQ_M_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vornt\t%q0, %q2, %q3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vornq_f")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - ;; ;; [vstrbq_s vstrbq_u] ;; From patchwork Wed Sep 4 13:26:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980805 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Qkfc9nkg; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNgX2HkKz1yg3 for ; Wed, 4 Sep 2024 23:31:36 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2B6273865C33 for ; Wed, 4 Sep 2024 13:31:34 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ot1-x330.google.com (mail-ot1-x330.google.com [IPv6:2607:f8b0:4864:20::330]) by sourceware.org (Postfix) with ESMTPS id B0805385DDFA for ; Wed, 4 Sep 2024 13:27:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B0805385DDFA Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B0805385DDFA Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::330 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; cv=none; b=YJuKbqML9MmKs8LWYve+bk/dbO7epzWn5gSbHgptgSVNAEBWCsWu7fQj94gEyZhwzZTMK3+AieiDExgB4/XCWqrxBcOfXLu+K4pvuAre8AjvxYlCpvQJjFajtki9LGAnxxYXoykc43vKTqodGvB3CWZj3mlLe6YXpIBjMqB5PSY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; c=relaxed/simple; bh=cgfLC4Ixk4usrIhjye4nUm3iutEgLwj4aLkumK194xA=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=KJ61zLH9nHbbux09+YiPXGwMhfBtSVm+JbhfhsqltehskcAb0DW8HvlW95ItoP6c62tvXO8E6nIBkjVEuvkAlhaxu/xUpJls8fqLWafJBwqEIMMiERbDYPifbDM9FZxyctPA8BaDTTrQlJiOd76PSsz0pRwVTaeq6uv24TUly6Q= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ot1-x330.google.com with SMTP id 46e09a7af769-70f5a9bf18bso2798345a34.2 for ; Wed, 04 Sep 2024 06:27:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456454; x=1726061254; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=O9QXjQ2NkSVQeiuGYPoYPY2mr4pfljVlx9pfNNSmszs=; b=Qkfc9nkgF395HzW9wUPyB2Uuu4SIk5w+My+1n1ZoQjvbOvyNmCDABVa5mcapPAQrff pWkTYuSeHJ8dSbluMeGxZlk0Z9MQOpWGqRfo8TJCmu1tZltwft5PD3OO/8fSxw7PxbTy uBVsVYSdr15pDgSER3jg92PTyiUsI0IXv5SJidmyf6n0Pc/dvGLp2JbdaVE6icgi4QNz n3nl0VzpbbdCa773eiHs2SnH+vgtODO3Qfm+LTvVNUnx5itC3yUVlQTBpOXhdkxQxwIG f/ZjYnsP79NBs9BjoF0ljX71atuwfbZDTnse2u4Dzwm4iW1XEsT3homaHV3p/KAntORl pI6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456454; x=1726061254; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=O9QXjQ2NkSVQeiuGYPoYPY2mr4pfljVlx9pfNNSmszs=; b=qZ0Yc5pPCb0MJQ+13+gxyFeG1wANnbm+HvTUayAGmw5VD6CuLqZ5d6iEfI0vy7xkdo EBb/CtC8/e1qHtm8UI6S2vhliZ8A9yGwC4ypUyU7tmrENWAa06qqh+07sayijHBO3aBd tchAyOxxbSopEEIgXmgo7hwou/g1Argzhb9ZzU8Y+SNIYmr9u3wLZmo9t3WXzLoAfZHJ N/8oFpMspHWRJzyheVRfXqCvEAhs2M+vmTKcnDhKFNNe8OObKXdlR1/YmfGOaJ6kbc4V WH1dBNqzgi42xXF3Os1e1dvLgYQjuCERRvQs8xFgM/8Hi5q9M5VAKA+Ekw0jPv0uCZu4 vVBw== X-Gm-Message-State: AOJu0YxJmyRJsbmkV+089BDRIU7/Y+G4wapMq68t+SAQdES8fQN8wkw1 7+lA7Br5402ZlOI4eP4v/hbZpw3PXIbfMcnNhFgryhLSwR9sR/yIkS8ORXbYmY3bqNIn+qkEkTQ ZgxG8MQ== X-Google-Smtp-Source: AGHT+IHkH2sC5kswcCGdBXUN7JILEfcfQf5dEFrI5T1TJJhaA0nADbRuyKtDqleTcnHNeWxb5wpc7w== X-Received: by 2002:a05:6830:6e18:b0:708:f1ad:c4bf with SMTP id 46e09a7af769-70f7072d55bmr16784628a34.27.1725456454250; Wed, 04 Sep 2024 06:27:34 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:33 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 15/36] arm: [MVE intrinsics] rework vorn Date: Wed, 4 Sep 2024 13:26:29 +0000 Message-Id: <20240904132650.2720446-16-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vorn using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vornq): New. * config/arm/arm-mve-builtins-base.def (vornq): New. * config/arm/arm-mve-builtins-base.h (vornq): New. * config/arm/arm-mve-builtins-functions.h (class unspec_based_mve_function_exact_insn_vorn): New. * config/arm/arm_mve.h (vornq): Delete. (vornq_m): Delete. (vornq_x): Delete. (vornq_u8): Delete. (vornq_s8): Delete. (vornq_u16): Delete. (vornq_s16): Delete. (vornq_u32): Delete. (vornq_s32): Delete. (vornq_f16): Delete. (vornq_f32): Delete. (vornq_m_s8): Delete. (vornq_m_s32): Delete. (vornq_m_s16): Delete. (vornq_m_u8): Delete. (vornq_m_u32): Delete. (vornq_m_u16): Delete. (vornq_m_f32): Delete. (vornq_m_f16): Delete. (vornq_x_s8): Delete. (vornq_x_s16): Delete. (vornq_x_s32): Delete. (vornq_x_u8): Delete. (vornq_x_u16): Delete. (vornq_x_u32): Delete. (vornq_x_f16): Delete. (vornq_x_f32): Delete. (__arm_vornq_u8): Delete. (__arm_vornq_s8): Delete. (__arm_vornq_u16): Delete. (__arm_vornq_s16): Delete. (__arm_vornq_u32): Delete. (__arm_vornq_s32): Delete. (__arm_vornq_m_s8): Delete. (__arm_vornq_m_s32): Delete. (__arm_vornq_m_s16): Delete. (__arm_vornq_m_u8): Delete. (__arm_vornq_m_u32): Delete. (__arm_vornq_m_u16): Delete. (__arm_vornq_x_s8): Delete. (__arm_vornq_x_s16): Delete. (__arm_vornq_x_s32): Delete. (__arm_vornq_x_u8): Delete. (__arm_vornq_x_u16): Delete. (__arm_vornq_x_u32): Delete. (__arm_vornq_f16): Delete. (__arm_vornq_f32): Delete. (__arm_vornq_m_f32): Delete. (__arm_vornq_m_f16): Delete. (__arm_vornq_x_f16): Delete. (__arm_vornq_x_f32): Delete. (__arm_vornq): Delete. (__arm_vornq_m): Delete. (__arm_vornq_x): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 1 + gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm-mve-builtins-functions.h | 53 +++ gcc/config/arm/arm_mve.h | 431 -------------------- 5 files changed, 57 insertions(+), 431 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index e33603ec1f3..f8260f5f483 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -568,6 +568,7 @@ FUNCTION_WITH_RTX_M_N (vmulq, MULT, VMULQ) FUNCTION_WITH_RTX_M_N_NO_F (vmvnq, NOT, VMVNQ) FUNCTION (vnegq, unspec_based_mve_function_exact_insn, (NEG, NEG, NEG, -1, -1, -1, VNEGQ_M_S, -1, VNEGQ_M_F, -1, -1, -1)) FUNCTION_WITHOUT_M_N (vpselq, VPSELQ) +FUNCTION (vornq, unspec_based_mve_function_exact_insn_vorn, (-1, -1, VORNQ_M_S, VORNQ_M_U, VORNQ_M_F, -1, -1)) FUNCTION_WITH_RTX_M_N_NO_N_F (vorrq, IOR, VORRQ) FUNCTION_WITHOUT_N_NO_U_F (vqabsq, VQABSQ) FUNCTION_WITH_M_N_NO_F (vqaddq, VQADDQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index aa7b71387f9..cc76db3e0b9 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -87,6 +87,7 @@ DEF_MVE_FUNCTION (vmulltq_poly, binary_widen_poly, poly_8_16, mx_or_none) DEF_MVE_FUNCTION (vmulq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vmvnq, mvn, all_integer, mx_or_none) DEF_MVE_FUNCTION (vnegq, unary, all_signed, mx_or_none) +DEF_MVE_FUNCTION (vornq, binary_orrq, all_integer, mx_or_none) DEF_MVE_FUNCTION (vorrq, binary_orrq, all_integer, mx_or_none) DEF_MVE_FUNCTION (vpselq, vpsel, all_integer_with_64, none) DEF_MVE_FUNCTION (vqabsq, unary, all_signed, m_or_none) @@ -206,6 +207,7 @@ DEF_MVE_FUNCTION (vminnmq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vminnmvq, binary_maxvminv, all_float, p_or_none) DEF_MVE_FUNCTION (vmulq, binary_opt_n, all_float, mx_or_none) DEF_MVE_FUNCTION (vnegq, unary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vornq, binary_orrq, all_float, mx_or_none) DEF_MVE_FUNCTION (vorrq, binary_orrq, all_float, mx_or_none) DEF_MVE_FUNCTION (vpselq, vpsel, all_float, none) DEF_MVE_FUNCTION (vreinterpretq, unary_convert, reinterpret_float, none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index e6b828a4e1e..ad2647b6758 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -118,6 +118,7 @@ extern const function_base *const vmulltq_poly; extern const function_base *const vmulq; extern const function_base *const vmvnq; extern const function_base *const vnegq; +extern const function_base *const vornq; extern const function_base *const vorrq; extern const function_base *const vpselq; extern const function_base *const vqabsq; diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h index 0bb91f5ec1f..57e59e30c36 100644 --- a/gcc/config/arm/arm-mve-builtins-functions.h +++ b/gcc/config/arm/arm-mve-builtins-functions.h @@ -522,6 +522,59 @@ public: } }; +/* Map the function directly to CODE (M) for vorn-like builtins. The difference + with unspec_based_mve_function_exact_insn is that this function has vbic + hardcoded for the PRED_none, MODE_none version, rather than using an + RTX. */ +class unspec_based_mve_function_exact_insn_vorn : public unspec_based_mve_function_base +{ +public: + CONSTEXPR unspec_based_mve_function_exact_insn_vorn (int unspec_for_n_sint, + int unspec_for_n_uint, + int unspec_for_m_sint, + int unspec_for_m_uint, + int unspec_for_m_fp, + int unspec_for_m_n_sint, + int unspec_for_m_n_uint) + : unspec_based_mve_function_base (UNKNOWN, + UNKNOWN, + UNKNOWN, + -1, -1, -1, /* No non-predicated, no mode unspec intrinsics. */ + unspec_for_n_sint, + unspec_for_n_uint, + -1, + unspec_for_m_sint, + unspec_for_m_uint, + unspec_for_m_fp, + unspec_for_m_n_sint, + unspec_for_m_n_uint, + -1) + {} + + rtx + expand (function_expander &e) const override + { + machine_mode mode = e.vector_mode (0); + insn_code code; + + /* No suffix, no predicate, use the right RTX code. */ + if (e.pred == PRED_none + && e.mode_suffix_id == MODE_none) + { + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_vornq_u (mode); + else + code = code_for_mve_vornq_s (mode); + else + code = code_for_mve_vornq_f (mode); + return e.use_exact_insn (code); + } + + return expand_unspec (e); + } +}; + /* Map the comparison functions. */ class unspec_based_mve_function_exact_insn_vcmp : public unspec_based_mve_function_base { diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 3fd6980a58d..7aa61103a7d 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -42,9 +42,7 @@ #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE #define vst4q(__addr, __value) __arm_vst4q(__addr, __value) -#define vornq(__a, __b) __arm_vornq(__a, __b) #define vshlcq(__a, __b, __imm) __arm_vshlcq(__a, __b, __imm) -#define vornq_m(__inactive, __a, __b, __p) __arm_vornq_m(__inactive, __a, __b, __p) #define vstrbq_scatter_offset(__base, __offset, __value) __arm_vstrbq_scatter_offset(__base, __offset, __value) #define vstrbq(__addr, __value) __arm_vstrbq(__addr, __value) #define vstrwq_scatter_base(__addr, __offset, __value) __arm_vstrwq_scatter_base(__addr, __offset, __value) @@ -116,7 +114,6 @@ #define viwdupq_x_u8(__a, __b, __imm, __p) __arm_viwdupq_x_u8(__a, __b, __imm, __p) #define viwdupq_x_u16(__a, __b, __imm, __p) __arm_viwdupq_x_u16(__a, __b, __imm, __p) #define viwdupq_x_u32(__a, __b, __imm, __p) __arm_viwdupq_x_u32(__a, __b, __imm, __p) -#define vornq_x(__a, __b, __p) __arm_vornq_x(__a, __b, __p) #define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out) #define vadciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m(__inactive, __a, __b, __carry_out, __p) #define vadcq(__a, __b, __carry) __arm_vadcq(__a, __b, __carry) @@ -148,14 +145,6 @@ #define vctp64q(__a) __arm_vctp64q(__a) #define vctp8q(__a) __arm_vctp8q(__a) #define vpnot(__a) __arm_vpnot(__a) -#define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b) -#define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b) -#define vornq_u16(__a, __b) __arm_vornq_u16(__a, __b) -#define vornq_s16(__a, __b) __arm_vornq_s16(__a, __b) -#define vornq_u32(__a, __b) __arm_vornq_u32(__a, __b) -#define vornq_s32(__a, __b) __arm_vornq_s32(__a, __b) -#define vornq_f16(__a, __b) __arm_vornq_f16(__a, __b) -#define vornq_f32(__a, __b) __arm_vornq_f32(__a, __b) #define vctp8q_m(__a, __p) __arm_vctp8q_m(__a, __p) #define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p) #define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p) @@ -166,14 +155,6 @@ #define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vornq_m_s8(__inactive, __a, __b, __p) __arm_vornq_m_s8(__inactive, __a, __b, __p) -#define vornq_m_s32(__inactive, __a, __b, __p) __arm_vornq_m_s32(__inactive, __a, __b, __p) -#define vornq_m_s16(__inactive, __a, __b, __p) __arm_vornq_m_s16(__inactive, __a, __b, __p) -#define vornq_m_u8(__inactive, __a, __b, __p) __arm_vornq_m_u8(__inactive, __a, __b, __p) -#define vornq_m_u32(__inactive, __a, __b, __p) __arm_vornq_m_u32(__inactive, __a, __b, __p) -#define vornq_m_u16(__inactive, __a, __b, __p) __arm_vornq_m_u16(__inactive, __a, __b, __p) -#define vornq_m_f32(__inactive, __a, __b, __p) __arm_vornq_m_f32(__inactive, __a, __b, __p) -#define vornq_m_f16(__inactive, __a, __b, __p) __arm_vornq_m_f16(__inactive, __a, __b, __p) #define vstrbq_s8( __addr, __value) __arm_vstrbq_s8( __addr, __value) #define vstrbq_u8( __addr, __value) __arm_vstrbq_u8( __addr, __value) #define vstrbq_u16( __addr, __value) __arm_vstrbq_u16( __addr, __value) @@ -456,14 +437,6 @@ #define viwdupq_x_wb_u8(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u8(__a, __b, __imm, __p) #define viwdupq_x_wb_u16(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u16(__a, __b, __imm, __p) #define viwdupq_x_wb_u32(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u32(__a, __b, __imm, __p) -#define vornq_x_s8(__a, __b, __p) __arm_vornq_x_s8(__a, __b, __p) -#define vornq_x_s16(__a, __b, __p) __arm_vornq_x_s16(__a, __b, __p) -#define vornq_x_s32(__a, __b, __p) __arm_vornq_x_s32(__a, __b, __p) -#define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) -#define vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) -#define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) -#define vornq_x_f32(__a, __b, __p) __arm_vornq_x_f32(__a, __b, __p) #define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) #define vadciq_u32(__a, __b, __carry_out) __arm_vadciq_u32(__a, __b, __carry_out) #define vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) @@ -665,48 +638,6 @@ __arm_vpnot (mve_pred16_t __a) return __builtin_mve_vpnotv16bi (__a); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_u8 (uint8x16_t __a, uint8x16_t __b) -{ - return __builtin_mve_vornq_uv16qi (__a, __b); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vornq_sv16qi (__a, __b); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_u16 (uint16x8_t __a, uint16x8_t __b) -{ - return __builtin_mve_vornq_uv8hi (__a, __b); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_s16 (int16x8_t __a, int16x8_t __b) -{ - return __builtin_mve_vornq_sv8hi (__a, __b); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_u32 (uint32x4_t __a, uint32x4_t __b) -{ - return __builtin_mve_vornq_uv4si (__a, __b); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_s32 (int32x4_t __a, int32x4_t __b) -{ - return __builtin_mve_vornq_sv4si (__a, __b); -} - __extension__ extern __inline mve_pred16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vctp8q_m (uint32_t __a, mve_pred16_t __p) @@ -789,48 +720,6 @@ __arm_vshlcq_u32 (uint32x4_t __a, uint32_t * __b, const int __imm) return __res; } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_sv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_sv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_sv8hi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_uv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_uv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_uv8hi (__inactive, __a, __b, __p); -} - __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vstrbq_scatter_offset_s8 (int8_t * __base, uint8x16_t __offset, int8x16_t __value) @@ -2658,48 +2547,6 @@ __arm_viwdupq_x_wb_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16 return __res; } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) @@ -3420,34 +3267,6 @@ __arm_vst4q_f32 (float32_t * __addr, float32x4x4_t __value) __builtin_mve_vst4qv4sf (__addr, __rv.__o); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_f16 (float16x8_t __a, float16x8_t __b) -{ - return __builtin_mve_vornq_fv8hf (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_f32 (float32x4_t __a, float32x4_t __b) -{ - return __builtin_mve_vornq_fv4sf (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_fv4sf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_fv8hf (__inactive, __a, __b, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrwq_f32 (float32_t const * __base) @@ -3678,20 +3497,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vornq_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p); -} - __extension__ extern __inline float16x8x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vld4q_f16 (float16_t const * __addr) @@ -3852,48 +3657,6 @@ __arm_vst4q (uint32_t * __addr, uint32x4x4_t __value) __arm_vst4q_u32 (__addr, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (uint8x16_t __a, uint8x16_t __b) -{ - return __arm_vornq_u8 (__a, __b); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (int8x16_t __a, int8x16_t __b) -{ - return __arm_vornq_s8 (__a, __b); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (uint16x8_t __a, uint16x8_t __b) -{ - return __arm_vornq_u16 (__a, __b); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (int16x8_t __a, int16x8_t __b) -{ - return __arm_vornq_s16 (__a, __b); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (uint32x4_t __a, uint32x4_t __b) -{ - return __arm_vornq_u32 (__a, __b); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (int32x4_t __a, int32x4_t __b) -{ - return __arm_vornq_s32 (__a, __b); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vshlcq (int8x16_t __a, uint32_t * __b, const int __imm) @@ -3936,48 +3699,6 @@ __arm_vshlcq (uint32x4_t __a, uint32_t * __b, const int __imm) return __arm_vshlcq_u32 (__a, __b, __imm); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_s8 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_s32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_s16 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_u8 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_u32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_u16 (__inactive, __a, __b, __p); -} - __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vstrbq_scatter_offset (int8_t * __base, uint8x16_t __offset, int8x16_t __value) @@ -5378,48 +5099,6 @@ __arm_viwdupq_x_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t return __arm_viwdupq_x_wb_u32 (__a, __b, __imm, __p); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_s8 (__a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_s16 (__a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_s32 (__a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_u8 (__a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_u16 (__a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_u32 (__a, __b, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vadciq (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) @@ -5912,34 +5591,6 @@ __arm_vst4q (float32_t * __addr, float32x4x4_t __value) __arm_vst4q_f32 (__addr, __value); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (float16x8_t __a, float16x8_t __b) -{ - return __arm_vornq_f16 (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq (float32x4_t __a, float32x4_t __b) -{ - return __arm_vornq_f32 (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vornq_m_f16 (__inactive, __a, __b, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrhq_gather_offset (float16_t const * __base, uint16x8_t __offset) @@ -6108,20 +5759,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32 __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vornq_x_f32 (__a, __b, __p); -} - __extension__ extern __inline float16x8x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vld4q (float16_t const * __addr) @@ -6543,18 +6180,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16_t_ptr][__ARM_mve_type_float16x8x4_t]: __arm_vst4q_f16 (__ARM_mve_coerce_f16_ptr(__p0, float16_t *), __ARM_mve_coerce(__p1, float16x8x4_t)), \ int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4x4_t]: __arm_vst4q_f32 (__ARM_mve_coerce_f32_ptr(__p0, float32_t *), __ARM_mve_coerce(__p1, float32x4x4_t)));}) -#define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ @@ -6564,19 +6189,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) -#define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vld1q_z(p0,p1) ( \ _Generic( (int (*)[__ARM_mve_typeid(p0)])0, \ int (*)[__ARM_mve_type_int8_t_ptr]: __arm_vld1q_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), p1), \ @@ -6867,18 +6479,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint32x4_t]: __arm_vstrwq_scatter_base_wb_p_u32 (p0, p1, __ARM_mve_coerce(__p2, uint32x4_t), p3), \ int (*)[__ARM_mve_type_float32x4_t]: __arm_vstrwq_scatter_base_wb_p_f32 (p0, p1, __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1), \ @@ -6927,16 +6527,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16_t_ptr][__ARM_mve_type_uint16x8x4_t]: __arm_vst4q_u16 (__ARM_mve_coerce_u16_ptr(p0, uint16_t *), __ARM_mve_coerce(__p1, uint16x8x4_t)), \ int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4x4_t]: __arm_vst4q_u32 (__ARM_mve_coerce_u32_ptr(p0, uint32_t *), __ARM_mve_coerce(__p1, uint32x4x4_t)));}) -#define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) - #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ @@ -6946,17 +6536,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) -#define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vstrwq_scatter_base(p0,p1,p2) ({ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p2)])0, \ int (*)[__ARM_mve_type_int32x4_t]: __arm_vstrwq_scatter_base_s32(p0, p1, __ARM_mve_coerce(__p2, int32x4_t)), \ @@ -7176,16 +6755,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint32x4_t]: __arm_vuninitializedq_u32 (), \ int (*)[__ARM_mve_type_uint64x2_t]: __arm_vuninitializedq_u64 ());}) -#define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vld1q_z(p0,p1) ( _Generic( (int (*)[__ARM_mve_typeid(p0)])0, \ int (*)[__ARM_mve_type_int8_t_ptr]: __arm_vld1q_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), p1), \ int (*)[__ARM_mve_type_int16_t_ptr]: __arm_vld1q_z_s16 (__ARM_mve_coerce_s16_ptr(p0, int16_t *), p1), \ From patchwork Wed Sep 4 13:26:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980809 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=RgEL34Bn; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNhR2wrkz1yg3 for ; Wed, 4 Sep 2024 23:32:23 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 443EB385B508 for ; Wed, 4 Sep 2024 13:32:21 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2f.google.com (mail-oo1-xc2f.google.com [IPv6:2607:f8b0:4864:20::c2f]) by sourceware.org (Postfix) with ESMTPS id 46CA03858C98 for ; Wed, 4 Sep 2024 13:27:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 46CA03858C98 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 46CA03858C98 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2f ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456458; cv=none; b=hDwXY1G0k9PuaGBKmUglDy9WFmB+vH2e9mWGHUhKJXur0/vgKXpbDkU1QOb3RXcqKPBqozy/8aT2fCBlE3aLlh0djXmyw08dKZOdkbLMvqEwZJiZaCLNbkow3NdhD71DEpdyBFYg+pRHje5I4o+tU+4+C/ob7Z1amv5i4TnnzDY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456458; c=relaxed/simple; bh=+AAK48IBoQoK6sPKKqIa6K6aYcspXrcLcjYbL/ewRQs=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=qmzNoerchSZDWfYBv3nlOFbkyKEkt7DJtxpSFbhIfLbvlvhJjkD3IUN8u7NHnlChjxGaWRJd+lwKhbte7Bau8HD57G1EXueI/SwArsZAKZdbQW8IM0b3MoiFZl1eA0OXYD9noIy7dgTgAcMHqelF6OUPLG4cldic9B+V5ApfG+Q= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2f.google.com with SMTP id 006d021491bc7-5df9dec9c0eso4289607eaf.2 for ; Wed, 04 Sep 2024 06:27:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456455; x=1726061255; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Pcu2BRs0nfCt77eF0DJ+5zeDBJ3ntQ3Wcd4EremRu7w=; b=RgEL34Bn92l3OVH2p8b4lJSKMfYE7iYUNmcsBXFGt4DGybcpLFZpk+GpYwU22CFx2T R4YtwbQczDeU+6Yh8GT4LhptvriJLojmYNlae1bObya+PBr6JT0UVjVNdULxp99ZddKl w/hr8ny34cIPE/lARFOA432zztizoPhy5g6NfKTgF1znf2CbYBCJM38KTXUvzk/khVJX CrIkVEnNekcGkii852XjBtWAPHPje67EA6oYJ6+prVE/i2QNfs1HW1f4LN+tV47i0Qj7 F3vnJgGPfN0wgiqnY42xLOmdq/FzixTTVzoAgUdI16p53JgtkjW69qt7LLFoF0d//OC2 KMLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456455; x=1726061255; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Pcu2BRs0nfCt77eF0DJ+5zeDBJ3ntQ3Wcd4EremRu7w=; b=NOaHYWbCbRpO7i6Lp87VRtHxEjSwI1saXZQmJHVC83rpHgeJSV5Ll9pZpiBteySZzu Wuh6AVQJhMtcVpFbFawoyRDJZJReF7qc74YcgmcUVkVtBto71bWbnwqdfAGIW48noB49 S0SY56t4F4kmaayd/SEZRKPYHLmCtkHuzXV6IMuAshm9uu4wLrhlvFfLf2UI5UO7dLXI 3UNY+gfq73szvtNJapEIHQVmacmTG745ZPAD1Q1RN8Pe6ichakFnu/bZIcKGcM8Wa6VR g06qlT72BjCq2XZbvKLxqT6C8B3QZAYbRopu8Dhlfay+km8MQDwoi5xKH//PQVBqvy0l 78Pg== X-Gm-Message-State: AOJu0YyIvp/kSR+75aK/2iTdtHUmJx2e4S3k6yj0+iH/KLg85a5dN+mH 9rgCx+13cvPzjTowDstZfJ1XYvKuOXM5LLzC0EGF2JNsXfFoHHsNpaPtVeC+EAsAkB+NqJeCmKU sbXfOmw== X-Google-Smtp-Source: AGHT+IEF6EMQ1Q4hvkgJh3ewHJ4lnO8D+2N3crpflAsYukX7J7cr95jVnzMc+taUklbiHAWb6yffMg== X-Received: by 2002:a05:6820:168d:b0:5da:9bde:1bfa with SMTP id 006d021491bc7-5dfacc0452emr19432513eaf.0.1725456454981; Wed, 04 Sep 2024 06:27:34 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:34 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 16/36] arm: [MVE intrinsics] rework vctp Date: Wed, 4 Sep 2024 13:26:30 +0000 Message-Id: <20240904132650.2720446-17-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vctp using the new MVE builtins framework. 2024-08-21 Christophe Lyon gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (class vctpq_impl): New. (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-base.def (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-base.h (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-shapes.cc (vctp): New. * config/arm/arm-mve-builtins-shapes.h (vctp): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Add support for vctp. * config/arm/arm_mve.h (vctp16q): Delete. (vctp32q): Delete. (vctp64q): Delete. (vctp8q): Delete. (vctp8q_m): Delete. (vctp64q_m): Delete. (vctp32q_m): Delete. (vctp16q_m): Delete. (__arm_vctp16q): Delete. (__arm_vctp32q): Delete. (__arm_vctp64q): Delete. (__arm_vctp8q): Delete. (__arm_vctp8q_m): Delete. (__arm_vctp64q_m): Delete. (__arm_vctp32q_m): Delete. (__arm_vctp16q_m): Delete. * config/arm/mve.md (mve_vctpq): Add '@' prefix. (mve_vctpq_m): Likewise. --- gcc/config/arm/arm-mve-builtins-base.cc | 48 +++++++++++++++++ gcc/config/arm/arm-mve-builtins-base.def | 4 ++ gcc/config/arm/arm-mve-builtins-base.h | 4 ++ gcc/config/arm/arm-mve-builtins-shapes.cc | 16 ++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 1 + gcc/config/arm/arm-mve-builtins.cc | 4 ++ gcc/config/arm/arm_mve.h | 64 ----------------------- gcc/config/arm/mve.md | 4 +- 8 files changed, 79 insertions(+), 66 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index f8260f5f483..89724320d43 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -139,6 +139,50 @@ public: } }; + /* Implements vctp8q, vctp16q, vctp32q and vctp64q intrinsics. */ +class vctpq_impl : public function_base +{ +public: + CONSTEXPR vctpq_impl (machine_mode mode) + : m_mode (mode) + {} + + /* Mode this intrinsic operates on. */ + machine_mode m_mode; + + rtx + expand (function_expander &e) const override + { + insn_code code; + rtx target; + + if (e.mode_suffix_id != MODE_none) + gcc_unreachable (); + + switch (e.pred) + { + case PRED_none: + /* No predicate, no suffix. */ + code = code_for_mve_vctpq (m_mode, m_mode); + target = e.use_exact_insn (code); + break; + + case PRED_m: + /* No suffix, "m" predicate. */ + code = code_for_mve_vctpq_m (m_mode, m_mode); + target = e.use_cond_insn (code, 0); + break; + + default: + gcc_unreachable (); + } + + rtx HItarget = gen_reg_rtx (HImode); + emit_move_insn (HItarget, gen_lowpart (HImode, target)); + return HItarget; + } +}; + /* Implements vcvtq intrinsics. */ class vcvtq_impl : public function_base { @@ -506,6 +550,10 @@ FUNCTION (vcmpltq, unspec_based_mve_function_exact_insn_vcmp, (LT, UNKNOWN, LT, FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNKNOWN, UNKNOWN, VCMPCSQ_M_U, UNKNOWN, UNKNOWN, VCMPCSQ_M_N_U, UNKNOWN)) FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN)) FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ) +FUNCTION (vctp8q, vctpq_impl, (V16BImode)) +FUNCTION (vctp16q, vctpq_impl, (V8BImode)) +FUNCTION (vctp32q, vctpq_impl, (V4BImode)) +FUNCTION (vctp64q, vctpq_impl, (V2QImode)) FUNCTION (vcvtq, vcvtq_impl,) FUNCTION_WITHOUT_N_NO_F (vcvtaq, VCVTAQ) FUNCTION_WITHOUT_N_NO_F (vcvtmq, VCVTMQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index cc76db3e0b9..dd46d882882 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -42,6 +42,10 @@ DEF_MVE_FUNCTION (vcmpleq, cmp, all_signed, m_or_none) DEF_MVE_FUNCTION (vcmpltq, cmp, all_signed, m_or_none) DEF_MVE_FUNCTION (vcmpneq, cmp, all_integer, m_or_none) DEF_MVE_FUNCTION (vcreateq, create, all_integer_with_64, none) +DEF_MVE_FUNCTION (vctp16q, vctp, none, m_or_none) +DEF_MVE_FUNCTION (vctp32q, vctp, none, m_or_none) +DEF_MVE_FUNCTION (vctp64q, vctp, none, m_or_none) +DEF_MVE_FUNCTION (vctp8q, vctp, none, m_or_none) DEF_MVE_FUNCTION (vdupq, unary_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (veorq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vhaddq, binary_opt_n, all_integer, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index ad2647b6758..41fcf666b11 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -55,6 +55,10 @@ extern const function_base *const vcmulq_rot180; extern const function_base *const vcmulq_rot270; extern const function_base *const vcmulq_rot90; extern const function_base *const vcreateq; +extern const function_base *const vctp16q; +extern const function_base *const vctp32q; +extern const function_base *const vctp64q; +extern const function_base *const vctp8q; extern const function_base *const vcvtaq; extern const function_base *const vcvtbq; extern const function_base *const vcvtmq; diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index 6632ee49067..8a849c2bc02 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -1981,6 +1981,22 @@ struct unary_widen_acc_def : public overloaded_base<0> }; SHAPE (unary_widen_acc) +/* mve_pred16_t foo_t0(uint32_t) + + Example: vctp16q. + mve_pred16_t [__arm_]vctp16q(uint32_t a) + mve_pred16_t [__arm_]vctp16q_m(uint32_t a, mve_pred16_t p) */ +struct vctp_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + build_all (b, "p,su32", group, MODE_none, preserve_user_namespace); + } +}; +SHAPE (vctp) + /* _t foo_t0[_t1](_t) _t foo_t0_n[_t1](_t, const int) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index ef497b6c97a..80340dc33ec 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -77,6 +77,7 @@ namespace arm_mve extern const function_shape *const unary_n; extern const function_shape *const unary_widen; extern const function_shape *const unary_widen_acc; + extern const function_shape *const vctp; extern const function_shape *const vcvt; extern const function_shape *const vcvt_f16_f32; extern const function_shape *const vcvt_f32_f16; diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 13c666b8f6a..84d94bb634f 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -750,6 +750,10 @@ function_instance::has_inactive_argument () const || base == functions::vcmpltq || base == functions::vcmpcsq || base == functions::vcmphiq + || base == functions::vctp16q + || base == functions::vctp32q + || base == functions::vctp64q + || base == functions::vctp8q || (base == functions::vcvtbq && type_suffix (0).element_bits == 16) || (base == functions::vcvttq && type_suffix (0).element_bits == 16) || base == functions::vfmaq diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 7aa61103a7d..49c4ea9afee 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -140,15 +140,7 @@ #define vst4q_u32( __addr, __value) __arm_vst4q_u32( __addr, __value) #define vst4q_f16( __addr, __value) __arm_vst4q_f16( __addr, __value) #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value) -#define vctp16q(__a) __arm_vctp16q(__a) -#define vctp32q(__a) __arm_vctp32q(__a) -#define vctp64q(__a) __arm_vctp64q(__a) -#define vctp8q(__a) __arm_vctp8q(__a) #define vpnot(__a) __arm_vpnot(__a) -#define vctp8q_m(__a, __p) __arm_vctp8q_m(__a, __p) -#define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p) -#define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p) -#define vctp16q_m(__a, __p) __arm_vctp16q_m(__a, __p) #define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm) #define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm) #define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm) @@ -603,34 +595,6 @@ __arm_vst4q_u32 (uint32_t * __addr, uint32x4x4_t __value) __builtin_mve_vst4qv4si ((__builtin_neon_si *) __addr, __rv.__o); } -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp16q (uint32_t __a) -{ - return __builtin_mve_vctp16qv8bi (__a); -} - -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp32q (uint32_t __a) -{ - return __builtin_mve_vctp32qv4bi (__a); -} - -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp64q (uint32_t __a) -{ - return __builtin_mve_vctp64qv2qi (__a); -} - -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp8q (uint32_t __a) -{ - return __builtin_mve_vctp8qv16bi (__a); -} - __extension__ extern __inline mve_pred16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vpnot (mve_pred16_t __a) @@ -638,34 +602,6 @@ __arm_vpnot (mve_pred16_t __a) return __builtin_mve_vpnotv16bi (__a); } -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp8q_m (uint32_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vctp8q_mv16bi (__a, __p); -} - -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp64q_m (uint32_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vctp64q_mv2qi (__a, __p); -} - -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp32q_m (uint32_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vctp32q_mv4bi (__a, __p); -} - -__extension__ extern __inline mve_pred16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vctp16q_m (uint32_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vctp16q_mv8bi (__a, __p); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vshlcq_s8 (int8x16_t __a, uint32_t * __b, const int __imm) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 3d8b199d9d6..62cffebd6ed 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -482,7 +482,7 @@ (define_insn "@mve_q_v4si" ;; ;; [vctp8q vctp16q vctp32q vctp64q]) ;; -(define_insn "mve_vctpq" +(define_insn "@mve_vctpq" [ (set (match_operand:MVE_7 0 "vpr_register_operand" "=Up") (unspec:MVE_7 [(match_operand:SI 1 "s_register_operand" "r")] @@ -1272,7 +1272,7 @@ (define_insn "@mve_vcmpq_n_f" ;; ;; [vctp8q_m vctp16q_m vctp32q_m vctp64q_m]) ;; -(define_insn "mve_vctpq_m" +(define_insn "@mve_vctpq_m" [ (set (match_operand:MVE_7 0 "vpr_register_operand" "=Up") (unspec:MVE_7 [(match_operand:SI 1 "s_register_operand" "r") From patchwork Wed Sep 4 13:26:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980813 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=J0tFF4wJ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNjq0QF8z1yg3 for ; Wed, 4 Sep 2024 23:33:35 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1EDD8386103A for ; Wed, 4 Sep 2024 13:33:33 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ot1-x329.google.com (mail-ot1-x329.google.com [IPv6:2607:f8b0:4864:20::329]) by sourceware.org (Postfix) with ESMTPS id AFCEF3861037 for ; Wed, 4 Sep 2024 13:27:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AFCEF3861037 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AFCEF3861037 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::329 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456459; cv=none; b=p6weIusj8dq1bb+oUuQTNtARWiPcI2Bhtd2CAbukxOfbLoqH2M+bK77KwEPO9Q5kXtWGibJJ4XBTORbP9T4aD4TJQrwfFbo14Ih3n/9mGB6wJQ0WQIfqTlwu23tynLT0aEHijUzEuGId8T75IKKCwY5twpadzYuGuj58079UxxU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456459; c=relaxed/simple; bh=Or37xhi9cVIHiuh0xrsCqqy0H88bkJoYeVrz92U73iY=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=l5eNmcXahb3Qmyl2fAqZEecj4mfv7LLiBGGq3NAHwvQrBR5QtkLFTe6JoipE0Vw1aG6J790ryN53jWD/3rgELPKLEaidqdjXLio5OzjaT9Jvd3xl5bGjODEiufnjQSS8fs7oZ0xpD1alLM6fJY3ZVt2YGHMG6zBMwFMBMimbYtg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ot1-x329.google.com with SMTP id 46e09a7af769-709340f1cb1so2013286a34.3 for ; Wed, 04 Sep 2024 06:27:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456456; x=1726061256; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iw5gCjNV5owPDVrcY+dK1ZNOAgtNkrcsF1ei+CkORTA=; b=J0tFF4wJpT0C9PPdZOfhamOPhm2mAQzj/7zK+i0DL5ZduKJi17nUSgp/aJfKZ2D7Ri epPVmZix/GersrX6L+02+sQIpaLAwJNZqPLAJmntAZoGXZpwM8xI7fZm+LlDBkxseAQ0 DdErjXHE72aj0Xs0rStX5gnoOsN6F0yotHcyfevg+TmvXZi2FDakafyPBjKXHZYA87z8 xO4vmQHnHQcZuyke8WkmVvo7eVT6/AnneGhvs0o/8oNTMGNgbKubXQFyISQklEPADo29 ODwNlcuk3D9p7Xi02+ejw8v3nyInNU7Vq3x69VkCbmDdWPBtJiqqWg2qfiWW2Sx3qPyt vWgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456456; x=1726061256; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iw5gCjNV5owPDVrcY+dK1ZNOAgtNkrcsF1ei+CkORTA=; b=IhmS3+EkkgU6/k20qvi/DXiFEe8RlPJoJH0PaKLL6OV2Uszsm6KoynJ7c8iOs8fPd7 iShW4kiUNHLJ4aR+2Fp/iHMPkbGTqQb5mbYOkC3p06ksxghq9oP6+cShRPKRAAAEs5Ka IjDL5PMQyKh+vyRr2VawGFoquASGO/v3tuvDo4D1d3Zy+5EQRx0RCIVHwzJvyDAqTfah Weyq1+Lf5NIiwWQPV7T0RxBOISOz21HwVAuzQCp5YI3zvTYiNGNjFUntC8SWa4PKxxoB 54fj81w9LOGnpIKqQXO45EGL57Zp3xHr3alqdVvwMO3fLDL7Cx7hrEa8e/SZOpUZrTb8 RbQA== X-Gm-Message-State: AOJu0YwT2Ec73ZOzmCpvoP6XW83QqBdWmHUOOvgVLJ1y2vwMjjAjGGGK Bkg/wKOGJJ/I3LLMLLS3xDV3CnbtUrufOOtU+3fjUskVCxjeoxfoR9A78Rb3LwzxNieWBMnuj3c 86vNjTg== X-Google-Smtp-Source: AGHT+IH+sPIEYqK7buDBQHIDfsgL4g6zLPnR7OebaMlyFz3LrOX3ea+/dHVBsrkoKJSWfJ7cLgT8Zw== X-Received: by 2002:a05:6830:44a2:b0:709:3b06:d578 with SMTP id 46e09a7af769-70f7072ee46mr15316031a34.26.1725456455612; Wed, 04 Sep 2024 06:27:35 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:35 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 17/36] arm: [MVE intrinsics] factorize vddup vidup Date: Wed, 4 Sep 2024 13:26:31 +0000 Message-Id: <20240904132650.2720446-18-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Factorize vddup and vidup so that they use the same parameterized names. This patch updates only the (define_insn "@mve_q_u_insn") patterns and does not bother with the (define_expand "mve_vidupq_n_u") ones, because a subsequent patch avoids using them. 2024-08-21 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VIDUPQ, VDDUPQ, VIDUPQ_M, VDDUPQ_M. (viddupq_op): New. (viddupq_m_op): New. (VIDDUPQ): New. (VIDDUPQ_M): New. * config/arm/mve.md (mve_vddupq_u_insn) (mve_vidupq_u_insn): Merge into ... (mve_q_u_insn): ... this. (mve_vddupq_m_wb_u_insn, mve_vidupq_m_wb_u_insn): Merge into ... (mve_q_m_wb_u_insn): ... this. --- gcc/config/arm/iterators.md | 7 +++++ gcc/config/arm/mve.md | 58 +++++++++---------------------------- 2 files changed, 20 insertions(+), 45 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index 3a1825ebab2..c0299117f26 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -1007,6 +1007,8 @@ (define_int_attr mve_insn [ (VHSUBQ_M_S "vhsub") (VHSUBQ_M_U "vhsub") (VHSUBQ_N_S "vhsub") (VHSUBQ_N_U "vhsub") (VHSUBQ_S "vhsub") (VHSUBQ_U "vhsub") + (VIDUPQ "vidup") (VDDUPQ "vddup") + (VIDUPQ_M "vidup") (VDDUPQ_M "vddup") (VMAXAQ_M_S "vmaxa") (VMAXAQ_S "vmaxa") (VMAXAVQ_P_S "vmaxav") @@ -1340,6 +1342,9 @@ (define_int_attr mve_mnemo [ (VRNDXQ_F "vrintx") (VRNDXQ_M_F "vrintx") ]) +(define_int_attr viddupq_op [ (VIDUPQ "plus") (VDDUPQ "minus")]) +(define_int_attr viddupq_m_op [ (VIDUPQ_M "plus") (VDDUPQ_M "minus")]) + ;; plus and minus are the only SHIFTABLE_OPS for which Thumb2 allows ;; a stack pointer operand. The minus operation is a candidate for an rsub ;; and hence only plus is supported. @@ -2961,6 +2966,8 @@ (define_int_iterator VCVTxQ_M_F16_F32 [VCVTBQ_M_F16_F32 VCVTTQ_M_F16_F32]) (define_int_iterator VCVTxQ_M_F32_F16 [VCVTBQ_M_F32_F16 VCVTTQ_M_F32_F16]) (define_int_iterator VCVTxQ [VCVTAQ_S VCVTAQ_U VCVTMQ_S VCVTMQ_U VCVTNQ_S VCVTNQ_U VCVTPQ_S VCVTPQ_U]) (define_int_iterator VCVTxQ_M [VCVTAQ_M_S VCVTAQ_M_U VCVTMQ_M_S VCVTMQ_M_U VCVTNQ_M_S VCVTNQ_M_U VCVTPQ_M_S VCVTPQ_M_U]) +(define_int_iterator VIDDUPQ [VIDUPQ VDDUPQ]) +(define_int_iterator VIDDUPQ_M [VIDUPQ_M VDDUPQ_M]) (define_int_iterator DLSTP [DLSTP8 DLSTP16 DLSTP32 DLSTP64]) (define_int_iterator LETP [LETP8 LETP16 LETP32 diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 62cffebd6ed..36117303fd6 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -5105,18 +5105,18 @@ (define_expand "mve_vidupq_n_u" }) ;; -;; [vidupq_u_insn]) +;; [vddupq_u_insn, vidupq_u_insn] ;; -(define_insn "mve_vidupq_u_insn" +(define_insn "@mve_q_u_insn" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") (match_operand:SI 3 "mve_imm_selective_upto_8" "Rg")] - VIDUPQ)) + VIDDUPQ)) (set (match_operand:SI 1 "s_register_operand" "=Te") - (plus:SI (match_dup 2) - (match_operand:SI 4 "immediate_operand" "i")))] + (:SI (match_dup 2) + (match_operand:SI 4 "immediate_operand" "i")))] "TARGET_HAVE_MVE" - "vidup.u%#\t%q0, %1, %3") + ".u%#\t%q0, %1, %3") ;; ;; [vidupq_m_n_u]) @@ -5139,21 +5139,21 @@ (define_expand "mve_vidupq_m_n_u" }) ;; -;; [vidupq_m_wb_u_insn]) +;; [vddupq_m_wb_u_insn, vidupq_m_wb_u_insn] ;; -(define_insn "mve_vidupq_m_wb_u_insn" +(define_insn "@mve_q_m_wb_u_insn" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0") (match_operand:SI 3 "s_register_operand" "2") (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg") (match_operand: 5 "vpr_register_operand" "Up")] - VIDUPQ_M)) + VIDDUPQ_M)) (set (match_operand:SI 2 "s_register_operand" "=Te") - (plus:SI (match_dup 3) - (match_operand:SI 6 "immediate_operand" "i")))] + (:SI (match_dup 3) + (match_operand:SI 6 "immediate_operand" "i")))] "TARGET_HAVE_MVE" - "vpst\;\tvidupt.u%#\t%q0, %2, %4" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vidupq_u_insn")) + "vpst\;t.u%#\t%q0, %2, %4" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_u_insn")) (set_attr "length""8")]) ;; @@ -5173,20 +5173,6 @@ (define_expand "mve_vddupq_n_u" DONE; }) -;; -;; [vddupq_u_insn]) -;; -(define_insn "mve_vddupq_u_insn" - [(set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") - (match_operand:SI 3 "immediate_operand" "i")] - VDDUPQ)) - (set (match_operand:SI 1 "s_register_operand" "=Te") - (minus:SI (match_dup 2) - (match_operand:SI 4 "immediate_operand" "i")))] - "TARGET_HAVE_MVE" - "vddup.u%#\t%q0, %1, %3") - ;; ;; [vddupq_m_n_u]) ;; @@ -5207,24 +5193,6 @@ (define_expand "mve_vddupq_m_n_u" DONE; }) -;; -;; [vddupq_m_wb_u_insn]) -;; -(define_insn "mve_vddupq_m_wb_u_insn" - [(set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0") - (match_operand:SI 3 "s_register_operand" "2") - (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg") - (match_operand: 5 "vpr_register_operand" "Up")] - VDDUPQ_M)) - (set (match_operand:SI 2 "s_register_operand" "=Te") - (minus:SI (match_dup 3) - (match_operand:SI 6 "immediate_operand" "i")))] - "TARGET_HAVE_MVE" - "vpst\;vddupt.u%#\t%q0, %2, %4" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vddupq_u_insn")) - (set_attr "length""8")]) - ;; ;; [vdwdupq_n_u]) ;; From patchwork Wed Sep 4 13:26:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980815 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=rQInIzEb; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNl13CVMz1yg3 for ; Wed, 4 Sep 2024 23:34:37 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 13A25384A4A3 for ; Wed, 4 Sep 2024 13:34:35 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc32.google.com (mail-oo1-xc32.google.com [IPv6:2607:f8b0:4864:20::c32]) by sourceware.org (Postfix) with ESMTPS id 259CA385C6C0 for ; Wed, 4 Sep 2024 13:27:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 259CA385C6C0 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 259CA385C6C0 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c32 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; cv=none; b=dXBvmTRj90wmKqqrvjaThpdaN3/BwNm3rpprNQse6843vlPI+VippmfWCdppQUNOEXEIWsRbWeciwQPSs2M0dBAhhLRbRHj75XGq2ZdsWYStxJ4gI2OdTHggD9PA4LBTZONuoPauYJUvyROJKNBBeBYmsiXno2WSfnNOw+Lnycc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; c=relaxed/simple; bh=Phyz/d8Y6cT+rKmjCZZPjAyQ1AP8kqBP1Gc7OHCbtPg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=oRa/5Uck0Tu9aOBSvOZuec7nc38GEORqV0FDMiQDa5TkWqk7QRJf/CLxDYXLNcQkt0QOqYZpBokE3E6NZuKDf/U2qOCWjMpUqL7Q2QElhgkC4Raa3NyuObJzuYuTA3gVBYVzDZ9jDx+tLFR11NnakEbew2mtnFknWjWCBBPWl9Y= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc32.google.com with SMTP id 006d021491bc7-5e16cb56d4aso1926382eaf.1 for ; Wed, 04 Sep 2024 06:27:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456457; x=1726061257; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=JGdOAonFxFcTh38HT3zQ7Og5cZWF3iDIv8VSU6omoYE=; b=rQInIzEbi6PcwrvKBwLNXbxGa/fm+5MlLBexI73tpDb2HYKn9Lbk7FhdawQ9vroeNP qXy8jcqG/XrX/dC69Icbrs2WcTv5Qc9XASPj/xots7FkHhNAub07pzQAj3VifA0G2RBl k9JW/Td4cyHNVeJhek50PElZ5wSbJiEQFRDXQd87XoW0O4AQ6tlwx8tS5I0VsArEFNvc GmpiJ5+UZ5GztdcCN+YERtClVMkkzNSKbIUVxJCR2dqQ0Lcj2fkhaHMexZ8+EsGV9Hyq W8aAAPCZRH5G+m6AZ10J3j742xIo7nKJ9+J7hB2qDePBZOcTizJKsca63TBwJbv6x1gg faQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456457; x=1726061257; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JGdOAonFxFcTh38HT3zQ7Og5cZWF3iDIv8VSU6omoYE=; b=B6vzEP/bRon8neMoiG2QfOf0Hs9ahtMYraWWSeM0fTe/Y1SWqg1RzK8o8xCULPPbPh 1EEUrpWO6WV+7p0tHrEP1NR30Jqv0+Wok8oJKq9kJlJeH9GMTYSr+0GUWhW5USiEWR6S /D0mmAT//nCSdj/SjW+IPkQ5/+DF4jTV0jX+RykRQVHxBV8qzScBHrmSVrhDRNoNF8cI CRYdQVD07gyd83LbEkpQWoWDcecXAzjNTueyVpMliqiLkL1XTLbYMI4hja1Ov3OmbRH2 tCKQtUme53rbxBJIymCg5pcyJuSnJ5x0fnrulA5D/Xf+cgY5FGbMgq5vEl8h7yDLPz/x 1AmQ== X-Gm-Message-State: AOJu0YxAh23GpbyZ0T4HRrdhdOzdGsu1AofgUCiB6R1REqQi8y3yyg+F 0fbUd4IyoDDvD/GZBPBU5WfYSsewm7GPoDi7FwlPVt3E/IWFlZhYD/+0Ckyw37cS0qQ/KISGTyt +WM/o3Q== X-Google-Smtp-Source: AGHT+IGwEPS+54xHTj1WLc2RozgiHTVRs/JAtXBWsrP170/ns+k4dWPF4oS1zFaxEM0zjKgp2hbeKg== X-Received: by 2002:a05:6820:545:b0:5c4:57d:691e with SMTP id 006d021491bc7-5dfacf23410mr18429522eaf.2.1725456456681; Wed, 04 Sep 2024 06:27:36 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:35 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 18/36] arm: [MVE intrinsics] add viddup shape Date: Wed, 4 Sep 2024 13:26:32 +0000 Message-Id: <20240904132650.2720446-19-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the viddup shape description for vidup and vddup. This requires the addition of report_not_one_of and function_checker::require_immediate_one_of to gcc/config/arm/arm-mve-builtins.cc (they are copies of the aarch64 SVE counterpart). This patch also introduces MODE_wb. 2024-08-21 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (viddup): New. * config/arm/arm-mve-builtins-shapes.h (viddup): New. * config/arm/arm-mve-builtins.cc (report_not_one_of): New. (function_checker::require_immediate_one_of): New. * config/arm/arm-mve-builtins.def (wb): New mode. * config/arm/arm-mve-builtins.h (function_checker) Add require_immediate_one_of. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 85 +++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 1 + gcc/config/arm/arm-mve-builtins.cc | 44 ++++++++++++ gcc/config/arm/arm-mve-builtins.def | 1 + gcc/config/arm/arm-mve-builtins.h | 2 + 5 files changed, 133 insertions(+) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index 8a849c2bc02..971e86a2727 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -2191,6 +2191,91 @@ struct vcvtx_def : public overloaded_base<0> }; SHAPE (vcvtx) +/* _t vfoo[_n]_t0(uint32_t, const int) + _t vfoo[_wb]_t0(uint32_t *, const int) + + Shape for vector increment or decrement and duplicate operations that take + an integer or pointer to integer first argument and an immediate, and + produce a vector. + + Check that 'imm' is one of 1, 2, 4 or 8. + + Example: vddupq. + uint8x16_t [__arm_]vddupq[_n]_u8(uint32_t a, const int imm) + uint8x16_t [__arm_]vddupq[_wb]_u8(uint32_t *a, const int imm) + uint8x16_t [__arm_]vddupq_m[_n_u8](uint8x16_t inactive, uint32_t a, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vddupq_m[_wb_u8](uint8x16_t inactive, uint32_t *a, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vddupq_x[_n]_u8(uint32_t a, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vddupq_x[_wb]_u8(uint32_t *a, const int imm, mve_pred16_t p) */ +struct viddup_def : public overloaded_base<0> +{ + bool + explicit_type_suffix_p (unsigned int i, enum predication_index pred, + enum mode_suffix_index, + type_suffix_info) const override + { + return ((i == 0) && (pred != PRED_m)); + } + + bool + skip_overload_p (enum predication_index, enum mode_suffix_index mode) const override + { + /* For MODE_wb, share the overloaded instance with MODE_n. */ + if (mode == MODE_wb) + return true; + + return false; + } + + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + build_all (b, "v0,su32,su64", group, MODE_n, preserve_user_namespace); + build_all (b, "v0,as,su64", group, MODE_wb, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index type_suffix = NUM_TYPE_SUFFIXES; + if (!r.check_gp_argument (2, i, nargs)) + return error_mark_node; + + type_suffix = r.type_suffix_ids[0]; + /* With PRED_m, ther is no type suffix, so infer it from the first (inactive) + argument. */ + if (type_suffix == NUM_TYPE_SUFFIXES) + type_suffix = r.infer_vector_type (0); + + unsigned int last_arg = i - 1; + /* Check that last_arg is either scalar or pointer. */ + if (!r.scalar_argument_p (last_arg)) + return error_mark_node; + + if (!r.require_integer_immediate (last_arg + 1)) + return error_mark_node; + + /* With MODE_n we expect a scalar, with MODE_wb we expect a pointer. */ + mode_suffix_index mode_suffix; + if (POINTER_TYPE_P (r.get_argument_type (last_arg))) + mode_suffix = MODE_wb; + else + mode_suffix = MODE_n; + + return r.resolve_to (mode_suffix, type_suffix); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_one_of (1, 1, 2, 4, 8); + } +}; +SHAPE (viddup) + /* _t vfoo[_t0](_t, _t, mve_pred16_t) i.e. a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 80340dc33ec..186287c1620 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -82,6 +82,7 @@ namespace arm_mve extern const function_shape *const vcvt_f16_f32; extern const function_shape *const vcvt_f32_f16; extern const function_shape *const vcvtx; + extern const function_shape *const viddup; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */ diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 84d94bb634f..1180421bf0a 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -630,6 +630,20 @@ report_not_enum (location_t location, tree fndecl, unsigned int argno, " a valid %qT value", actual, argno + 1, fndecl, enumtype); } +/* Report that LOCATION has a call to FNDECL in which argument ARGNO has + the value ACTUAL, whereas the function requires one of VALUE0..3. + ARGNO counts from zero. */ +static void +report_not_one_of (location_t location, tree fndecl, unsigned int argno, + HOST_WIDE_INT actual, HOST_WIDE_INT value0, + HOST_WIDE_INT value1, HOST_WIDE_INT value2, + HOST_WIDE_INT value3) +{ + error_at (location, "passing %wd to argument %d of %qE, which expects" + " %wd, %wd, %wd or %wd", actual, argno + 1, fndecl, value0, value1, + value2, value3); +} + /* Checks that the mve.fp extension is enabled, given that REQUIRES_FLOAT indicates whether it is required or not for function FNDECL. Report an error against LOCATION if not. */ @@ -1969,6 +1983,36 @@ function_checker::require_immediate_enum (unsigned int rel_argno, tree type) return false; } +/* Check that argument REL_ARGNO is an integer constant expression that + has one of the given values. */ +bool +function_checker::require_immediate_one_of (unsigned int rel_argno, + HOST_WIDE_INT value0, + HOST_WIDE_INT value1, + HOST_WIDE_INT value2, + HOST_WIDE_INT value3) +{ + unsigned int argno = m_base_arg + rel_argno; + if (!argument_exists_p (argno)) + return true; + + HOST_WIDE_INT actual; + if (!require_immediate (argno, actual)) + return false; + + if (actual != value0 + && actual != value1 + && actual != value2 + && actual != value3) + { + report_not_one_of (location, fndecl, argno, actual, + value0, value1, value2, value3); + return false; + } + + return true; +} + /* Check that argument REL_ARGNO is an integer constant expression in the range [MIN, MAX]. REL_ARGNO counts from the end of the predication arguments. */ diff --git a/gcc/config/arm/arm-mve-builtins.def b/gcc/config/arm/arm-mve-builtins.def index 24ebb3375f0..265cc7b0c69 100644 --- a/gcc/config/arm/arm-mve-builtins.def +++ b/gcc/config/arm/arm-mve-builtins.def @@ -36,6 +36,7 @@ DEF_MVE_MODE (n, none, none, none) DEF_MVE_MODE (offset, none, none, bytes) DEF_MVE_MODE (r, none, none, none) +DEF_MVE_MODE (wb, none, none, none) #define REQUIRES_FLOAT false DEF_MVE_TYPE (mve_pred16_t, boolean_type_node) diff --git a/gcc/config/arm/arm-mve-builtins.h b/gcc/config/arm/arm-mve-builtins.h index 3306736bff0..fe7adf0e054 100644 --- a/gcc/config/arm/arm-mve-builtins.h +++ b/gcc/config/arm/arm-mve-builtins.h @@ -433,6 +433,8 @@ public: bool require_immediate_enum (unsigned int, tree); bool require_immediate_lane_index (unsigned int, unsigned int = 1); + bool require_immediate_one_of (unsigned int, HOST_WIDE_INT, HOST_WIDE_INT, + HOST_WIDE_INT, HOST_WIDE_INT); bool require_immediate_range (unsigned int, HOST_WIDE_INT, HOST_WIDE_INT); bool check (); From patchwork Wed Sep 4 13:26:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980812 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=BRlbcGnB; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNjg2Xffz1yg3 for ; Wed, 4 Sep 2024 23:33:27 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DDD883865488 for ; Wed, 4 Sep 2024 13:33:24 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2e.google.com (mail-oo1-xc2e.google.com [IPv6:2607:f8b0:4864:20::c2e]) by sourceware.org (Postfix) with ESMTPS id F1FA03858C50 for ; Wed, 4 Sep 2024 13:27:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F1FA03858C50 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org F1FA03858C50 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456465; cv=none; b=J3b4eG6opMHiukvq7sHqW4qtAkqRYsmKJjjzz65olik7CNJWrVLlINY1DgMiIWqfoPLqysVUtmYMftvgsTKWCed0pYQvP0/wVQakFOy2Ua4MrChMddsKhuQoPhYi6CQOapTgMTnvEsVxmBkBTicunO8V9WLjHOoYFzjkulXjDqs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456465; c=relaxed/simple; bh=0AgSqBZ2qXFMELCvm8KJAfCtgvtATc9rvhiKZaxbQj0=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Ywr3HsYoxZTU44Hyl/O0FybXjSMBVaxCA202ri3wWMC8JG0MS+6mxroK2ywqfOEN5lZMONZDUk35LFG43vv9X2h3sNXqDvYLeWpfWHtiSWOSbRnjMlebF3qmsUZP9UZjipxJZk9DhyWB6Yla/YDC+PQtXm415HY74YkJ05kFZ8w= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2e.google.com with SMTP id 006d021491bc7-5df9343b5b8so4226118eaf.0 for ; Wed, 04 Sep 2024 06:27:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456458; x=1726061258; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=y0fqHo4aQdR9reDiLKZYLcCevm5XQTIE1HpjctExAEs=; b=BRlbcGnBGwyplD8KMjWFsDrZwVwOTbNeoLohRZu9qjKSDxTvv4jb4pHbk2SsJ1I61y yN8fr136V11Bi3X360zpgc0nynX3r4IxjQOWEmL87rMOllOV4gLp6BUW2eX3JkaSlfoB PTh6wDCH/+Y2EcORLGAdrMTq1XplnxTfRB3iPts0t0OKsd2OtTXPL0g4gTYe+hmtmCU/ XSzUwIDdry/lNsQVTgMR8gTdo7w/sDhrcs+CrFLcis6zuFkmtr3eyXKdYdB6DVtKNe/M bpJyjOM/BmJ1/QbjM7BzxSAVWhe/VLK7v8jRWaMMY//ij6he/z/qX9vEvDejbNhsoDR3 Q3+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456458; x=1726061258; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=y0fqHo4aQdR9reDiLKZYLcCevm5XQTIE1HpjctExAEs=; b=YKI8MtzSKxmVMhDSaI6ZeE19WykV1PbcsXf68hPdrCPfM+0yAMvFSKRgTMGSdUG9s2 4FOrTCMOmh4/xBmYsDJSFyGMI5PUUBVaynkH96gCDNtmHpOhGNtAtDXRV9lvMisOuEVR ygXZaPofmVJKASPp0jPmgfSvxOmUZQgs9S2HK+iVgDkO6MS23+/fG83jdLAZa20YBwFZ OII6M09ugJ5JH+RZJM+LeEVyMKFctwkQ4oU2BlQ0ttlYEqRx0NV2L5WcgbuEZ1U5w/cS BU9qRLfWIwO0KaWdy0TTU3OJYuwyPFqo4Vqs+eZEciqvG7qsXk7A/VhaaChz0OTBh6G4 P95w== X-Gm-Message-State: AOJu0YzgzWZW6uWoZi+o4b6w+D+uCtwzGEWy5DegTpeB7/Qpi8aJB0BH VfFAVNjwTQVM9hRQjjiULdFchRmwQAV4JwF09I7OERw5pqzJ9jAjZ/9b7Y5S29I1k6BIf1GgfER LoPWpRA== X-Google-Smtp-Source: AGHT+IEAZrQBwWtleDxM8hyHK11MtSazyfCXJ/WMsO6f5ueLemcuzSd5N/f/wxP8nGD1V4n4uRaGMg== X-Received: by 2002:a05:6820:545:b0:5d5:a343:d135 with SMTP id 006d021491bc7-5dfacddf0a9mr18593950eaf.1.1725456457514; Wed, 04 Sep 2024 06:27:37 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:37 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 19/36] arm: [MVE intrinsics] rework vddup vidup Date: Wed, 4 Sep 2024 13:26:33 +0000 Message-Id: <20240904132650.2720446-20-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vddup and vidup using the new MVE builtins framework. We generate better code because we take advantage of the two outputs produced by the v[id]dup instructions. For instance, before: ldr r3, [r0] sub r2, r3, #8 str r2, [r0] mov r2, r3 vddup.u16 q3, r2, #1 now: ldr r2, [r0] vddup.u16 q3, r2, #1 str r2, [r0] 2024-08-21 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class viddup_impl): New. (vddup): New. (vidup): New. * config/arm/arm-mve-builtins-base.def (vddupq): New. (vidupq): New. * config/arm/arm-mve-builtins-base.h (vddupq): New. (vidupq): New. * config/arm/arm_mve.h (vddupq_m): Delete. (vddupq_u8): Delete. (vddupq_u32): Delete. (vddupq_u16): Delete. (vidupq_m): Delete. (vidupq_u8): Delete. (vidupq_u32): Delete. (vidupq_u16): Delete. (vddupq_x_u8): Delete. (vddupq_x_u16): Delete. (vddupq_x_u32): Delete. (vidupq_x_u8): Delete. (vidupq_x_u16): Delete. (vidupq_x_u32): Delete. (vddupq_m_n_u8): Delete. (vddupq_m_n_u32): Delete. (vddupq_m_n_u16): Delete. (vddupq_m_wb_u8): Delete. (vddupq_m_wb_u16): Delete. (vddupq_m_wb_u32): Delete. (vddupq_n_u8): Delete. (vddupq_n_u32): Delete. (vddupq_n_u16): Delete. (vddupq_wb_u8): Delete. (vddupq_wb_u16): Delete. (vddupq_wb_u32): Delete. (vidupq_m_n_u8): Delete. (vidupq_m_n_u32): Delete. (vidupq_m_n_u16): Delete. (vidupq_m_wb_u8): Delete. (vidupq_m_wb_u16): Delete. (vidupq_m_wb_u32): Delete. (vidupq_n_u8): Delete. (vidupq_n_u32): Delete. (vidupq_n_u16): Delete. (vidupq_wb_u8): Delete. (vidupq_wb_u16): Delete. (vidupq_wb_u32): Delete. (vddupq_x_n_u8): Delete. (vddupq_x_n_u16): Delete. (vddupq_x_n_u32): Delete. (vddupq_x_wb_u8): Delete. (vddupq_x_wb_u16): Delete. (vddupq_x_wb_u32): Delete. (vidupq_x_n_u8): Delete. (vidupq_x_n_u16): Delete. (vidupq_x_n_u32): Delete. (vidupq_x_wb_u8): Delete. (vidupq_x_wb_u16): Delete. (vidupq_x_wb_u32): Delete. (__arm_vddupq_m_n_u8): Delete. (__arm_vddupq_m_n_u32): Delete. (__arm_vddupq_m_n_u16): Delete. (__arm_vddupq_m_wb_u8): Delete. (__arm_vddupq_m_wb_u16): Delete. (__arm_vddupq_m_wb_u32): Delete. (__arm_vddupq_n_u8): Delete. (__arm_vddupq_n_u32): Delete. (__arm_vddupq_n_u16): Delete. (__arm_vidupq_m_n_u8): Delete. (__arm_vidupq_m_n_u32): Delete. (__arm_vidupq_m_n_u16): Delete. (__arm_vidupq_n_u8): Delete. (__arm_vidupq_m_wb_u8): Delete. (__arm_vidupq_m_wb_u16): Delete. (__arm_vidupq_m_wb_u32): Delete. (__arm_vidupq_n_u32): Delete. (__arm_vidupq_n_u16): Delete. (__arm_vidupq_wb_u8): Delete. (__arm_vidupq_wb_u16): Delete. (__arm_vidupq_wb_u32): Delete. (__arm_vddupq_wb_u8): Delete. (__arm_vddupq_wb_u16): Delete. (__arm_vddupq_wb_u32): Delete. (__arm_vddupq_x_n_u8): Delete. (__arm_vddupq_x_n_u16): Delete. (__arm_vddupq_x_n_u32): Delete. (__arm_vddupq_x_wb_u8): Delete. (__arm_vddupq_x_wb_u16): Delete. (__arm_vddupq_x_wb_u32): Delete. (__arm_vidupq_x_n_u8): Delete. (__arm_vidupq_x_n_u16): Delete. (__arm_vidupq_x_n_u32): Delete. (__arm_vidupq_x_wb_u8): Delete. (__arm_vidupq_x_wb_u16): Delete. (__arm_vidupq_x_wb_u32): Delete. (__arm_vddupq_m): Delete. (__arm_vddupq_u8): Delete. (__arm_vddupq_u32): Delete. (__arm_vddupq_u16): Delete. (__arm_vidupq_m): Delete. (__arm_vidupq_u8): Delete. (__arm_vidupq_u32): Delete. (__arm_vidupq_u16): Delete. (__arm_vddupq_x_u8): Delete. (__arm_vddupq_x_u16): Delete. (__arm_vddupq_x_u32): Delete. (__arm_vidupq_x_u8): Delete. (__arm_vidupq_x_u16): Delete. (__arm_vidupq_x_u32): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 112 ++++ gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 2 + gcc/config/arm/arm_mve.h | 676 ----------------------- 4 files changed, 116 insertions(+), 676 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 89724320d43..3d8bcdabe24 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -30,6 +30,7 @@ #include "basic-block.h" #include "function.h" #include "gimple.h" +#include "emit-rtl.h" #include "arm-mve-builtins.h" #include "arm-mve-builtins-shapes.h" #include "arm-mve-builtins-base.h" @@ -349,6 +350,115 @@ public: } }; +/* Map the vidup / vddup function directly to CODE (UNSPEC, M) where M is the + vector mode associated with type suffix 0. We need this special case + because in MODE_wb the builtins derefrence the first parameter and update + its contents. We also have to insert the two additional parameters needed + by the builtins compared to the intrinsics. */ +class viddup_impl : public function_base +{ +public: + CONSTEXPR viddup_impl (bool inc_dec) + : m_inc_dec (inc_dec) + {} + + /* Increment (true) or decrement (false). */ + bool m_inc_dec; + + unsigned int + call_properties (const function_instance &fi) const override + { + if (fi.mode_suffix_id == MODE_wb) + return CP_WRITE_MEMORY | CP_READ_MEMORY; + else + return 0; + } + + tree + memory_scalar_type (const function_instance &) const override + { + return get_typenode_from_name (UINT32_TYPE); + } + + rtx + expand (function_expander &e) const override + { + machine_mode mode = e.vector_mode (0); + insn_code code; + rtx insns, offset_ptr; + rtx new_offset; + int offset_arg_no; + rtx incr, total_incr; + + if (! e.type_suffix (0).integer_p) + gcc_unreachable (); + + if ((e.mode_suffix_id != MODE_n) + && (e.mode_suffix_id != MODE_wb)) + gcc_unreachable (); + + offset_arg_no = (e.pred == PRED_m) ? 1 : 0; + + /* In _wb mode, the start offset is passed via a pointer, + dereference it. */ + if (e.mode_suffix_id == MODE_wb) + { + rtx offset = gen_reg_rtx (SImode); + offset_ptr = e.args[offset_arg_no]; + emit_insn (gen_rtx_SET (offset, gen_rtx_MEM (SImode, offset_ptr))); + e.args[offset_arg_no] = offset; + } + + /* We have to shuffle parameters because the builtin needs additional + arguments: + - the updated "new_offset" + - total increment (incr * number of lanes) */ + new_offset = gen_reg_rtx (SImode); + e.args.quick_insert (offset_arg_no, new_offset); + + incr = e.args[offset_arg_no + 2]; + total_incr = gen_int_mode (INTVAL (incr) + * GET_MODE_NUNITS (e.vector_mode (0)), + SImode); + e.args.quick_push (total_incr); + + /* _wb mode uses the _n builtins and adds code to update the + offset. */ + switch (e.pred) + { + case PRED_none: + /* No predicate. */ + code = m_inc_dec + ? code_for_mve_q_u_insn (VIDUPQ, mode) + : code_for_mve_q_u_insn (VDDUPQ, mode); + insns = e.use_exact_insn (code); + break; + + case PRED_m: + case PRED_x: + /* "m" or "x" predicate. */ + code = m_inc_dec + ? code_for_mve_q_m_wb_u_insn (VIDUPQ_M, mode) + : code_for_mve_q_m_wb_u_insn (VDDUPQ_M, mode); + + if (e.pred == PRED_m) + insns = e.use_cond_insn (code, 0); + else + insns = e.use_pred_x_insn (code); + break; + + default: + gcc_unreachable (); + } + + /* Update offset as appropriate. */ + if (e.mode_suffix_id == MODE_wb) + emit_insn (gen_rtx_SET (gen_rtx_MEM (Pmode, offset_ptr), new_offset)); + + return insns; + } +}; + } /* end anonymous namespace */ namespace arm_mve { @@ -561,7 +671,9 @@ FUNCTION_WITHOUT_N_NO_F (vcvtnq, VCVTNQ) FUNCTION_WITHOUT_N_NO_F (vcvtpq, VCVTPQ) FUNCTION (vcvtbq, vcvtxq_impl, (VCVTBQ_F16_F32, VCVTBQ_M_F16_F32, VCVTBQ_F32_F16, VCVTBQ_M_F32_F16)) FUNCTION (vcvttq, vcvtxq_impl, (VCVTTQ_F16_F32, VCVTTQ_M_F16_F32, VCVTTQ_F32_F16, VCVTTQ_M_F32_F16)) +FUNCTION (vddupq, viddup_impl, (false)) FUNCTION_ONLY_N (vdupq, VDUPQ) +FUNCTION (vidupq, viddup_impl, (true)) FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ) FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1, VFMAQ_M_F, -1, -1, VFMAQ_M_N_F)) FUNCTION (vfmasq, unspec_mve_function_exact_insn, (-1, -1, -1, -1, -1, VFMASQ_N_F, -1, -1, -1, -1, -1, VFMASQ_M_N_F)) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index dd46d882882..ed3048e219a 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -46,12 +46,14 @@ DEF_MVE_FUNCTION (vctp16q, vctp, none, m_or_none) DEF_MVE_FUNCTION (vctp32q, vctp, none, m_or_none) DEF_MVE_FUNCTION (vctp64q, vctp, none, m_or_none) DEF_MVE_FUNCTION (vctp8q, vctp, none, m_or_none) +DEF_MVE_FUNCTION (vddupq, viddup, all_unsigned, mx_or_none) DEF_MVE_FUNCTION (vdupq, unary_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (veorq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vhaddq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vhcaddq_rot270, binary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vhcaddq_rot90, binary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vhsubq, binary_opt_n, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vidupq, viddup, all_unsigned, mx_or_none) DEF_MVE_FUNCTION (vld1q, load, all_integer, none) DEF_MVE_FUNCTION (vmaxaq, binary_maxamina, all_signed, m_or_none) DEF_MVE_FUNCTION (vmaxavq, binary_maxavminav, all_signed, p_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 41fcf666b11..526e0f8ee3a 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -66,6 +66,7 @@ extern const function_base *const vcvtnq; extern const function_base *const vcvtpq; extern const function_base *const vcvtq; extern const function_base *const vcvttq; +extern const function_base *const vddupq; extern const function_base *const vdupq; extern const function_base *const veorq; extern const function_base *const vfmaq; @@ -75,6 +76,7 @@ extern const function_base *const vhaddq; extern const function_base *const vhcaddq_rot270; extern const function_base *const vhcaddq_rot90; extern const function_base *const vhsubq; +extern const function_base *const vidupq; extern const function_base *const vld1q; extern const function_base *const vmaxaq; extern const function_base *const vmaxavq; diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 49c4ea9afee..c3da491b9d1 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -82,18 +82,10 @@ #define vstrwq_scatter_shifted_offset_p(__base, __offset, __value, __p) __arm_vstrwq_scatter_shifted_offset_p(__base, __offset, __value, __p) #define vstrwq_scatter_shifted_offset(__base, __offset, __value) __arm_vstrwq_scatter_shifted_offset(__base, __offset, __value) #define vuninitializedq(__v) __arm_vuninitializedq(__v) -#define vddupq_m(__inactive, __a, __imm, __p) __arm_vddupq_m(__inactive, __a, __imm, __p) -#define vddupq_u8(__a, __imm) __arm_vddupq_u8(__a, __imm) -#define vddupq_u32(__a, __imm) __arm_vddupq_u32(__a, __imm) -#define vddupq_u16(__a, __imm) __arm_vddupq_u16(__a, __imm) #define vdwdupq_m(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m(__inactive, __a, __b, __imm, __p) #define vdwdupq_u8(__a, __b, __imm) __arm_vdwdupq_u8(__a, __b, __imm) #define vdwdupq_u32(__a, __b, __imm) __arm_vdwdupq_u32(__a, __b, __imm) #define vdwdupq_u16(__a, __b, __imm) __arm_vdwdupq_u16(__a, __b, __imm) -#define vidupq_m(__inactive, __a, __imm, __p) __arm_vidupq_m(__inactive, __a, __imm, __p) -#define vidupq_u8(__a, __imm) __arm_vidupq_u8(__a, __imm) -#define vidupq_u32(__a, __imm) __arm_vidupq_u32(__a, __imm) -#define vidupq_u16(__a, __imm) __arm_vidupq_u16(__a, __imm) #define viwdupq_m(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m(__inactive, __a, __b, __imm, __p) #define viwdupq_u8(__a, __b, __imm) __arm_viwdupq_u8(__a, __b, __imm) #define viwdupq_u32(__a, __b, __imm) __arm_viwdupq_u32(__a, __b, __imm) @@ -102,15 +94,9 @@ #define vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb(__addr, __offset, __value) -#define vddupq_x_u8(__a, __imm, __p) __arm_vddupq_x_u8(__a, __imm, __p) -#define vddupq_x_u16(__a, __imm, __p) __arm_vddupq_x_u16(__a, __imm, __p) -#define vddupq_x_u32(__a, __imm, __p) __arm_vddupq_x_u32(__a, __imm, __p) #define vdwdupq_x_u8(__a, __b, __imm, __p) __arm_vdwdupq_x_u8(__a, __b, __imm, __p) #define vdwdupq_x_u16(__a, __b, __imm, __p) __arm_vdwdupq_x_u16(__a, __b, __imm, __p) #define vdwdupq_x_u32(__a, __b, __imm, __p) __arm_vdwdupq_x_u32(__a, __b, __imm, __p) -#define vidupq_x_u8(__a, __imm, __p) __arm_vidupq_x_u8(__a, __imm, __p) -#define vidupq_x_u16(__a, __imm, __p) __arm_vidupq_x_u16(__a, __imm, __p) -#define vidupq_x_u32(__a, __imm, __p) __arm_vidupq_x_u32(__a, __imm, __p) #define viwdupq_x_u8(__a, __b, __imm, __p) __arm_viwdupq_x_u8(__a, __b, __imm, __p) #define viwdupq_x_u16(__a, __b, __imm, __p) __arm_viwdupq_x_u16(__a, __b, __imm, __p) #define viwdupq_x_u32(__a, __b, __imm, __p) __arm_viwdupq_x_u32(__a, __b, __imm, __p) @@ -337,18 +323,6 @@ #define vuninitializedq_s64(void) __arm_vuninitializedq_s64(void) #define vuninitializedq_f16(void) __arm_vuninitializedq_f16(void) #define vuninitializedq_f32(void) __arm_vuninitializedq_f32(void) -#define vddupq_m_n_u8(__inactive, __a, __imm, __p) __arm_vddupq_m_n_u8(__inactive, __a, __imm, __p) -#define vddupq_m_n_u32(__inactive, __a, __imm, __p) __arm_vddupq_m_n_u32(__inactive, __a, __imm, __p) -#define vddupq_m_n_u16(__inactive, __a, __imm, __p) __arm_vddupq_m_n_u16(__inactive, __a, __imm, __p) -#define vddupq_m_wb_u8(__inactive, __a, __imm, __p) __arm_vddupq_m_wb_u8(__inactive, __a, __imm, __p) -#define vddupq_m_wb_u16(__inactive, __a, __imm, __p) __arm_vddupq_m_wb_u16(__inactive, __a, __imm, __p) -#define vddupq_m_wb_u32(__inactive, __a, __imm, __p) __arm_vddupq_m_wb_u32(__inactive, __a, __imm, __p) -#define vddupq_n_u8(__a, __imm) __arm_vddupq_n_u8(__a, __imm) -#define vddupq_n_u32(__a, __imm) __arm_vddupq_n_u32(__a, __imm) -#define vddupq_n_u16(__a, __imm) __arm_vddupq_n_u16(__a, __imm) -#define vddupq_wb_u8( __a, __imm) __arm_vddupq_wb_u8( __a, __imm) -#define vddupq_wb_u16( __a, __imm) __arm_vddupq_wb_u16( __a, __imm) -#define vddupq_wb_u32( __a, __imm) __arm_vddupq_wb_u32( __a, __imm) #define vdwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) #define vdwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) #define vdwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) @@ -361,18 +335,6 @@ #define vdwdupq_wb_u8( __a, __b, __imm) __arm_vdwdupq_wb_u8( __a, __b, __imm) #define vdwdupq_wb_u32( __a, __b, __imm) __arm_vdwdupq_wb_u32( __a, __b, __imm) #define vdwdupq_wb_u16( __a, __b, __imm) __arm_vdwdupq_wb_u16( __a, __b, __imm) -#define vidupq_m_n_u8(__inactive, __a, __imm, __p) __arm_vidupq_m_n_u8(__inactive, __a, __imm, __p) -#define vidupq_m_n_u32(__inactive, __a, __imm, __p) __arm_vidupq_m_n_u32(__inactive, __a, __imm, __p) -#define vidupq_m_n_u16(__inactive, __a, __imm, __p) __arm_vidupq_m_n_u16(__inactive, __a, __imm, __p) -#define vidupq_m_wb_u8(__inactive, __a, __imm, __p) __arm_vidupq_m_wb_u8(__inactive, __a, __imm, __p) -#define vidupq_m_wb_u16(__inactive, __a, __imm, __p) __arm_vidupq_m_wb_u16(__inactive, __a, __imm, __p) -#define vidupq_m_wb_u32(__inactive, __a, __imm, __p) __arm_vidupq_m_wb_u32(__inactive, __a, __imm, __p) -#define vidupq_n_u8(__a, __imm) __arm_vidupq_n_u8(__a, __imm) -#define vidupq_n_u32(__a, __imm) __arm_vidupq_n_u32(__a, __imm) -#define vidupq_n_u16(__a, __imm) __arm_vidupq_n_u16(__a, __imm) -#define vidupq_wb_u8( __a, __imm) __arm_vidupq_wb_u8( __a, __imm) -#define vidupq_wb_u16( __a, __imm) __arm_vidupq_wb_u16( __a, __imm) -#define vidupq_wb_u32( __a, __imm) __arm_vidupq_wb_u32( __a, __imm) #define viwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) #define viwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) #define viwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) @@ -405,24 +367,12 @@ #define vstrwq_scatter_base_wb_s32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_s32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_u32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_u32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_f32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_f32(__addr, __offset, __value) -#define vddupq_x_n_u8(__a, __imm, __p) __arm_vddupq_x_n_u8(__a, __imm, __p) -#define vddupq_x_n_u16(__a, __imm, __p) __arm_vddupq_x_n_u16(__a, __imm, __p) -#define vddupq_x_n_u32(__a, __imm, __p) __arm_vddupq_x_n_u32(__a, __imm, __p) -#define vddupq_x_wb_u8(__a, __imm, __p) __arm_vddupq_x_wb_u8(__a, __imm, __p) -#define vddupq_x_wb_u16(__a, __imm, __p) __arm_vddupq_x_wb_u16(__a, __imm, __p) -#define vddupq_x_wb_u32(__a, __imm, __p) __arm_vddupq_x_wb_u32(__a, __imm, __p) #define vdwdupq_x_n_u8(__a, __b, __imm, __p) __arm_vdwdupq_x_n_u8(__a, __b, __imm, __p) #define vdwdupq_x_n_u16(__a, __b, __imm, __p) __arm_vdwdupq_x_n_u16(__a, __b, __imm, __p) #define vdwdupq_x_n_u32(__a, __b, __imm, __p) __arm_vdwdupq_x_n_u32(__a, __b, __imm, __p) #define vdwdupq_x_wb_u8(__a, __b, __imm, __p) __arm_vdwdupq_x_wb_u8(__a, __b, __imm, __p) #define vdwdupq_x_wb_u16(__a, __b, __imm, __p) __arm_vdwdupq_x_wb_u16(__a, __b, __imm, __p) #define vdwdupq_x_wb_u32(__a, __b, __imm, __p) __arm_vdwdupq_x_wb_u32(__a, __b, __imm, __p) -#define vidupq_x_n_u8(__a, __imm, __p) __arm_vidupq_x_n_u8(__a, __imm, __p) -#define vidupq_x_n_u16(__a, __imm, __p) __arm_vidupq_x_n_u16(__a, __imm, __p) -#define vidupq_x_n_u32(__a, __imm, __p) __arm_vidupq_x_n_u32(__a, __imm, __p) -#define vidupq_x_wb_u8(__a, __imm, __p) __arm_vidupq_x_wb_u8(__a, __imm, __p) -#define vidupq_x_wb_u16(__a, __imm, __p) __arm_vidupq_x_wb_u16(__a, __imm, __p) -#define vidupq_x_wb_u32(__a, __imm, __p) __arm_vidupq_x_wb_u32(__a, __imm, __p) #define viwdupq_x_n_u8(__a, __b, __imm, __p) __arm_viwdupq_x_n_u8(__a, __b, __imm, __p) #define viwdupq_x_n_u16(__a, __b, __imm, __p) __arm_viwdupq_x_n_u16(__a, __b, __imm, __p) #define viwdupq_x_n_u32(__a, __b, __imm, __p) __arm_viwdupq_x_n_u32(__a, __b, __imm, __p) @@ -1722,75 +1672,6 @@ __arm_vstrwq_scatter_shifted_offset_u32 (uint32_t * __base, uint32x4_t __offset, __builtin_mve_vstrwq_scatter_shifted_offset_uv4si ((__builtin_neon_si *) __base, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vddupq_m_n_uv16qi (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vddupq_m_n_uv4si (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vddupq_m_n_uv8hi (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - uint8x16_t __res = __builtin_mve_vddupq_m_n_uv16qi (__inactive, * __a, __imm, __p); - *__a -= __imm * 16u; - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - uint16x8_t __res = __builtin_mve_vddupq_m_n_uv8hi (__inactive, *__a, __imm, __p); - *__a -= __imm * 8u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - uint32x4_t __res = __builtin_mve_vddupq_m_n_uv4si (__inactive, *__a, __imm, __p); - *__a -= __imm * 4u; - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_n_u8 (uint32_t __a, const int __imm) -{ - return __builtin_mve_vddupq_n_uv16qi (__a, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_n_u32 (uint32_t __a, const int __imm) -{ - return __builtin_mve_vddupq_n_uv4si (__a, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_n_u16 (uint32_t __a, const int __imm) -{ - return __builtin_mve_vddupq_n_uv8hi (__a, __imm); -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vdwdupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -1899,129 +1780,6 @@ __arm_vdwdupq_wb_u16 (uint32_t * __a, uint32_t __b, const int __imm) return __res; } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vidupq_m_n_uv16qi (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vidupq_m_n_uv4si (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vidupq_m_n_uv8hi (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_n_u8 (uint32_t __a, const int __imm) -{ - return __builtin_mve_vidupq_n_uv16qi (__a, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - uint8x16_t __res = __builtin_mve_vidupq_m_n_uv16qi (__inactive, *__a, __imm, __p); - *__a += __imm * 16u; - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - uint16x8_t __res = __builtin_mve_vidupq_m_n_uv8hi (__inactive, *__a, __imm, __p); - *__a += __imm * 8u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - uint32x4_t __res = __builtin_mve_vidupq_m_n_uv4si (__inactive, *__a, __imm, __p); - *__a += __imm * 4u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_n_u32 (uint32_t __a, const int __imm) -{ - return __builtin_mve_vidupq_n_uv4si (__a, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_n_u16 (uint32_t __a, const int __imm) -{ - return __builtin_mve_vidupq_n_uv8hi (__a, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_wb_u8 (uint32_t * __a, const int __imm) -{ - uint8x16_t __res = __builtin_mve_vidupq_n_uv16qi (*__a, __imm); - *__a += __imm * 16u; - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_wb_u16 (uint32_t * __a, const int __imm) -{ - uint16x8_t __res = __builtin_mve_vidupq_n_uv8hi (*__a, __imm); - *__a += __imm * 8u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_wb_u32 (uint32_t * __a, const int __imm) -{ - uint32x4_t __res = __builtin_mve_vidupq_n_uv4si (*__a, __imm); - *__a += __imm * 4u; - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_wb_u8 (uint32_t * __a, const int __imm) -{ - uint8x16_t __res = __builtin_mve_vddupq_n_uv16qi (*__a, __imm); - *__a -= __imm * 16u; - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_wb_u16 (uint32_t * __a, const int __imm) -{ - uint16x8_t __res = __builtin_mve_vddupq_n_uv8hi (*__a, __imm); - *__a -= __imm * 8u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_wb_u32 (uint32_t * __a, const int __imm) -{ - uint32x4_t __res = __builtin_mve_vddupq_n_uv4si (*__a, __imm); - *__a -= __imm * 4u; - return __res; -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_viwdupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -2267,57 +2025,6 @@ __arm_vstrwq_scatter_base_wb_u32 (uint32x4_t * __addr, const int __offset, uint3 *__addr = __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_n_u8 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vddupq_m_n_uv16qi (__arm_vuninitializedq_u8 (), __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_n_u16 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vddupq_m_n_uv8hi (__arm_vuninitializedq_u16 (), __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_n_u32 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vddupq_m_n_uv4si (__arm_vuninitializedq_u32 (), __a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_wb_u8 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - uint8x16_t __arg1 = __arm_vuninitializedq_u8 (); - uint8x16_t __res = __builtin_mve_vddupq_m_n_uv16qi (__arg1, * __a, __imm, __p); - *__a -= __imm * 16u; - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_wb_u16 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - uint16x8_t __arg1 = __arm_vuninitializedq_u16 (); - uint16x8_t __res = __builtin_mve_vddupq_m_n_uv8hi (__arg1, *__a, __imm, __p); - *__a -= __imm * 8u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_wb_u32 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - uint32x4_t __arg1 = __arm_vuninitializedq_u32 (); - uint32x4_t __res = __builtin_mve_vddupq_m_n_uv4si (__arg1, *__a, __imm, __p); - *__a -= __imm * 4u; - return __res; -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vdwdupq_x_n_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -2375,57 +2082,6 @@ __arm_vdwdupq_x_wb_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16 return __res; } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_n_u8 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vidupq_m_n_uv16qi (__arm_vuninitializedq_u8 (), __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_n_u16 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vidupq_m_n_uv8hi (__arm_vuninitializedq_u16 (), __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_n_u32 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __builtin_mve_vidupq_m_n_uv4si (__arm_vuninitializedq_u32 (), __a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_wb_u8 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - uint8x16_t __arg1 = __arm_vuninitializedq_u8 (); - uint8x16_t __res = __builtin_mve_vidupq_m_n_uv16qi (__arg1, *__a, __imm, __p); - *__a += __imm * 16u; - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_wb_u16 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - uint16x8_t __arg1 = __arm_vuninitializedq_u16 (); - uint16x8_t __res = __builtin_mve_vidupq_m_n_uv8hi (__arg1, *__a, __imm, __p); - *__a += __imm * 8u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_wb_u32 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - uint32x4_t __arg1 = __arm_vuninitializedq_u32 (); - uint32x4_t __res = __builtin_mve_vidupq_m_n_uv4si (__arg1, *__a, __imm, __p); - *__a += __imm * 4u; - return __res; -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_viwdupq_x_n_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -4475,69 +4131,6 @@ __arm_vstrwq_scatter_shifted_offset (uint32_t * __base, uint32x4_t __offset, uin __arm_vstrwq_scatter_shifted_offset_u32 (__base, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m (uint8x16_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_m_n_u8 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m (uint32x4_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_m_n_u32 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m (uint16x8_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_m_n_u16 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m (uint8x16_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_m_wb_u8 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m (uint16x8_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_m_wb_u16 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_m (uint32x4_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_m_wb_u32 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_u8 (uint32_t __a, const int __imm) -{ - return __arm_vddupq_n_u8 (__a, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_u32 (uint32_t __a, const int __imm) -{ - return __arm_vddupq_n_u32 (__a, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_u16 (uint32_t __a, const int __imm) -{ - return __arm_vddupq_n_u16 (__a, __imm); -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vdwdupq_m (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -4622,111 +4215,6 @@ __arm_vdwdupq_u16 (uint32_t * __a, uint32_t __b, const int __imm) return __arm_vdwdupq_wb_u16 (__a, __b, __imm); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m (uint8x16_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_m_n_u8 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m (uint32x4_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_m_n_u32 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m (uint16x8_t __inactive, uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_m_n_u16 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_u8 (uint32_t __a, const int __imm) -{ - return __arm_vidupq_n_u8 (__a, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m (uint8x16_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_m_wb_u8 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m (uint16x8_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_m_wb_u16 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_m (uint32x4_t __inactive, uint32_t * __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_m_wb_u32 (__inactive, __a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_u32 (uint32_t __a, const int __imm) -{ - return __arm_vidupq_n_u32 (__a, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_u16 (uint32_t __a, const int __imm) -{ - return __arm_vidupq_n_u16 (__a, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_u8 (uint32_t * __a, const int __imm) -{ - return __arm_vidupq_wb_u8 (__a, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_u16 (uint32_t * __a, const int __imm) -{ - return __arm_vidupq_wb_u16 (__a, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_u32 (uint32_t * __a, const int __imm) -{ - return __arm_vidupq_wb_u32 (__a, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_u8 (uint32_t * __a, const int __imm) -{ - return __arm_vddupq_wb_u8 (__a, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_u16 (uint32_t * __a, const int __imm) -{ - return __arm_vddupq_wb_u16 (__a, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_u32 (uint32_t * __a, const int __imm) -{ - return __arm_vddupq_wb_u32 (__a, __imm); -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_viwdupq_m (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -4867,48 +4355,6 @@ __arm_vstrwq_scatter_base_wb (uint32x4_t * __addr, const int __offset, uint32x4_ __arm_vstrwq_scatter_base_wb_u32 (__addr, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_u8 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_x_n_u8 (__a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_u16 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_x_n_u16 (__a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_u32 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_x_n_u32 (__a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_u8 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_x_wb_u8 (__a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_u16 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_x_wb_u16 (__a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vddupq_x_u32 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - return __arm_vddupq_x_wb_u32 (__a, __imm, __p); -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vdwdupq_x_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -4951,48 +4397,6 @@ __arm_vdwdupq_x_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t return __arm_vdwdupq_x_wb_u32 (__a, __b, __imm, __p); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_u8 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_x_n_u8 (__a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_u16 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_x_n_u16 (__a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_u32 (uint32_t __a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_x_n_u32 (__a, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_u8 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_x_wb_u8 (__a, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_u16 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_x_wb_u16 (__a, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vidupq_x_u32 (uint32_t *__a, const int __imm, mve_pred16_t __p) -{ - return __arm_vidupq_x_wb_u32 (__a, __imm, __p); -} - __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_viwdupq_x_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) @@ -6773,36 +6177,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_int_n]: __arm_viwdupq_x_n_u32 ((uint32_t) __p1, p2, p3, p4), \ int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_x_wb_u32 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) -#define __arm_vidupq_x_u8(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vidupq_x_n_u8 ((uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_x_wb_u8 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - -#define __arm_vddupq_x_u8(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vddupq_x_n_u8 ((uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_x_wb_u8 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - -#define __arm_vidupq_x_u16(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vidupq_x_n_u16 ((uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_x_wb_u16 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - -#define __arm_vddupq_x_u16(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vddupq_x_n_u16 ((uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_x_wb_u16 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - -#define __arm_vidupq_x_u32(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vidupq_x_n_u32 ((uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_x_wb_u32 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - -#define __arm_vddupq_x_u32(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vddupq_x_n_u32 ((uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_x_wb_u32 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - #define __arm_vadciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -6905,56 +6279,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint16x8_t]: __arm_vldrbq_gather_offset_u16(__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint32x4_t]: __arm_vldrbq_gather_offset_u32(__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint32x4_t)));}) -#define __arm_vidupq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vidupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), (uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vidupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), (uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vidupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), (uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - -#define __arm_vddupq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vddupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), (uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vddupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), (uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vddupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), (uint32_t) __p1, p2, p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) - -#define __arm_vidupq_u16(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vidupq_n_u16 ((uint32_t) __p0, p1), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_wb_u16 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1));}) - -#define __arm_vidupq_u32(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vidupq_n_u32 ((uint32_t) __p0, p1), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_wb_u32 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1));}) - -#define __arm_vidupq_u8(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vidupq_n_u8 ((uint32_t) __p0, p1), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vidupq_wb_u8 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1));}) - -#define __arm_vddupq_u16(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vddupq_n_u16 ((uint32_t) __p0, p1), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_wb_u16 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1));}) - -#define __arm_vddupq_u32(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vddupq_n_u32 ((uint32_t) __p0, p1), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_wb_u32 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1));}) - -#define __arm_vddupq_u8(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vddupq_n_u8 ((uint32_t) __p0, p1), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_wb_u8 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1));}) - #define __arm_viwdupq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ From patchwork Wed Sep 4 13:26:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980806 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=SsYx/Yv7; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNgb40KJz1yg3 for ; Wed, 4 Sep 2024 23:31:39 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AEEDA385020B for ; Wed, 4 Sep 2024 13:31:37 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2a.google.com (mail-oo1-xc2a.google.com [IPv6:2607:f8b0:4864:20::c2a]) by sourceware.org (Postfix) with ESMTPS id AD8CC385842A for ; Wed, 4 Sep 2024 13:27:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AD8CC385842A Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AD8CC385842A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; cv=none; b=g3MEqjCV6irFcnJmuJw8lqCdBcYh7BAavsk3cZn201u3q3porM9WaWZRSKzFWPgrdTleptglCQ8OrncilQnlMp48tbczzuvdsQAdQUdoSpU9CymM9iJWIBe3Lu+Gfzz+c6yseszmNX2LmE6Uhs6zhK35tisM2VM9B1XjNl4lKaE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456462; c=relaxed/simple; bh=P10VZfhsAH26k+OMcsCrtAVhgyLP9sasTFzwt+hgFNU=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=rOKOX16GP4N1pi5tsy3rkhC6nK074M0C9ZtOyErbzc5hqPW3nd1W1STi/DD5At4AYFkvRRBZFarGHr+EENT0KTIRM2UXJl7Jr3WB0AHGFuw+XVci3YqGcXwTWSVPLT7oupt6fLx6+Uw0paolSMYxGedXEnJ2QNPsZlf5i+3c/yQ= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2a.google.com with SMTP id 006d021491bc7-5de8a3f1cc6so360457eaf.1 for ; Wed, 04 Sep 2024 06:27:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456459; x=1726061259; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UieXH1/JLiRtleeWRP1t0sn7smb0MbFDLTLX+lyI0Ls=; b=SsYx/Yv7sINbKGJ7ATyjOdfhKNt1rwGezBPfTAjke8R3YHPb4A/RsX/JaAkoPXaWYB l4rwMFB9K5llBUxY4N0yF9NS0iIttSpgUSeAz/EOiu4oAMH5gkygHCLS0yju23vY0RJT uGUQl0olDfP1jczH5fJ1IPdr1dFsfsxzoPln522+0kdmXIWO7gXSU8ZZh0dlbTymc5Ub 7J/y5wqD2QK8gRcoQmkz9Z8v2QxRnrfUFIqPb7lZdoUC68Pf6aXWoosA0sfNuDBlf+F4 D7NHUakWxgwGAa3vD3vFmli7g6LSzcpOWuD5ZKjbR4WREqKmDDuDfigNmE03wPdMp9El jGBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456459; x=1726061259; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UieXH1/JLiRtleeWRP1t0sn7smb0MbFDLTLX+lyI0Ls=; b=pgXHb+vFRB870VQBqbtJWKgt9Xi3Ihbh0g60uAdPjcv3j+S+WCAxR2ig/s+rBUbdHz lCQpoTfQ1pQHg4/8n8Z9JemsuP863xxcJaCnl/ZNvM4w39Ks0EgIaFDRV5lfJFd+x5yw CSXPpvVxuVoL/s0kDU24/5N/enAxiQEhcIfKWL2mhTufdobbC08U4OHsZnHvVMoxcC4W 9RBUpCcMhSzICnJUsLL4mb9HXas9f/cGsjHLvSTmGLXbCTgNQ/No16kF3eTeHi+aqLw+ wr5YaagLbiN6QaGZ2asHwrJAhoS/tn4PSoWm9y29qZV5LgL5AmVgI8/iJ/MZL0XilhXG jthw== X-Gm-Message-State: AOJu0Yy3pKp9Ef/HR3JyzGgdRgpaenQMLdakJvvzf2eW/GXt6hrg2tIz iBf1El8R7RXQkmOMvs25kDWTrn1veP6mMSHQ/8iRM831s6NiDISPd5bY7/LTbQTyAdx/gYAHTrH lC9mP+A== X-Google-Smtp-Source: AGHT+IEkub73OTix8dHNN3dUq58JDgFiDgeKu+EPbmBsqt0C1gleec3qC7TWxXwK5oveIox2H4DWBA== X-Received: by 2002:a4a:ea82:0:b0:5df:9614:11ff with SMTP id 006d021491bc7-5e18ec681ddmr806250eaf.3.1725456458290; Wed, 04 Sep 2024 06:27:38 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:37 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 20/36] arm: [MVE intrinsics] update v[id]dup tests Date: Wed, 4 Sep 2024 13:26:34 +0000 Message-Id: <20240904132650.2720446-21-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Testing v[id]dup overloads with '1' as argument for uint32_t* does not make sense: instead of choosing the '_wb' overload, we choose the '_n', but we already do that in the '_n' tests. This patch removes all such bogus foo2 functions. 2024-08-28 Christophe Lyon gcc/testsuite/ * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c: Remove foo2. --- .../arm/mve/intrinsics/vddupq_m_wb_u16.c | 18 +----------------- .../arm/mve/intrinsics/vddupq_m_wb_u32.c | 18 +----------------- .../arm/mve/intrinsics/vddupq_m_wb_u8.c | 18 +----------------- .../arm/mve/intrinsics/vddupq_wb_u16.c | 14 +------------- .../arm/mve/intrinsics/vddupq_wb_u32.c | 14 +------------- .../arm/mve/intrinsics/vddupq_wb_u8.c | 14 +------------- .../arm/mve/intrinsics/vddupq_x_wb_u16.c | 18 +----------------- .../arm/mve/intrinsics/vddupq_x_wb_u32.c | 18 +----------------- .../arm/mve/intrinsics/vddupq_x_wb_u8.c | 18 +----------------- .../arm/mve/intrinsics/vidupq_m_wb_u16.c | 18 +----------------- .../arm/mve/intrinsics/vidupq_m_wb_u32.c | 18 +----------------- .../arm/mve/intrinsics/vidupq_m_wb_u8.c | 18 +----------------- .../arm/mve/intrinsics/vidupq_wb_u16.c | 14 +------------- .../arm/mve/intrinsics/vidupq_wb_u32.c | 14 +------------- .../arm/mve/intrinsics/vidupq_wb_u8.c | 14 +------------- .../arm/mve/intrinsics/vidupq_x_wb_u16.c | 18 +----------------- .../arm/mve/intrinsics/vidupq_x_wb_u32.c | 18 +----------------- .../arm/mve/intrinsics/vidupq_x_wb_u8.c | 18 +----------------- 18 files changed, 18 insertions(+), 282 deletions(-) diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c index 2a907417b40..d4391358fc2 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c @@ -42,24 +42,8 @@ foo1 (uint16x8_t inactive, uint32_t *a, mve_pred16_t p) return vddupq_m (inactive, a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vddupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint16x8_t -foo2 (uint16x8_t inactive, mve_pred16_t p) -{ - return vddupq_m (inactive, 1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c index ffaf3734923..58609dae29f 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c @@ -42,24 +42,8 @@ foo1 (uint32x4_t inactive, uint32_t *a, mve_pred16_t p) return vddupq_m (inactive, a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vddupt.u32 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint32x4_t -foo2 (uint32x4_t inactive, mve_pred16_t p) -{ - return vddupq_m (inactive, 1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c index ae7a4e25fe2..a4d820b3628 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c @@ -42,24 +42,8 @@ foo1 (uint8x16_t inactive, uint32_t *a, mve_pred16_t p) return vddupq_m (inactive, a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vddupt.u8 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint8x16_t -foo2 (uint8x16_t inactive, mve_pred16_t p) -{ - return vddupq_m (inactive, 1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c index 6c54e325155..79e47bd867d 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c @@ -34,20 +34,8 @@ foo1 (uint32_t *a) return vddupq_u16 (a, 1); } -/* -**foo2: -** ... -** vddup.u16 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint16x8_t -foo2 () -{ - return vddupq_u16 (1, 1); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c index a8de90f7b12..d5cb77d3201 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c @@ -34,20 +34,8 @@ foo1 (uint32_t *a) return vddupq_u32 (a, 1); } -/* -**foo2: -** ... -** vddup.u32 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint32x4_t -foo2 () -{ - return vddupq_u32 (1, 1); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c index 5a90e069b1d..62b0f824307 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c @@ -34,20 +34,8 @@ foo1 (uint32_t *a) return vddupq_u8 (a, 1); } -/* -**foo2: -** ... -** vddup.u8 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint8x16_t -foo2 () -{ - return vddupq_u8 (1, 1); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c index dab65e08320..b765bc0d60e 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c @@ -42,24 +42,8 @@ foo1 (uint32_t *a, mve_pred16_t p) return vddupq_x_u16 (a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vddupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint16x8_t -foo2 (mve_pred16_t p) -{ - return vddupq_x_u16 (1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c index c7abcaef942..ddbd04f22e9 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c @@ -42,24 +42,8 @@ foo1 (uint32_t *a, mve_pred16_t p) return vddupq_x_u32 (a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vddupt.u32 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint32x4_t -foo2 (mve_pred16_t p) -{ - return vddupq_x_u32 (1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c index d2c299d4e3f..bbbdaa6c7b6 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c @@ -42,24 +42,8 @@ foo1 (uint32_t *a, mve_pred16_t p) return vddupq_x_u8 (a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vddupt.u8 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint8x16_t -foo2 (mve_pred16_t p) -{ - return vddupq_x_u8 (1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c index 19d04601809..9b4afdf177f 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c @@ -42,24 +42,8 @@ foo1 (uint16x8_t inactive, uint32_t *a, mve_pred16_t p) return vidupq_m (inactive, a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vidupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint16x8_t -foo2 (uint16x8_t inactive, mve_pred16_t p) -{ - return vidupq_m (inactive, 1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c index 36a8ac30564..5793d02d261 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c @@ -42,24 +42,8 @@ foo1 (uint32x4_t inactive, uint32_t *a, mve_pred16_t p) return vidupq_m (inactive, a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vidupt.u32 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint32x4_t -foo2 (uint32x4_t inactive, mve_pred16_t p) -{ - return vidupq_m (inactive, 1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c index 75695304c65..e1d45b3b114 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c @@ -42,24 +42,8 @@ foo1 (uint8x16_t inactive, uint32_t *a, mve_pred16_t p) return vidupq_m (inactive, a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vidupt.u8 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint8x16_t -foo2 (uint8x16_t inactive, mve_pred16_t p) -{ - return vidupq_m (inactive, 1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c index 83d9cc2a563..80cc9a08c6f 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c @@ -34,20 +34,8 @@ foo1 (uint32_t *a) return vidupq_u16 (a, 1); } -/* -**foo2: -** ... -** vidup.u16 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint16x8_t -foo2 () -{ - return vidupq_u16 (1, 1); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c index d73face505d..2dc77c14363 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c @@ -34,20 +34,8 @@ foo1 (uint32_t *a) return vidupq_u32 (a, 1); } -/* -**foo2: -** ... -** vidup.u32 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint32x4_t -foo2 () -{ - return vidupq_u32 (1, 1); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c index 75187b0eb25..87068e4e1d3 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c @@ -34,20 +34,8 @@ foo1 (uint32_t *a) return vidupq_u8 (a, 1); } -/* -**foo2: -** ... -** vidup.u8 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint8x16_t -foo2 () -{ - return vidupq_u8 (1, 1); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c index 31ddde4bd3a..7524780d19e 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c @@ -42,24 +42,8 @@ foo1 (uint32_t *a, mve_pred16_t p) return vidupq_x_u16 (a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vidupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint16x8_t -foo2 (mve_pred16_t p) -{ - return vidupq_x_u16 (1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c index c8193465a72..0d05657b886 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c @@ -42,24 +42,8 @@ foo1 (uint32_t *a, mve_pred16_t p) return vidupq_x_u32 (a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vidupt.u32 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint32x4_t -foo2 (mve_pred16_t p) -{ - return vidupq_x_u32 (1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c index f7a628990c9..e2b077ff974 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c @@ -42,24 +42,8 @@ foo1 (uint32_t *a, mve_pred16_t p) return vidupq_x_u8 (a, 1, p); } -/* -**foo2: -** ... -** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) -** ... -** vpst(?: @.*|) -** ... -** vidupt.u8 q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?: @.*|) -** ... -*/ -uint8x16_t -foo2 (mve_pred16_t p) -{ - return vidupq_x_u8 (1, 1, p); -} - #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ From patchwork Wed Sep 4 13:26:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980810 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=EJxFrwv/; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNhr63zgz1yg3 for ; Wed, 4 Sep 2024 23:32:44 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2EC08385B529 for ; Wed, 4 Sep 2024 13:32:42 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc34.google.com (mail-oo1-xc34.google.com [IPv6:2607:f8b0:4864:20::c34]) by sourceware.org (Postfix) with ESMTPS id 532CE3861034 for ; Wed, 4 Sep 2024 13:27:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 532CE3861034 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 532CE3861034 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c34 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456464; cv=none; b=LmA2zuP/zzLf52PKdOvhzJHbRZMNGg79TIJDJwOSHt3L+UXOmAFqlsu2oiYCM8mt9+ciC4hfMaUzSyKN4vIPNpFYHJ6erJXofZGb6YJvh5neXZmV2pChwOG9hfaAIjTKkhBv/S8HRrRpsmOjabLmXiN0x+rCM6KlJmwK9uVbedg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456464; c=relaxed/simple; bh=4/9zTQqRf0nFeRFIpbi6/ccIJJNavsq8tqnQzmKwQJM=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=msEXyQ7BEUAvrjrC+Fg2FQ7pKKwfbQU4CtftoOQzOZEBgv0PWcp4BeBTYsGyQj1iO/uCT2+etMuzfCveUzIHF16zpkhmZUnVMCga1xwet8XGdfZVF59/fnOrkTx60s/rzATG2RsjKsNlfTuMCipqi6+9il07VMZsA2TxXHAasOk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc34.google.com with SMTP id 006d021491bc7-5df96759e37so4297822eaf.2 for ; Wed, 04 Sep 2024 06:27:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456459; x=1726061259; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pXnt9bYPAVJabztx6DbaAWL2H5eEnOEJbwc4JmPk+vY=; b=EJxFrwv/Tq8ZqFGQi/gxNkTNm3sFbounWsGdKZIdcsqJjTfw5fdLSASoTQbJpv2BHg 4jsQwoYgLSxz25WwzuHJh98JgvlzdikUMwuk9drVDD++KL8Kerqvw4avdkDF2JwYswAy Gf0ZTMeN65VsYkXQXNBJVpff0Ane+IfbyAwfSYq1EFnGHVgygxZDMHaVAkyxKxFSNZM2 NCDDkpFfV3K0Ln49CC6eElpVctPRP0seBXncxZTRFCcRpY5ybhuJOvMzwvi8tzGFWeBJ C/E85KHyp2UwJBLYyrmDIgVgOaKgKnAm4rbqzOyel68HRQsu3ZZu1wCo5WILwfkrmka3 u3uw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456459; x=1726061259; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pXnt9bYPAVJabztx6DbaAWL2H5eEnOEJbwc4JmPk+vY=; b=rn4Rlyw6OLLZ06HWOMLaEm9Xx+3tkS+7OelXF/GvLLcG0zPsdov709GQqIGp+o1axc u15IAzv1pmXoUy6BTKGZgeD9D7Ey/CYPevsgFitr+SZG8NyYrNjjf2swATFHcRhBDEUh ZZLJQMi59e+1EY5lJ3Al7UCjt4DGGRCc58Z7fQzo6lSW0av1GyX1WTTDkQxWZZE0CeSy BcvKWQzk4Avam+px0LDeLN+x7s4V/5jYMyeNExjglLMfQfQf16PJt1mwZIGsvR98Opir WzhYz69Q0BLD7Em45BPbJlWKBQHZbDb1P+AodgS9+XkSowVEU+fFRpumE4I2nyPpf3Z6 YV3A== X-Gm-Message-State: AOJu0YxLKEHi/IbjP/J7krXGMJr1MEyEH+NO5qs6HJNG4YQuJ30v6oD1 zIbBbZLa0vHut9Up4WrS8UUcl1/3TBx1z0bHfVlZqV4vrYTT2BFpNwbFt11xo0OAQ2zy/DiejMH eGwIVkw== X-Google-Smtp-Source: AGHT+IGVU9uKVPn187ZHVTsRz7e9arNhEbLqJLkr8uiSJ0YWSX5GLeRhwX8cFfwGkaTd3yY/gAj/Dw== X-Received: by 2002:a05:6820:1c9b:b0:5da:9204:6716 with SMTP id 006d021491bc7-5dfacfdff85mr19915927eaf.7.1725456459061; Wed, 04 Sep 2024 06:27:39 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:38 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 21/36] arm: [MVE intrinsics] remove v[id]dup expanders Date: Wed, 4 Sep 2024 13:26:35 +0000 Message-Id: <20240904132650.2720446-22-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org We use code_for_mve_q_u_insn, rather than the expanders used by the previous implementation, so we can remove the expanders and their declaration as builtins. 2024-08-21 Christophe Lyon gcc/ * config/arm/arm_mve_builtins.def (vddupq_n_u, vidupq_n_u) (vddupq_m_n_u, vidupq_m_n_u): Delete. * config/arm/mve.md (mve_vidupq_n_u, mve_vidupq_m_n_u) (mve_vddupq_n_u, mve_vddupq_m_n_u): Delete. --- gcc/config/arm/arm_mve_builtins.def | 4 -- gcc/config/arm/mve.md | 73 ----------------------------- 2 files changed, 77 deletions(-) diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index f141aab816c..7e88db4e4c3 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -805,10 +805,6 @@ VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, viwdupq_m_wb_u, v16qi, v8hi, v4si VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, vdwdupq_m_wb_u, v16qi, v8hi, v4si) VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, viwdupq_m_n_u, v16qi, v8hi, v4si) VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, vdwdupq_m_n_u, v16qi, v8hi, v4si) -VAR3 (BINOP_UNONE_UNONE_IMM, vddupq_n_u, v16qi, v8hi, v4si) -VAR3 (BINOP_UNONE_UNONE_IMM, vidupq_n_u, v16qi, v8hi, v4si) -VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vddupq_m_n_u, v16qi, v8hi, v4si) -VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vidupq_m_n_u, v16qi, v8hi, v4si) VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vdwdupq_n_u, v16qi, v4si, v8hi) VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, viwdupq_n_u, v16qi, v4si, v8hi) VAR1 (STRSBWBU, vstrwq_scatter_base_wb_u, v4si) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 36117303fd6..3477bbdda7b 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -5088,22 +5088,6 @@ (define_insn "mve_vstrwq_scatter_shifted_offset_v4si_insn" (set_attr "length" "4")]) ;; -;; [vidupq_n_u]) -;; -(define_expand "mve_vidupq_n_u" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:SI 1 "s_register_operand") - (match_operand:SI 2 "mve_imm_selective_upto_8")] - "TARGET_HAVE_MVE" -{ - rtx temp = gen_reg_rtx (SImode); - emit_move_insn (temp, operands[1]); - rtx inc = gen_int_mode (INTVAL(operands[2]) * , SImode); - emit_insn (gen_mve_vidupq_u_insn (operands[0], temp, operands[1], - operands[2], inc)); - DONE; -}) - ;; ;; [vddupq_u_insn, vidupq_u_insn] ;; @@ -5118,26 +5102,6 @@ (define_insn "@mve_q_u_insn" "TARGET_HAVE_MVE" ".u%#\t%q0, %1, %3") -;; -;; [vidupq_m_n_u]) -;; -(define_expand "mve_vidupq_m_n_u" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_selective_upto_8") - (match_operand: 4 "vpr_register_operand")] - "TARGET_HAVE_MVE" -{ - rtx temp = gen_reg_rtx (SImode); - emit_move_insn (temp, operands[2]); - rtx inc = gen_int_mode (INTVAL(operands[3]) * , SImode); - emit_insn (gen_mve_vidupq_m_wb_u_insn(operands[0], operands[1], temp, - operands[2], operands[3], - operands[4], inc)); - DONE; -}) - ;; ;; [vddupq_m_wb_u_insn, vidupq_m_wb_u_insn] ;; @@ -5156,43 +5120,6 @@ (define_insn "@mve_q_m_wb_u_insn" [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_u_insn")) (set_attr "length""8")]) -;; -;; [vddupq_n_u]) -;; -(define_expand "mve_vddupq_n_u" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:SI 1 "s_register_operand") - (match_operand:SI 2 "mve_imm_selective_upto_8")] - "TARGET_HAVE_MVE" -{ - rtx temp = gen_reg_rtx (SImode); - emit_move_insn (temp, operands[1]); - rtx inc = gen_int_mode (INTVAL(operands[2]) * , SImode); - emit_insn (gen_mve_vddupq_u_insn (operands[0], temp, operands[1], - operands[2], inc)); - DONE; -}) - -;; -;; [vddupq_m_n_u]) -;; -(define_expand "mve_vddupq_m_n_u" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_selective_upto_8") - (match_operand: 4 "vpr_register_operand")] - "TARGET_HAVE_MVE" -{ - rtx temp = gen_reg_rtx (SImode); - emit_move_insn (temp, operands[2]); - rtx inc = gen_int_mode (INTVAL(operands[3]) * , SImode); - emit_insn (gen_mve_vddupq_m_wb_u_insn(operands[0], operands[1], temp, - operands[2], operands[3], - operands[4], inc)); - DONE; -}) - ;; ;; [vdwdupq_n_u]) ;; From patchwork Wed Sep 4 13:26:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980808 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=pijqZtsD; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNhJ5NCdz1yg3 for ; Wed, 4 Sep 2024 23:32:16 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 93CF33864835 for ; Wed, 4 Sep 2024 13:32:14 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc34.google.com (mail-oo1-xc34.google.com [IPv6:2607:f8b0:4864:20::c34]) by sourceware.org (Postfix) with ESMTPS id 807ED385F028 for ; Wed, 4 Sep 2024 13:27:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 807ED385F028 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 807ED385F028 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c34 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456464; cv=none; b=mpXDNIoo68igtrCJcmVUACICIRsSmGaor/JKHGiYe193j/mE7L5F6j5eOunN4FXeALikAEcSQC4VPSbmJgfx2srNKjKKe1xQpxdtPdPTrun4vyTPb3hU8NIYBPNkOW2JBSJzb66BWCuZlYFrEU2CX18TxJSvfTy68FAy2GRlXE8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456464; c=relaxed/simple; bh=RWSrh1cC5VGo1YAxBs7oPHVpUOr4ghDmZTH08aJhPbg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=V/3GgPqbAHN2p3PrIUz7M4edQFWo19bUG1yudGnfa/6bFR1kpD4/ii+ISLwNvNiQ0y5n72mqGn9TZHEUyT5Bsim20L9PEEwsm+FcwDglV7E4U7bgWMifVvPiGbRDjZnRTmRXDolYEbBPkRsSccJWHpsujCOJpRPNRwAphUHkaXk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc34.google.com with SMTP id 006d021491bc7-5da6865312eso4158988eaf.3 for ; Wed, 04 Sep 2024 06:27:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456460; x=1726061260; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=i3DaR+0yjaE+U9eesfKNu1o0srroBt4hQa1k1QjByGg=; b=pijqZtsDF0qO6iI7bh3FvxsZ1ewvjPWWNVBbjmwT6AwFdRA4dw2CzhSrhiDQlCPwRk Zw9dGLED/wJ74m+uIMR8uOp6czAKfl3mI8OHQT42WmCIkrufaEegcZUYGaFxakRmKhK3 JsVfUkpzcSIETgNtPfQl3fs9X1lwS7Q5S2VV67PwWkllk620k7kIzisKfTEme/hfoo4R KzxwImR4SwSD4BTS5OO419TOmBoWMTtTNg8C8+sy07hPqaAsIpZnk+Ze1Id7W4CH7QX0 D5jAy17IewLg/OaEAVhTdGgyHEgFvuwJ3/0ffHqRRDaoUaRzYWBIrsOOopoTuw7RRXEM sW8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456460; x=1726061260; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=i3DaR+0yjaE+U9eesfKNu1o0srroBt4hQa1k1QjByGg=; b=IzdO1IfHo9MOFmpnND3XfwZZCBe7PmITmwhBm89cGlt15FKQq6FNsayjIXbXCrsQSk upZ/5mcIHKOmQ3S1XnDjvvNblneYUUQ1EJRXfKgEcS2Fo+wqBt53uXoVb0LEk11PNa9e IUbIUkxAB7nL9fJOqNN7xZPA31f+t7G3IUirvBFmJfJSy75J04GkdKWlHsyeCDSkaLiL EuuVZdYLJoUWtZO2KeUu6Q0z0GLF35/J0ZA51MUv5TP8copbNUIvxjM+OvODtsqJk/EG ujjJSu96VSma91ilpG3Unr/h3gLaE8TJj9W6YPltLehavy3Hy5EDGsC5O6wvdj6tDMQ0 0nsA== X-Gm-Message-State: AOJu0YwlGXksKtfHv8GXJxZ9qeVwdTOBQW0IUNCAloopKw0UwJlFDTso 16ghQ8m8U07DsdZ+XMFJQiUsUE4kXU/G2GEm3TG1Eb0AvvexAlA9KQMy76LMiyP/YWUFbN6KtZr tXknWFw== X-Google-Smtp-Source: AGHT+IGcpf13mH+1r8d9g1yUgcmIdVK9aQd9NANWOlZVdnZii+5KNbqlCDtImkNXNKLkOk0WZDHZTg== X-Received: by 2002:a05:6820:278b:b0:5ba:ec8b:44b5 with SMTP id 006d021491bc7-5dfacef8751mr15974503eaf.3.1725456460316; Wed, 04 Sep 2024 06:27:40 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:39 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 22/36] arm: [MVE intrinsics] fix checks of immediate arguments Date: Wed, 4 Sep 2024 13:26:36 +0000 Message-Id: <20240904132650.2720446-23-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org As discussed in [1], it is better to use "su64" for immediates in intrinsics signatures in order to provide better diagnostics (erroneous constants are not truncated for instance). This patch thus uses su64 instead of ss32 in binary_lshift_unsigned, binary_rshift_narrow, binary_rshift_narrow_unsigned, ternary_lshift, ternary_rshift. In addition, we fix cases where we called require_integer_immediate whereas we just want to check that the argument is a scalar, and thus use require_scalar_type in binary_acca_int32, binary_acca_int64, unary_int32_acc. Finally, in binary_lshift_unsigned we just want to check that 'imm' is an immediate, not the optional predicates. [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660262.html 2024-08-21 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (binary_acca_int32): Fix check of scalar argument. (binary_acca_int64): Likewise. (binary_lshift_unsigned): Likewise. (binary_rshift_narrow): Likewise. (binary_rshift_narrow_unsigned): Likewise. (ternary_lshift): Likewise. (ternary_rshift): Likewise. (unary_int32_acc): Likewise. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 47 +++++++++++++++-------- 1 file changed, 31 insertions(+), 16 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index 971e86a2727..a1d2e243128 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -477,18 +477,23 @@ struct binary_acca_int32_def : public overloaded_base<0> { unsigned int i, nargs; type_suffix_index type; + const char *first_type_name; + if (!r.check_gp_argument (3, i, nargs) || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES) return error_mark_node; + first_type_name = (type_suffixes[type].unsigned_p + ? "uint32_t" + : "int32_t"); + if (!r.require_scalar_type (0, first_type_name)) + return error_mark_node; + unsigned int last_arg = i + 1; for (i = 1; i < last_arg; i++) if (!r.require_matching_vector_type (i, type)) return error_mark_node; - if (!r.require_integer_immediate (0)) - return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); } }; @@ -514,18 +519,24 @@ struct binary_acca_int64_def : public overloaded_base<0> { unsigned int i, nargs; type_suffix_index type; + const char *first_type_name; + if (!r.check_gp_argument (3, i, nargs) || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES) return error_mark_node; + + first_type_name = (type_suffixes[type].unsigned_p + ? "uint64_t" + : "int64_t"); + if (!r.require_scalar_type (0, first_type_name)) + return error_mark_node; + unsigned int last_arg = i + 1; for (i = 1; i < last_arg; i++) if (!r.require_matching_vector_type (i, type)) return error_mark_node; - if (!r.require_integer_immediate (0)) - return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); } }; @@ -613,7 +624,7 @@ struct binary_lshift_unsigned_def : public overloaded_base<0> bool preserve_user_namespace) const override { b.add_overloaded_functions (group, MODE_n, preserve_user_namespace); - build_all (b, "vu0,vs0,ss32", group, MODE_n, preserve_user_namespace); + build_all (b, "vu0,vs0,su64", group, MODE_n, preserve_user_namespace); } tree @@ -622,6 +633,7 @@ struct binary_lshift_unsigned_def : public overloaded_base<0> unsigned int i, nargs; type_suffix_index type; if (!r.check_gp_argument (2, i, nargs) + || !r.require_integer_immediate (i) || (type = r.infer_vector_type (i-1)) == NUM_TYPE_SUFFIXES) return error_mark_node; @@ -636,10 +648,6 @@ struct binary_lshift_unsigned_def : public overloaded_base<0> return error_mark_node; } - for (; i < nargs; ++i) - if (!r.require_integer_immediate (i)) - return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); } @@ -1097,7 +1105,7 @@ struct binary_rshift_narrow_def : public overloaded_base<0> bool preserve_user_namespace) const override { b.add_overloaded_functions (group, MODE_n, preserve_user_namespace); - build_all (b, "vh0,vh0,v0,ss32", group, MODE_n, preserve_user_namespace); + build_all (b, "vh0,vh0,v0,su64", group, MODE_n, preserve_user_namespace); } tree @@ -1144,7 +1152,7 @@ struct binary_rshift_narrow_unsigned_def : public overloaded_base<0> bool preserve_user_namespace) const override { b.add_overloaded_functions (group, MODE_n, preserve_user_namespace); - build_all (b, "vhu0,vhu0,v0,ss32", group, MODE_n, preserve_user_namespace); + build_all (b, "vhu0,vhu0,v0,su64", group, MODE_n, preserve_user_namespace); } tree @@ -1588,7 +1596,7 @@ struct ternary_lshift_def : public overloaded_base<0> bool preserve_user_namespace) const override { b.add_overloaded_functions (group, MODE_n, preserve_user_namespace); - build_all (b, "v0,v0,v0,ss32", group, MODE_n, preserve_user_namespace); + build_all (b, "v0,v0,v0,su64", group, MODE_n, preserve_user_namespace); } tree @@ -1683,7 +1691,7 @@ struct ternary_rshift_def : public overloaded_base<0> bool preserve_user_namespace) const override { b.add_overloaded_functions (group, MODE_n, preserve_user_namespace); - build_all (b, "v0,v0,v0,ss32", group, MODE_n, preserve_user_namespace); + build_all (b, "v0,v0,v0,su64", group, MODE_n, preserve_user_namespace); } tree @@ -1838,11 +1846,18 @@ struct unary_int32_acc_def : public overloaded_base<0> { unsigned int i, nargs; type_suffix_index type; + const char *first_type_name; + if (!r.check_gp_argument (2, i, nargs) - || !r.require_integer_immediate (0) || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES) return error_mark_node; + first_type_name = (type_suffixes[type].unsigned_p + ? "uint32_t" + : "int32_t"); + if (!r.require_scalar_type (0, first_type_name)) + return error_mark_node; + return r.resolve_to (r.mode_suffix_id, type); } }; From patchwork Wed Sep 4 13:26:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980820 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=y6jfVv3E; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNmJ4gJMz1yXY for ; Wed, 4 Sep 2024 23:35:44 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EB3E43871011 for ; Wed, 4 Sep 2024 13:35:41 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by sourceware.org (Postfix) with ESMTPS id E8233386103C for ; Wed, 4 Sep 2024 13:27:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E8233386103C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E8233386103C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456465; cv=none; b=JnOUw5h55bFNCvTAl/CE3F3X7dFGCB1i1o+eyU/aJlAXsSehgAesktGWemPS46Fms9QMU1ZjbrmK6GcJvTghnIPpipeVGQK3ycciXxoes3lAzbpQdRHa8NoqTOk59C6oUeyuRdi54qiE0fwh4+bl5dlgmYbmZ1Z/TeyL7v6zoX0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456465; c=relaxed/simple; bh=2Dy1KkJGN0XJxPHnM6qxqyfSZIGzX4VL5g67z2G6GgU=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=I+HJwQtgqAnUd4Hz56FvtWR+n41XAGW5w/i+TWxE4JusKWg5pKyu2uhUQZLzyEhvNv1x0xG+XLlFr7gswMvivjKC1kRVo+t6X3iTR1Hoqxu962Z/5HkppuTL43FwwzpC5IKn6dJEhvFRajLftoK5yn/p4ubii9q+xDmcpcebhiA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2c.google.com with SMTP id 006d021491bc7-5dc93fa5639so3917351eaf.1 for ; Wed, 04 Sep 2024 06:27:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456461; x=1726061261; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GO0d/XZWT++5hx14Wf+pZHwRAYhoZxE/Xo8DA8GLKsc=; b=y6jfVv3EvD7So2glFFdtzAxEFJLNaBD2cvshQYoU8IRdUYOhl/NBm6Sm/elNI8MiRx vjjBK9yJQ6/E9JrEMyefV8VEMyu40S56JZmiRdWQnCMAElUEcDGq9qKKT0r/pRooTu6P fMePuX7vhxPI4Ro7xtxraxptFkUmlILVuklDvqhNkZnSmIWiZrmT5s26UsvdE77Ad1V+ 6zmcAzJgyZv7A//r4Ig3yPgJHL/d44K+U6YJL7b1uzweU9HeAgD/9SF7u8cYprOcbXcd MoanXUwyMFumcK0ergZvAB/LCqk/YWFKg1oVtKl80g95hIVANOuuqhL65SsFkfs6r3l+ mtxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456461; x=1726061261; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GO0d/XZWT++5hx14Wf+pZHwRAYhoZxE/Xo8DA8GLKsc=; b=bNPaOmngdRTBvA3CoHcMVMmqL8+FniUv/kiiEGVZ1vbylOQ7vvu94NdZRRxKc9hh5i kIBtkvIZwywW3aMs7dUi/oaLHia0KNd04p/ZXYAztL8Wg1Sd0W8zEzyAXx+Nb5OcUFaN k5MreY9GKqUEDHa9+p2CoSyQL4+CJmKK9Ago7P/TzIaRNd7FKO9HT/48Npy5RGrsKQ3b OljhSe/gjzYqCWg16zKWgMY+yiKf0p55FA8QRBQP7vO5ipcGRhbv6XYBmORFOT7nYwyq LFn7kaNJTADoVUC70evRNh5nH1c5Ee0gccKMu4qcMbVT4C6bNgSvyA83PgyQgcBD+pSp h7Jg== X-Gm-Message-State: AOJu0Yznj4wIiJvceud2vPQLXYgSf1cC/Nnx+hZuns2XtHsLp7JXG4KY v+36nvfUGbLn60ypSeBRka8e313G2+X1g76JOVI09/z6AHRaYSxPjSN6UEGh/rtZWZk+BSxmAUk unC3kbg== X-Google-Smtp-Source: AGHT+IFCJNPxA0ymgc0crc+lDlCY1QOB8Hp3b0+kuUENYA4vQyrKLs5KqDKUcIFiHDf1YaUt3DXpXw== X-Received: by 2002:a05:6820:168d:b0:5da:9bde:1c0b with SMTP id 006d021491bc7-5dfacc16c7cmr19746949eaf.0.1725456460886; Wed, 04 Sep 2024 06:27:40 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:40 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 23/36] arm: [MVE intrinsics] factorize vdwdup viwdup Date: Wed, 4 Sep 2024 13:26:37 +0000 Message-Id: <20240904132650.2720446-24-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Factorize vdwdup and viwdup so that they use the same parameterized names. Like with vddup and vidup, we do not bother with the corresponding expanders, as we stop using them in a subsequent patch. The patch also adds the missing attributes to vdwdupq_wb_u_insn and viwdupq_wb_u_insn patterns. 2024-08-21 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VIWDUPQ, VDWDUPQ, VIWDUPQ_M, VDWDUPQ_M. (VIDWDUPQ): New iterator. (VIDWDUPQ_M): New iterator. * config/arm/mve.md (mve_vdwdupq_wb_u_insn) (mve_viwdupq_wb_u_insn): Merge into ... (@mve_q_wb_u_insn): ... this. Add missing mve_unpredicated_insn and mve_move attributes. (mve_vdwdupq_m_wb_u_insn, mve_viwdupq_m_wb_u_insn): Merge into ... (@mve_q_m_wb_u_insn): ... this. --- gcc/config/arm/iterators.md | 4 +++ gcc/config/arm/mve.md | 68 +++++++------------------------------ 2 files changed, 17 insertions(+), 55 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index c0299117f26..2fb3b25040f 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -1009,6 +1009,8 @@ (define_int_attr mve_insn [ (VHSUBQ_S "vhsub") (VHSUBQ_U "vhsub") (VIDUPQ "vidup") (VDDUPQ "vddup") (VIDUPQ_M "vidup") (VDDUPQ_M "vddup") + (VIWDUPQ "viwdup") (VDWDUPQ "vdwdup") + (VIWDUPQ_M "viwdup") (VDWDUPQ_M "vdwdup") (VMAXAQ_M_S "vmaxa") (VMAXAQ_S "vmaxa") (VMAXAVQ_P_S "vmaxav") @@ -2968,6 +2970,8 @@ (define_int_iterator VCVTxQ [VCVTAQ_S VCVTAQ_U VCVTMQ_S VCVTMQ_U VCVTNQ_S VCVTNQ (define_int_iterator VCVTxQ_M [VCVTAQ_M_S VCVTAQ_M_U VCVTMQ_M_S VCVTMQ_M_U VCVTNQ_M_S VCVTNQ_M_U VCVTPQ_M_S VCVTPQ_M_U]) (define_int_iterator VIDDUPQ [VIDUPQ VDDUPQ]) (define_int_iterator VIDDUPQ_M [VIDUPQ_M VDDUPQ_M]) +(define_int_iterator VIDWDUPQ [VIWDUPQ VDWDUPQ]) +(define_int_iterator VIDWDUPQ_M [VIWDUPQ_M VDWDUPQ_M]) (define_int_iterator DLSTP [DLSTP8 DLSTP16 DLSTP32 DLSTP64]) (define_int_iterator LETP [LETP8 LETP16 LETP32 diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 3477bbdda7b..be3be67a144 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -5156,22 +5156,23 @@ (define_expand "mve_vdwdupq_wb_u" }) ;; -;; [vdwdupq_wb_u_insn]) +;; [vdwdupq_wb_u_insn, viwdupq_wb_u_insn] ;; -(define_insn "mve_vdwdupq_wb_u_insn" +(define_insn "@mve_q_wb_u_insn" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") (subreg:SI (match_operand:DI 3 "s_register_operand" "r") 4) (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg")] - VDWDUPQ)) + VIDWDUPQ)) (set (match_operand:SI 1 "s_register_operand" "=Te") (unspec:SI [(match_dup 2) (subreg:SI (match_dup 3) 4) (match_dup 4)] - VDWDUPQ))] + VIDWDUPQ))] "TARGET_HAVE_MVE" - "vdwdup.u%#\t%q0, %2, %R3, %4" -) + ".u%#\t%q0, %2, %R3, %4" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_wb_u_insn")) + (set_attr "type" "mve_move")]) ;; ;; [vdwdupq_m_n_u]) @@ -5214,27 +5215,27 @@ (define_expand "mve_vdwdupq_m_wb_u" }) ;; -;; [vdwdupq_m_wb_u_insn]) +;; [vdwdupq_m_wb_u_insn, viwdupq_m_wb_u_insn] ;; -(define_insn "mve_vdwdupq_m_wb_u_insn" +(define_insn "@mve_q_m_wb_u_insn" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "0") (match_operand:SI 3 "s_register_operand" "1") (subreg:SI (match_operand:DI 4 "s_register_operand" "r") 4) (match_operand:SI 5 "mve_imm_selective_upto_8" "Rg") (match_operand: 6 "vpr_register_operand" "Up")] - VDWDUPQ_M)) + VIDWDUPQ_M)) (set (match_operand:SI 1 "s_register_operand" "=Te") (unspec:SI [(match_dup 2) (match_dup 3) (subreg:SI (match_dup 4) 4) (match_dup 5) (match_dup 6)] - VDWDUPQ_M)) + VIDWDUPQ_M)) ] "TARGET_HAVE_MVE" - "vpst\;vdwdupt.u%#\t%q2, %3, %R4, %5" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vdwdupq_wb_u_insn")) + "vpst\;t.u%#\t%q2, %3, %R4, %5" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_wb_u_insn")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -5273,24 +5274,6 @@ (define_expand "mve_viwdupq_wb_u" DONE; }) -;; -;; [viwdupq_wb_u_insn]) -;; -(define_insn "mve_viwdupq_wb_u_insn" - [(set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand:SI 2 "s_register_operand" "1") - (subreg:SI (match_operand:DI 3 "s_register_operand" "r") 4) - (match_operand:SI 4 "mve_imm_selective_upto_8" "Rg")] - VIWDUPQ)) - (set (match_operand:SI 1 "s_register_operand" "=Te") - (unspec:SI [(match_dup 2) - (subreg:SI (match_dup 3) 4) - (match_dup 4)] - VIWDUPQ))] - "TARGET_HAVE_MVE" - "viwdup.u%#\t%q0, %2, %R3, %4" -) - ;; ;; [viwdupq_m_n_u]) ;; @@ -5331,31 +5314,6 @@ (define_expand "mve_viwdupq_m_wb_u" DONE; }) -;; -;; [viwdupq_m_wb_u_insn]) -;; -(define_insn "mve_viwdupq_m_wb_u_insn" - [(set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "0") - (match_operand:SI 3 "s_register_operand" "1") - (subreg:SI (match_operand:DI 4 "s_register_operand" "r") 4) - (match_operand:SI 5 "mve_imm_selective_upto_8" "Rg") - (match_operand: 6 "vpr_register_operand" "Up")] - VIWDUPQ_M)) - (set (match_operand:SI 1 "s_register_operand" "=Te") - (unspec:SI [(match_dup 2) - (match_dup 3) - (subreg:SI (match_dup 4) 4) - (match_dup 5) - (match_dup 6)] - VIWDUPQ_M)) - ] - "TARGET_HAVE_MVE" - "vpst\;\tviwdupt.u%#\t%q2, %3, %R4, %5" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_viwdupq_wb_u_insn")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - ;; ;; [vstrwq_scatter_base_wb_s vstrwq_scatter_base_wb_u] ;; From patchwork Wed Sep 4 13:26:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980818 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=KuTaDsk2; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNlw3L1kz1yXY for ; Wed, 4 Sep 2024 23:35:24 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 50B68386EC02 for ; Wed, 4 Sep 2024 13:35:22 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2b.google.com (mail-oo1-xc2b.google.com [IPv6:2607:f8b0:4864:20::c2b]) by sourceware.org (Postfix) with ESMTPS id 4402F385020D for ; Wed, 4 Sep 2024 13:27:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4402F385020D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 4402F385020D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456465; cv=none; b=O+T3GkP0UwpUhJsTpjQBL8r4Zd+oNcrtOylFaj6PXXhliztJ3lTOnZdcgvqqwXeLof0798v2X7IvXgYIrMrNG/fHsORJCPfZiP2kNoaeDVez3jeqWXBREqpmyPdnsEWXMNjm8C3HWYEBkzK1WwEoY1klEf7zXrxyVPcYkQ8mGNQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456465; c=relaxed/simple; bh=Vy4ZI9CEImiJeS05woRZKVMQsAgVue/IWCv0jcKgNH0=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=IXCLnuYiFHDKPVChaS2a+0ufkCXdSxX6zAnTHq+ykdqpmlczp/2MwVTobJEL96LaWNwVL59Gd+Uz+GVDmModxRWBpptzgsJ1wH7ey6MGsJIKVnrC0falHXyYakvkfuOwC/eUa3IOOoq1JQ6BPLQvsfR6MM6KcrO5xjFWvait+w0= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2b.google.com with SMTP id 006d021491bc7-5de8a3f1cc6so360491eaf.1 for ; Wed, 04 Sep 2024 06:27:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456462; x=1726061262; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=g+3LCwSdhBetUIcbyx1hnl5ttonqbScOUBo403cdi4Q=; b=KuTaDsk2j15xt8OaSbxFHRXiDH251ajTjcWanzrZErggE6VnGw6xwp8Ojhk10wLXhU Bg+sFTjhKDnCuG8JFER6KdtZLDfElGKvTgUlz6lDrCPzXG4u63u+GrPIzIg4SZcpQIvR fEfMBEn5CybSM5x9Do3nY0gJ0Ag/2XtwM5QB6FHeKM6Cf3CbvyrhXzzi683St65F0yUL m3ISy0svxhfd/b+2RugW/dRQ8oN90pRBliuNNt1v2o8cwQoiMxx+GRRjYGGMJ6N6k3/h rG463cO2cOp/XIcUn+MBs+P8IoSgSISlmbAKzEffPSP42PqxwmXpfkrInmfGs4+DqS31 C9xA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456462; x=1726061262; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=g+3LCwSdhBetUIcbyx1hnl5ttonqbScOUBo403cdi4Q=; b=ZAjyBWbVpDaWdztIHgprt1YnkK0oEZ0hZ8YeNt3KrZ6mMIZMAZxNbUHxTEDJvd+wmj wKv7j9nBmHAe1whlWUrSRMPkzoXMC07vedw2kEq4p+c7HgPEs+ZXMvzkiPMG59A3w35o yoeFK5nzBPARKA5D1CqLiqmoMlnbXlOQo0751tHvOjf5wMS+bprhf6o9z0AHtkviM3U2 GIJHHGwehsYp9xuIskA+WGcDpXRKw9nc+CCG3mykfFTCIkv+zbjrXh323rLTbaxn/9Ki pLuVUuIKUexUs9encYxbi1MVwZg0MIYhzOYPoJlB6Hywg+Gy54FM3r1/zNqw4dAImIo3 GOxQ== X-Gm-Message-State: AOJu0YxvnBOh+Pqx1msQ7Z+ydMmaeZNeA8OQkMZ1IwPCDsGXJkB9RyDY 5wRHoWT4r36XmHjKgfn6R6jpiHcWrJ32E3dlN5+EvPOkep5yywz7QPUMuDHNwe9BWit3lQHO4n5 XOrjLUw== X-Google-Smtp-Source: AGHT+IGl06qI0yhRvAOMGcfRc4t94itQNK7fR7mn4uAGhOAbRdTFk3Mt34+pFcgkR8HC9dyTA431Lw== X-Received: by 2002:a05:6820:61c:b0:5d5:d5e9:4e38 with SMTP id 006d021491bc7-5e18ec45f73mr1019955eaf.2.1725456462239; Wed, 04 Sep 2024 06:27:42 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:41 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 24/36] arm: [MVE intrinsics] add vidwdup shape Date: Wed, 4 Sep 2024 13:26:38 +0000 Message-Id: <20240904132650.2720446-25-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the vidwdup shape description for vdwdup and viwdup. It is very similar to viddup, but accounts for the additional 'wrap' scalar parameter. 2024-08-21 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vidwdup): New. * config/arm/arm-mve-builtins-shapes.h (vidwdup): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 88 +++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 1 + 2 files changed, 89 insertions(+) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index a1d2e243128..510f15ae73a 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -2291,6 +2291,94 @@ struct viddup_def : public overloaded_base<0> }; SHAPE (viddup) +/* _t vfoo[_n]_t0(uint32_t, uint32_t, const int) + _t vfoo[_wb]_t0(uint32_t *, uint32_t, const int) + + Shape for vector increment or decrement with wrap and duplicate operations + that take an integer or pointer to integer first argument, an integer second + argument and an immediate, and produce a vector. + + Check that 'imm' is one of 1, 2, 4 or 8. + + Example: vdwdupq. + uint8x16_t [__arm_]vdwdupq[_n]_u8(uint32_t a, uint32_t b, const int imm) + uint8x16_t [__arm_]vdwdupq[_wb]_u8(uint32_t *a, uint32_t b, const int imm) + uint8x16_t [__arm_]vdwdupq_m[_n_u8](uint8x16_t inactive, uint32_t a, uint32_t b, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vdwdupq_m[_wb_u8](uint8x16_t inactive, uint32_t *a, uint32_t b, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vdwdupq_x[_n]_u8(uint32_t a, uint32_t b, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vdwdupq_x[_wb]_u8(uint32_t *a, uint32_t b, const int imm, mve_pred16_t p) */ +struct vidwdup_def : public overloaded_base<0> +{ + bool + explicit_type_suffix_p (unsigned int i, enum predication_index pred, + enum mode_suffix_index, + type_suffix_info) const override + { + return ((i == 0) && (pred != PRED_m)); + } + + bool + skip_overload_p (enum predication_index, enum mode_suffix_index mode) const override + { + /* For MODE_wb, share the overloaded instance with MODE_n. */ + if (mode == MODE_wb) + return true; + + return false; + } + + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + build_all (b, "v0,su32,su32,su64", group, MODE_n, preserve_user_namespace); + build_all (b, "v0,as,su32,su64", group, MODE_wb, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index type_suffix = NUM_TYPE_SUFFIXES; + if (!r.check_gp_argument (3, i, nargs)) + return error_mark_node; + + type_suffix = r.type_suffix_ids[0]; + /* With PRED_m, ther is no type suffix, so infer it from the first (inactive) + argument. */ + if (type_suffix == NUM_TYPE_SUFFIXES) + type_suffix = r.infer_vector_type (0); + + unsigned int last_arg = i - 2; + /* Check that last_arg is either scalar or pointer. */ + if (!r.scalar_argument_p (last_arg)) + return error_mark_node; + + if (!r.scalar_argument_p (last_arg + 1)) + return error_mark_node; + + if (!r.require_integer_immediate (last_arg + 2)) + return error_mark_node; + + /* With MODE_n we expect a scalar, with MODE_wb we expect a pointer. */ + mode_suffix_index mode_suffix; + if (POINTER_TYPE_P (r.get_argument_type (last_arg))) + mode_suffix = MODE_wb; + else + mode_suffix = MODE_n; + + return r.resolve_to (mode_suffix, type_suffix); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_one_of (2, 1, 2, 4, 8); + } +}; +SHAPE (vidwdup) + /* _t vfoo[_t0](_t, _t, mve_pred16_t) i.e. a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 186287c1620..b3d08ab3866 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -83,6 +83,7 @@ namespace arm_mve extern const function_shape *const vcvt_f32_f16; extern const function_shape *const vcvtx; extern const function_shape *const viddup; + extern const function_shape *const vidwdup; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */ From patchwork Wed Sep 4 13:26:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980823 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=QN4aO3s0; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNpg2Ty3z1yXY for ; Wed, 4 Sep 2024 23:37:47 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 03EEE384A45C for ; Wed, 4 Sep 2024 13:37:45 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc31.google.com (mail-oo1-xc31.google.com [IPv6:2607:f8b0:4864:20::c31]) by sourceware.org (Postfix) with ESMTPS id 39848386103E for ; Wed, 4 Sep 2024 13:27:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 39848386103E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 39848386103E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c31 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; cv=none; b=ElVQXS3/w77KeFnWO4JJKfiamWZ3yxUa/TZYOtf0GFmBvw9lknsH4YipkQ6fz+wRThWNNVZ9W5kqMKgeuALWPuaOeepe29nNYmcadmS3+wsaRjUnjfx8fP04yrtYYpCGabmToiLAjDzKoswlX8SEluV5tpY6uO4txDFOHhEqGag= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; c=relaxed/simple; bh=PRQ7c6rDSQid3s7JsT5BDcOIVfthW3MuqoTZ5V9dAJI=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=PPygkDniUe7JR88kKSV1Jo8LefrxFlmVJmMz8nLI0UtRM7++mYrZuKrch+0wfFFiE08atnl9vNflsa+AbCrJzK828J+vUMi1ylLyfZLws56AclZrv+bqs7kXe0GLKjevtHwEECmZlEExkC1Bbgl0jMYPrngBu7suiB4Bc9NMJ78= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc31.google.com with SMTP id 006d021491bc7-5d5c7f24372so3984084eaf.0 for ; Wed, 04 Sep 2024 06:27:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456464; x=1726061264; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KcwdN7aU9Vjunm0+/4wy6MdbBz1IfqrgaezftJeLccM=; b=QN4aO3s0NWDPGdp3OglYsyAf1G+J1geagjXrF2jaCl3BSnKyrmfNh4WGmFhXn0Az8H JoZ4PGcFZOiT4l2GjE2Ux259k4ZyxZfHQzh/JHnZzdC02m9sR9QtResggZk4Kh0F432h GkzUL4gAp8Qu1fNcRJiqpC1jAVxlK4HZZdc3aE0CzDOsVkMNoRcZMbU5AQUKP0hMT9sk LpkqMHNHCJNyjRtZAeKTXDHAbXlf3KsRPExkp5pRQD9C4mgVKTGEJsS4UNmU8yJWDkiu BrM6qniQHiD530A5iiP1u7+7wYe00A7LBwRSEfAHhyYZdTW31ZPj8LU641VZBe0qa3G3 kUzQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456464; x=1726061264; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KcwdN7aU9Vjunm0+/4wy6MdbBz1IfqrgaezftJeLccM=; b=W8t5G8DBsdC1S7KlZL3+3q/p0Gubv5Lae+p5H9B1rGBanLUmTDt+RtIH6JbpkiMYxs jukMFuutTVnJ1uX4UB3xBB5HkWxBRh5H/uAdhuU3hSkcNpfM/CUxi15ZufR5TfVmnvn3 TNpzHgNwAmdKGUW9eJfY4O/0zPRWJkYr80M7aVFyKNhrWvt8h+C2uqH41xbbIno6CUBS aIi4ppEZT8Xe4K1G9tTLgGdL7sZDx705F0mG14P/U0m0+ifEGTMOP78i+3A7whVBaHMc pQLailI09ifNetrM6rGDe2X5tBOgzmTuzCUW6pQMsf7zSmEDPuQeQo3z9TtJEo4yEG2N kSEw== X-Gm-Message-State: AOJu0YzE5v2Z6YXEVZZWxTk9vw/mXET6Ay0mKUlBQS0fEMRtH9+uH+l2 SZWLiV0gN4NjeJPvpWBdpFapGy87Rp8+YrUIA0qXCKidW+PiEciWobDIRXN9Zs/G5ACIpGle5A7 gU5ogrw== X-Google-Smtp-Source: AGHT+IETU+70hZUHqH5dfpSFjYM7bqFnGdCZeWja3+YSr8OOOt76GJE6jRP9BvQINzDAHvozMrZwug== X-Received: by 2002:a05:6820:2292:b0:5c6:9293:ed8a with SMTP id 006d021491bc7-5dfacfb1542mr19720915eaf.6.1725456463435; Wed, 04 Sep 2024 06:27:43 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:42 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 25/36] arm: [MVE intrinsics] rework vdwdup viwdup Date: Wed, 4 Sep 2024 13:26:39 +0000 Message-Id: <20240904132650.2720446-26-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vdwdup and viwdup using the new MVE builtins framework. In order to share more code with viddup_impl, the patch swaps operands 1 and 2 in @mve_v[id]wdupq_m_wb_u_insn, so that the parameter order is similar to what @mve_v[id]dupq_m_wb_u_insn uses. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (viddup_impl): Add support for wrapping versions. (vdwdupq): New. (viwdupq): New. * config/arm/arm-mve-builtins-base.def (vdwdupq): New. (viwdupq): New. * config/arm/arm-mve-builtins-base.h (vdwdupq): New. (viwdupq): New. * config/arm/arm_mve.h (vdwdupq_m): Delete. (vdwdupq_u8): Delete. (vdwdupq_u32): Delete. (vdwdupq_u16): Delete. (viwdupq_m): Delete. (viwdupq_u8): Delete. (viwdupq_u32): Delete. (viwdupq_u16): Delete. (vdwdupq_x_u8): Delete. (vdwdupq_x_u16): Delete. (vdwdupq_x_u32): Delete. (viwdupq_x_u8): Delete. (viwdupq_x_u16): Delete. (viwdupq_x_u32): Delete. (vdwdupq_m_n_u8): Delete. (vdwdupq_m_n_u32): Delete. (vdwdupq_m_n_u16): Delete. (vdwdupq_m_wb_u8): Delete. (vdwdupq_m_wb_u32): Delete. (vdwdupq_m_wb_u16): Delete. (vdwdupq_n_u8): Delete. (vdwdupq_n_u32): Delete. (vdwdupq_n_u16): Delete. (vdwdupq_wb_u8): Delete. (vdwdupq_wb_u32): Delete. (vdwdupq_wb_u16): Delete. (viwdupq_m_n_u8): Delete. (viwdupq_m_n_u32): Delete. (viwdupq_m_n_u16): Delete. (viwdupq_m_wb_u8): Delete. (viwdupq_m_wb_u32): Delete. (viwdupq_m_wb_u16): Delete. (viwdupq_n_u8): Delete. (viwdupq_n_u32): Delete. (viwdupq_n_u16): Delete. (viwdupq_wb_u8): Delete. (viwdupq_wb_u32): Delete. (viwdupq_wb_u16): Delete. (vdwdupq_x_n_u8): Delete. (vdwdupq_x_n_u16): Delete. (vdwdupq_x_n_u32): Delete. (vdwdupq_x_wb_u8): Delete. (vdwdupq_x_wb_u16): Delete. (vdwdupq_x_wb_u32): Delete. (viwdupq_x_n_u8): Delete. (viwdupq_x_n_u16): Delete. (viwdupq_x_n_u32): Delete. (viwdupq_x_wb_u8): Delete. (viwdupq_x_wb_u16): Delete. (viwdupq_x_wb_u32): Delete. (__arm_vdwdupq_m_n_u8): Delete. (__arm_vdwdupq_m_n_u32): Delete. (__arm_vdwdupq_m_n_u16): Delete. (__arm_vdwdupq_m_wb_u8): Delete. (__arm_vdwdupq_m_wb_u32): Delete. (__arm_vdwdupq_m_wb_u16): Delete. (__arm_vdwdupq_n_u8): Delete. (__arm_vdwdupq_n_u32): Delete. (__arm_vdwdupq_n_u16): Delete. (__arm_vdwdupq_wb_u8): Delete. (__arm_vdwdupq_wb_u32): Delete. (__arm_vdwdupq_wb_u16): Delete. (__arm_viwdupq_m_n_u8): Delete. (__arm_viwdupq_m_n_u32): Delete. (__arm_viwdupq_m_n_u16): Delete. (__arm_viwdupq_m_wb_u8): Delete. (__arm_viwdupq_m_wb_u32): Delete. (__arm_viwdupq_m_wb_u16): Delete. (__arm_viwdupq_n_u8): Delete. (__arm_viwdupq_n_u32): Delete. (__arm_viwdupq_n_u16): Delete. (__arm_viwdupq_wb_u8): Delete. (__arm_viwdupq_wb_u32): Delete. (__arm_viwdupq_wb_u16): Delete. (__arm_vdwdupq_x_n_u8): Delete. (__arm_vdwdupq_x_n_u16): Delete. (__arm_vdwdupq_x_n_u32): Delete. (__arm_vdwdupq_x_wb_u8): Delete. (__arm_vdwdupq_x_wb_u16): Delete. (__arm_vdwdupq_x_wb_u32): Delete. (__arm_viwdupq_x_n_u8): Delete. (__arm_viwdupq_x_n_u16): Delete. (__arm_viwdupq_x_n_u32): Delete. (__arm_viwdupq_x_wb_u8): Delete. (__arm_viwdupq_x_wb_u16): Delete. (__arm_viwdupq_x_wb_u32): Delete. (__arm_vdwdupq_m): Delete. (__arm_vdwdupq_u8): Delete. (__arm_vdwdupq_u32): Delete. (__arm_vdwdupq_u16): Delete. (__arm_viwdupq_m): Delete. (__arm_viwdupq_u8): Delete. (__arm_viwdupq_u32): Delete. (__arm_viwdupq_u16): Delete. (__arm_vdwdupq_x_u8): Delete. (__arm_vdwdupq_x_u16): Delete. (__arm_vdwdupq_x_u32): Delete. (__arm_viwdupq_x_u8): Delete. (__arm_viwdupq_x_u16): Delete. (__arm_viwdupq_x_u32): Delete. * config/arm/mve.md (@mve_q_m_wb_u_insn): Swap operands 1 and 2. --- gcc/config/arm/arm-mve-builtins-base.cc | 62 +- gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 2 + gcc/config/arm/arm_mve.h | 714 ----------------------- gcc/config/arm/mve.md | 10 +- 5 files changed, 53 insertions(+), 737 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 3d8bcdabe24..eaf054d9823 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -354,16 +354,19 @@ public: vector mode associated with type suffix 0. We need this special case because in MODE_wb the builtins derefrence the first parameter and update its contents. We also have to insert the two additional parameters needed - by the builtins compared to the intrinsics. */ + by the builtins compared to the intrinsics. In wrapping mode, we have to + match the 'hack' to make sure the 'wrap' parameters is in odd register. */ class viddup_impl : public function_base { public: - CONSTEXPR viddup_impl (bool inc_dec) - : m_inc_dec (inc_dec) + CONSTEXPR viddup_impl (bool inc_dec, bool wrap) + : m_inc_dec (inc_dec), m_wrap (wrap) {} /* Increment (true) or decrement (false). */ bool m_inc_dec; + /* v[id]wdup (true) or v[id]dup (false). */ + bool m_wrap; unsigned int call_properties (const function_instance &fi) const override @@ -388,7 +391,6 @@ public: rtx insns, offset_ptr; rtx new_offset; int offset_arg_no; - rtx incr, total_incr; if (! e.type_suffix (0).integer_p) gcc_unreachable (); @@ -412,15 +414,29 @@ public: /* We have to shuffle parameters because the builtin needs additional arguments: - the updated "new_offset" - - total increment (incr * number of lanes) */ + - total increment (incr * number of lanes) in the non-wrapping case + - hack to pass wrap in the top end of DImode operand so that it is + actually in a odd register */ new_offset = gen_reg_rtx (SImode); e.args.quick_insert (offset_arg_no, new_offset); - incr = e.args[offset_arg_no + 2]; - total_incr = gen_int_mode (INTVAL (incr) - * GET_MODE_NUNITS (e.vector_mode (0)), - SImode); - e.args.quick_push (total_incr); + if (m_wrap) + { + rtx wrap = gen_reg_rtx (DImode); + emit_insn (gen_rtx_SET (gen_rtx_SUBREG (SImode, wrap, 4), + e.args[offset_arg_no + 2])); + emit_insn (gen_rtx_SET (gen_rtx_SUBREG (SImode, wrap, 0), + GEN_INT (0))); + e.args[offset_arg_no + 2] = wrap; + } + else + { + rtx incr = e.args[offset_arg_no + 2]; + rtx total_incr = gen_int_mode (INTVAL (incr) + * GET_MODE_NUNITS (e.vector_mode (0)), + SImode); + e.args.quick_push (total_incr); + } /* _wb mode uses the _n builtins and adds code to update the offset. */ @@ -428,18 +444,26 @@ public: { case PRED_none: /* No predicate. */ - code = m_inc_dec - ? code_for_mve_q_u_insn (VIDUPQ, mode) - : code_for_mve_q_u_insn (VDDUPQ, mode); + code = m_wrap + ? (m_inc_dec + ? code_for_mve_q_wb_u_insn (VIWDUPQ, mode) + : code_for_mve_q_wb_u_insn (VDWDUPQ, mode)) + : (m_inc_dec + ? code_for_mve_q_u_insn (VIDUPQ, mode) + : code_for_mve_q_u_insn (VDDUPQ, mode)); insns = e.use_exact_insn (code); break; case PRED_m: case PRED_x: /* "m" or "x" predicate. */ - code = m_inc_dec - ? code_for_mve_q_m_wb_u_insn (VIDUPQ_M, mode) - : code_for_mve_q_m_wb_u_insn (VDDUPQ_M, mode); + code = m_wrap + ? (m_inc_dec + ? code_for_mve_q_m_wb_u_insn (VIWDUPQ_M, mode) + : code_for_mve_q_m_wb_u_insn (VDWDUPQ_M, mode)) + : (m_inc_dec + ? code_for_mve_q_m_wb_u_insn (VIDUPQ_M, mode) + : code_for_mve_q_m_wb_u_insn (VDDUPQ_M, mode)); if (e.pred == PRED_m) insns = e.use_cond_insn (code, 0); @@ -671,9 +695,11 @@ FUNCTION_WITHOUT_N_NO_F (vcvtnq, VCVTNQ) FUNCTION_WITHOUT_N_NO_F (vcvtpq, VCVTPQ) FUNCTION (vcvtbq, vcvtxq_impl, (VCVTBQ_F16_F32, VCVTBQ_M_F16_F32, VCVTBQ_F32_F16, VCVTBQ_M_F32_F16)) FUNCTION (vcvttq, vcvtxq_impl, (VCVTTQ_F16_F32, VCVTTQ_M_F16_F32, VCVTTQ_F32_F16, VCVTTQ_M_F32_F16)) -FUNCTION (vddupq, viddup_impl, (false)) +FUNCTION (vddupq, viddup_impl, (false, false)) FUNCTION_ONLY_N (vdupq, VDUPQ) -FUNCTION (vidupq, viddup_impl, (true)) +FUNCTION (vdwdupq, viddup_impl, (false, true)) +FUNCTION (vidupq, viddup_impl, (true, false)) +FUNCTION (viwdupq, viddup_impl, (true, true)) FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ) FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1, VFMAQ_M_F, -1, -1, VFMAQ_M_N_F)) FUNCTION (vfmasq, unspec_mve_function_exact_insn, (-1, -1, -1, -1, -1, VFMASQ_N_F, -1, -1, -1, -1, -1, VFMASQ_M_N_F)) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index ed3048e219a..c5f1e8a197b 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -48,12 +48,14 @@ DEF_MVE_FUNCTION (vctp64q, vctp, none, m_or_none) DEF_MVE_FUNCTION (vctp8q, vctp, none, m_or_none) DEF_MVE_FUNCTION (vddupq, viddup, all_unsigned, mx_or_none) DEF_MVE_FUNCTION (vdupq, unary_n, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vdwdupq, vidwdup, all_unsigned, mx_or_none) DEF_MVE_FUNCTION (veorq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vhaddq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vhcaddq_rot270, binary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vhcaddq_rot90, binary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vhsubq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vidupq, viddup, all_unsigned, mx_or_none) +DEF_MVE_FUNCTION (viwdupq, vidwdup, all_unsigned, mx_or_none) DEF_MVE_FUNCTION (vld1q, load, all_integer, none) DEF_MVE_FUNCTION (vmaxaq, binary_maxamina, all_signed, m_or_none) DEF_MVE_FUNCTION (vmaxavq, binary_maxavminav, all_signed, p_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 526e0f8ee3a..ed8761318bb 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -68,6 +68,7 @@ extern const function_base *const vcvtq; extern const function_base *const vcvttq; extern const function_base *const vddupq; extern const function_base *const vdupq; +extern const function_base *const vdwdupq; extern const function_base *const veorq; extern const function_base *const vfmaq; extern const function_base *const vfmasq; @@ -77,6 +78,7 @@ extern const function_base *const vhcaddq_rot270; extern const function_base *const vhcaddq_rot90; extern const function_base *const vhsubq; extern const function_base *const vidupq; +extern const function_base *const viwdupq; extern const function_base *const vld1q; extern const function_base *const vmaxaq; extern const function_base *const vmaxavq; diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index c3da491b9d1..37b0fedc4ff 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -82,24 +82,10 @@ #define vstrwq_scatter_shifted_offset_p(__base, __offset, __value, __p) __arm_vstrwq_scatter_shifted_offset_p(__base, __offset, __value, __p) #define vstrwq_scatter_shifted_offset(__base, __offset, __value) __arm_vstrwq_scatter_shifted_offset(__base, __offset, __value) #define vuninitializedq(__v) __arm_vuninitializedq(__v) -#define vdwdupq_m(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m(__inactive, __a, __b, __imm, __p) -#define vdwdupq_u8(__a, __b, __imm) __arm_vdwdupq_u8(__a, __b, __imm) -#define vdwdupq_u32(__a, __b, __imm) __arm_vdwdupq_u32(__a, __b, __imm) -#define vdwdupq_u16(__a, __b, __imm) __arm_vdwdupq_u16(__a, __b, __imm) -#define viwdupq_m(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m(__inactive, __a, __b, __imm, __p) -#define viwdupq_u8(__a, __b, __imm) __arm_viwdupq_u8(__a, __b, __imm) -#define viwdupq_u32(__a, __b, __imm) __arm_viwdupq_u32(__a, __b, __imm) -#define viwdupq_u16(__a, __b, __imm) __arm_viwdupq_u16(__a, __b, __imm) #define vstrdq_scatter_base_wb(__addr, __offset, __value) __arm_vstrdq_scatter_base_wb(__addr, __offset, __value) #define vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb(__addr, __offset, __value) -#define vdwdupq_x_u8(__a, __b, __imm, __p) __arm_vdwdupq_x_u8(__a, __b, __imm, __p) -#define vdwdupq_x_u16(__a, __b, __imm, __p) __arm_vdwdupq_x_u16(__a, __b, __imm, __p) -#define vdwdupq_x_u32(__a, __b, __imm, __p) __arm_vdwdupq_x_u32(__a, __b, __imm, __p) -#define viwdupq_x_u8(__a, __b, __imm, __p) __arm_viwdupq_x_u8(__a, __b, __imm, __p) -#define viwdupq_x_u16(__a, __b, __imm, __p) __arm_viwdupq_x_u16(__a, __b, __imm, __p) -#define viwdupq_x_u32(__a, __b, __imm, __p) __arm_viwdupq_x_u32(__a, __b, __imm, __p) #define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out) #define vadciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m(__inactive, __a, __b, __carry_out, __p) #define vadcq(__a, __b, __carry) __arm_vadcq(__a, __b, __carry) @@ -323,30 +309,6 @@ #define vuninitializedq_s64(void) __arm_vuninitializedq_s64(void) #define vuninitializedq_f16(void) __arm_vuninitializedq_f16(void) #define vuninitializedq_f32(void) __arm_vuninitializedq_f32(void) -#define vdwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) -#define vdwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) -#define vdwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) -#define vdwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) -#define vdwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) -#define vdwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) __arm_vdwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) -#define vdwdupq_n_u8(__a, __b, __imm) __arm_vdwdupq_n_u8(__a, __b, __imm) -#define vdwdupq_n_u32(__a, __b, __imm) __arm_vdwdupq_n_u32(__a, __b, __imm) -#define vdwdupq_n_u16(__a, __b, __imm) __arm_vdwdupq_n_u16(__a, __b, __imm) -#define vdwdupq_wb_u8( __a, __b, __imm) __arm_vdwdupq_wb_u8( __a, __b, __imm) -#define vdwdupq_wb_u32( __a, __b, __imm) __arm_vdwdupq_wb_u32( __a, __b, __imm) -#define vdwdupq_wb_u16( __a, __b, __imm) __arm_vdwdupq_wb_u16( __a, __b, __imm) -#define viwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u8(__inactive, __a, __b, __imm, __p) -#define viwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u32(__inactive, __a, __b, __imm, __p) -#define viwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_n_u16(__inactive, __a, __b, __imm, __p) -#define viwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_wb_u8(__inactive, __a, __b, __imm, __p) -#define viwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_wb_u32(__inactive, __a, __b, __imm, __p) -#define viwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) __arm_viwdupq_m_wb_u16(__inactive, __a, __b, __imm, __p) -#define viwdupq_n_u8(__a, __b, __imm) __arm_viwdupq_n_u8(__a, __b, __imm) -#define viwdupq_n_u32(__a, __b, __imm) __arm_viwdupq_n_u32(__a, __b, __imm) -#define viwdupq_n_u16(__a, __b, __imm) __arm_viwdupq_n_u16(__a, __b, __imm) -#define viwdupq_wb_u8( __a, __b, __imm) __arm_viwdupq_wb_u8( __a, __b, __imm) -#define viwdupq_wb_u32( __a, __b, __imm) __arm_viwdupq_wb_u32( __a, __b, __imm) -#define viwdupq_wb_u16( __a, __b, __imm) __arm_viwdupq_wb_u16( __a, __b, __imm) #define vldrdq_gather_base_wb_s64(__addr, __offset) __arm_vldrdq_gather_base_wb_s64(__addr, __offset) #define vldrdq_gather_base_wb_u64(__addr, __offset) __arm_vldrdq_gather_base_wb_u64(__addr, __offset) #define vldrdq_gather_base_wb_z_s64(__addr, __offset, __p) __arm_vldrdq_gather_base_wb_z_s64(__addr, __offset, __p) @@ -367,18 +329,6 @@ #define vstrwq_scatter_base_wb_s32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_s32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_u32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_u32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_f32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_f32(__addr, __offset, __value) -#define vdwdupq_x_n_u8(__a, __b, __imm, __p) __arm_vdwdupq_x_n_u8(__a, __b, __imm, __p) -#define vdwdupq_x_n_u16(__a, __b, __imm, __p) __arm_vdwdupq_x_n_u16(__a, __b, __imm, __p) -#define vdwdupq_x_n_u32(__a, __b, __imm, __p) __arm_vdwdupq_x_n_u32(__a, __b, __imm, __p) -#define vdwdupq_x_wb_u8(__a, __b, __imm, __p) __arm_vdwdupq_x_wb_u8(__a, __b, __imm, __p) -#define vdwdupq_x_wb_u16(__a, __b, __imm, __p) __arm_vdwdupq_x_wb_u16(__a, __b, __imm, __p) -#define vdwdupq_x_wb_u32(__a, __b, __imm, __p) __arm_vdwdupq_x_wb_u32(__a, __b, __imm, __p) -#define viwdupq_x_n_u8(__a, __b, __imm, __p) __arm_viwdupq_x_n_u8(__a, __b, __imm, __p) -#define viwdupq_x_n_u16(__a, __b, __imm, __p) __arm_viwdupq_x_n_u16(__a, __b, __imm, __p) -#define viwdupq_x_n_u32(__a, __b, __imm, __p) __arm_viwdupq_x_n_u32(__a, __b, __imm, __p) -#define viwdupq_x_wb_u8(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u8(__a, __b, __imm, __p) -#define viwdupq_x_wb_u16(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u16(__a, __b, __imm, __p) -#define viwdupq_x_wb_u32(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u32(__a, __b, __imm, __p) #define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) #define vadciq_u32(__a, __b, __carry_out) __arm_vadciq_u32(__a, __b, __carry_out) #define vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) @@ -1672,223 +1622,6 @@ __arm_vstrwq_scatter_shifted_offset_u32 (uint32_t * __base, uint32x4_t __offset, __builtin_mve_vstrwq_scatter_shifted_offset_uv4si ((__builtin_neon_si *) __base, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, __a, __c, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_m_n_uv4si (__inactive, __a, __c, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, __a, __c, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint8x16_t __res = __builtin_mve_vdwdupq_m_n_uv16qi (__inactive, *__a, __c, __imm, __p); - *__a = __builtin_mve_vdwdupq_m_wb_uv16qi (__inactive, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint32x4_t __res = __builtin_mve_vdwdupq_m_n_uv4si (__inactive, *__a, __c, __imm, __p); - *__a = __builtin_mve_vdwdupq_m_wb_uv4si (__inactive, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint16x8_t __res = __builtin_mve_vdwdupq_m_n_uv8hi (__inactive, *__a, __c, __imm, __p); - *__a = __builtin_mve_vdwdupq_m_wb_uv8hi (__inactive, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_n_u8 (uint32_t __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_n_uv16qi (__a, __c, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_n_u32 (uint32_t __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_n_uv4si (__a, __c, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_n_u16 (uint32_t __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_n_uv8hi (__a, __c, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_wb_u8 (uint32_t * __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint8x16_t __res = __builtin_mve_vdwdupq_n_uv16qi (*__a, __c, __imm); - *__a = __builtin_mve_vdwdupq_wb_uv16qi (*__a, __c, __imm); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_wb_u32 (uint32_t * __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint32x4_t __res = __builtin_mve_vdwdupq_n_uv4si (*__a, __c, __imm); - *__a = __builtin_mve_vdwdupq_wb_uv4si (*__a, __c, __imm); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_wb_u16 (uint32_t * __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint16x8_t __res = __builtin_mve_vdwdupq_n_uv8hi (*__a, __c, __imm); - *__a = __builtin_mve_vdwdupq_wb_uv8hi (*__a, __c, __imm); - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m_n_u8 (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_m_n_uv16qi (__inactive, __a, __c, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m_n_u32 (uint32x4_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_m_n_uv4si (__inactive, __a, __c, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m_n_u16 (uint16x8_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_m_n_uv8hi (__inactive, __a, __c, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m_wb_u8 (uint8x16_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint8x16_t __res = __builtin_mve_viwdupq_m_n_uv16qi (__inactive, *__a, __c, __imm, __p); - *__a = __builtin_mve_viwdupq_m_wb_uv16qi (__inactive, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m_wb_u32 (uint32x4_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint32x4_t __res = __builtin_mve_viwdupq_m_n_uv4si (__inactive, *__a, __c, __imm, __p); - *__a = __builtin_mve_viwdupq_m_wb_uv4si (__inactive, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m_wb_u16 (uint16x8_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint16x8_t __res = __builtin_mve_viwdupq_m_n_uv8hi (__inactive, *__a, __c, __imm, __p); - *__a = __builtin_mve_viwdupq_m_wb_uv8hi (__inactive, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_n_u8 (uint32_t __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_n_uv16qi (__a, __c, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_n_u32 (uint32_t __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_n_uv4si (__a, __c, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_n_u16 (uint32_t __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_n_uv8hi (__a, __c, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_wb_u8 (uint32_t * __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint8x16_t __res = __builtin_mve_viwdupq_n_uv16qi (*__a, __c, __imm); - *__a = __builtin_mve_viwdupq_wb_uv16qi (*__a, __c, __imm); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_wb_u32 (uint32_t * __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint32x4_t __res = __builtin_mve_viwdupq_n_uv4si (*__a, __c, __imm); - *__a = __builtin_mve_viwdupq_wb_uv4si (*__a, __c, __imm); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_wb_u16 (uint32_t * __a, uint32_t __b, const int __imm) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint16x8_t __res = __builtin_mve_viwdupq_n_uv8hi (*__a, __c, __imm); - *__a = __builtin_mve_viwdupq_wb_uv8hi (*__a, __c, __imm); - return __res; -} - - __extension__ extern __inline int64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrdq_gather_base_wb_s64 (uint64x2_t * __addr, const int __offset) @@ -2025,120 +1758,6 @@ __arm_vstrwq_scatter_base_wb_u32 (uint32x4_t * __addr, const int __offset, uint3 *__addr = __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_n_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_m_n_uv16qi (__arm_vuninitializedq_u8 (), __a, __c, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_n_u16 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_m_n_uv8hi (__arm_vuninitializedq_u16 (), __a, __c, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_n_u32 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_vdwdupq_m_n_uv4si (__arm_vuninitializedq_u32 (), __a, __c, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_wb_u8 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint8x16_t __arg1 = __arm_vuninitializedq_u8 (); - uint8x16_t __res = __builtin_mve_vdwdupq_m_n_uv16qi (__arg1, *__a, __c, __imm, __p); - *__a = __builtin_mve_vdwdupq_m_wb_uv16qi (__arg1, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_wb_u16 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint16x8_t __arg1 = __arm_vuninitializedq_u16 (); - uint16x8_t __res = __builtin_mve_vdwdupq_m_n_uv8hi (__arg1, *__a, __c, __imm, __p); - *__a = __builtin_mve_vdwdupq_m_wb_uv8hi (__arg1, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_wb_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint32x4_t __arg1 = __arm_vuninitializedq_u32 (); - uint32x4_t __res = __builtin_mve_vdwdupq_m_n_uv4si (__arg1, *__a, __c, __imm, __p); - *__a = __builtin_mve_vdwdupq_m_wb_uv4si (__arg1, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_n_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_m_n_uv16qi (__arm_vuninitializedq_u8 (), __a, __c, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_n_u16 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_m_n_uv8hi (__arm_vuninitializedq_u16 (), __a, __c, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_n_u32 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - return __builtin_mve_viwdupq_m_n_uv4si (__arm_vuninitializedq_u32 (), __a, __c, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_wb_u8 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint8x16_t __arg1 = __arm_vuninitializedq_u8 (); - uint8x16_t __res = __builtin_mve_viwdupq_m_n_uv16qi (__arg1, *__a, __c, __imm, __p); - *__a = __builtin_mve_viwdupq_m_wb_uv16qi (__arg1, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_wb_u16 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint16x8_t __arg1 = __arm_vuninitializedq_u16 (); - uint16x8_t __res = __builtin_mve_viwdupq_m_n_uv8hi (__arg1, *__a, __c, __imm, __p); - *__a = __builtin_mve_viwdupq_m_wb_uv8hi (__arg1, *__a, __c, __imm, __p); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_wb_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - uint64_t __c = ((uint64_t) __b) << 32; - uint32x4_t __arg1 = __arm_vuninitializedq_u32 (); - uint32x4_t __res = __builtin_mve_viwdupq_m_n_uv4si (__arg1, *__a, __c, __imm, __p); - *__a = __builtin_mve_viwdupq_m_wb_uv4si (__arg1, *__a, __c, __imm, __p); - return __res; -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) @@ -4131,174 +3750,6 @@ __arm_vstrwq_scatter_shifted_offset (uint32_t * __base, uint32x4_t __offset, uin __arm_vstrwq_scatter_shifted_offset_u32 (__base, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_m_n_u8 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m (uint32x4_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_m_n_u32 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m (uint16x8_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_m_n_u16 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m (uint8x16_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_m_wb_u8 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m (uint32x4_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_m_wb_u32 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_m (uint16x8_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_m_wb_u16 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_u8 (uint32_t __a, uint32_t __b, const int __imm) -{ - return __arm_vdwdupq_n_u8 (__a, __b, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_u32 (uint32_t __a, uint32_t __b, const int __imm) -{ - return __arm_vdwdupq_n_u32 (__a, __b, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_u16 (uint32_t __a, uint32_t __b, const int __imm) -{ - return __arm_vdwdupq_n_u16 (__a, __b, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_u8 (uint32_t * __a, uint32_t __b, const int __imm) -{ - return __arm_vdwdupq_wb_u8 (__a, __b, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_u32 (uint32_t * __a, uint32_t __b, const int __imm) -{ - return __arm_vdwdupq_wb_u32 (__a, __b, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_u16 (uint32_t * __a, uint32_t __b, const int __imm) -{ - return __arm_vdwdupq_wb_u16 (__a, __b, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m (uint8x16_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_m_n_u8 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m (uint32x4_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_m_n_u32 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m (uint16x8_t __inactive, uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_m_n_u16 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m (uint8x16_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_m_wb_u8 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m (uint32x4_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_m_wb_u32 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_m (uint16x8_t __inactive, uint32_t * __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_m_wb_u16 (__inactive, __a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_u8 (uint32_t __a, uint32_t __b, const int __imm) -{ - return __arm_viwdupq_n_u8 (__a, __b, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_u32 (uint32_t __a, uint32_t __b, const int __imm) -{ - return __arm_viwdupq_n_u32 (__a, __b, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_u16 (uint32_t __a, uint32_t __b, const int __imm) -{ - return __arm_viwdupq_n_u16 (__a, __b, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_u8 (uint32_t * __a, uint32_t __b, const int __imm) -{ - return __arm_viwdupq_wb_u8 (__a, __b, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_u32 (uint32_t * __a, uint32_t __b, const int __imm) -{ - return __arm_viwdupq_wb_u32 (__a, __b, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_u16 (uint32_t * __a, uint32_t __b, const int __imm) -{ - return __arm_viwdupq_wb_u16 (__a, __b, __imm); -} - __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vstrdq_scatter_base_wb (uint64x2_t * __addr, const int __offset, int64x2_t __value) @@ -4355,90 +3806,6 @@ __arm_vstrwq_scatter_base_wb (uint32x4_t * __addr, const int __offset, uint32x4_ __arm_vstrwq_scatter_base_wb_u32 (__addr, __offset, __value); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_x_n_u8 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_u16 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_x_n_u16 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_u32 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_x_n_u32 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_u8 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_x_wb_u8 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_u16 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_x_wb_u16 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vdwdupq_x_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vdwdupq_x_wb_u32 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_u8 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_x_n_u8 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_u16 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_x_n_u16 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_u32 (uint32_t __a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_x_n_u32 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_u8 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_x_wb_u8 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_u16 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_x_wb_u16 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_viwdupq_x_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t __p) -{ - return __arm_viwdupq_x_wb_u32 (__a, __b, __imm, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vadciq (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) @@ -6146,37 +5513,6 @@ extern void *__ARM_undef; #endif /* MVE Integer. */ - -#define __arm_vdwdupq_x_u8(p1,p2,p3,p4) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vdwdupq_x_n_u8 ((uint32_t) __p1, p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_x_wb_u8 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - -#define __arm_vdwdupq_x_u16(p1,p2,p3,p4) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vdwdupq_x_n_u16 ((uint32_t) __p1, p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_x_wb_u16 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - -#define __arm_vdwdupq_x_u32(p1,p2,p3,p4) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vdwdupq_x_n_u32 ((uint32_t) __p1, p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_x_wb_u32 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - -#define __arm_viwdupq_x_u8(p1,p2,p3,p4) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_viwdupq_x_n_u8 ((uint32_t) __p1, p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_x_wb_u8 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - -#define __arm_viwdupq_x_u16(p1,p2,p3,p4) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_viwdupq_x_n_u16 ((uint32_t) __p1, p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_x_wb_u16 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - -#define __arm_viwdupq_x_u32(p1,p2,p3,p4) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_viwdupq_x_n_u32 ((uint32_t) __p1, p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_x_wb_u32 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - #define __arm_vadciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -6279,56 +5615,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint16x8_t]: __arm_vldrbq_gather_offset_u16(__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint32x4_t]: __arm_vldrbq_gather_offset_u32(__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint32x4_t)));}) -#define __arm_viwdupq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_viwdupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce_i_scalar(__p1, int), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_viwdupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_i_scalar(__p1, int), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_viwdupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_i_scalar(__p1, int), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - -#define __arm_viwdupq_u16(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_viwdupq_n_u16 (__ARM_mve_coerce_i_scalar(__p0, int), p1, (const int) p2), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_wb_u16 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1, (const int) p2));}) - -#define __arm_viwdupq_u32(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_viwdupq_n_u32 (__ARM_mve_coerce_i_scalar(__p0, int), p1, p2), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_wb_u32 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1, p2));}) - -#define __arm_viwdupq_u8(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_viwdupq_n_u8 (__ARM_mve_coerce_i_scalar(__p0, int), p1, p2), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_viwdupq_wb_u8 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1, p2));}) - -#define __arm_vdwdupq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vdwdupq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce_i_scalar(__p1, int), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vdwdupq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_i_scalar(__p1, int), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vdwdupq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_i_scalar(__p1, int), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_m_wb_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_m_wb_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_m_wb_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3, p4));}) - -#define __arm_vdwdupq_u16(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vdwdupq_n_u16 (__ARM_mve_coerce_i_scalar(__p0, int), p1, p2), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_wb_u16 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1, p2));}) - -#define __arm_vdwdupq_u32(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vdwdupq_n_u32 (__ARM_mve_coerce_i_scalar(__p0, int), p1, p2), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_wb_u32 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1, p2));}) - -#define __arm_vdwdupq_u8(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int_n]: __arm_vdwdupq_n_u8 (__ARM_mve_coerce_i_scalar(__p0, int), p1, p2), \ - int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vdwdupq_wb_u8 (__ARM_mve_coerce_u32_ptr(__p0, uint32_t *), p1, p2));}) - #define __arm_vshlcq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2, p3), \ diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index be3be67a144..72a7e4dc868 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -5219,14 +5219,14 @@ (define_expand "mve_vdwdupq_m_wb_u" ;; (define_insn "@mve_q_m_wb_u_insn" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "0") - (match_operand:SI 3 "s_register_operand" "1") + (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0") + (match_operand:SI 3 "s_register_operand" "2") (subreg:SI (match_operand:DI 4 "s_register_operand" "r") 4) (match_operand:SI 5 "mve_imm_selective_upto_8" "Rg") (match_operand: 6 "vpr_register_operand" "Up")] VIDWDUPQ_M)) - (set (match_operand:SI 1 "s_register_operand" "=Te") - (unspec:SI [(match_dup 2) + (set (match_operand:SI 2 "s_register_operand" "=Te") + (unspec:SI [(match_dup 1) (match_dup 3) (subreg:SI (match_dup 4) 4) (match_dup 5) @@ -5234,7 +5234,7 @@ (define_insn "@mve_q_m_wb_u_insn" VIDWDUPQ_M)) ] "TARGET_HAVE_MVE" - "vpst\;t.u%#\t%q2, %3, %R4, %5" + "vpst\;t.u%#\t%q1, %3, %R4, %5" [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_wb_u_insn")) (set_attr "type" "mve_move") (set_attr "length""8")]) From patchwork Wed Sep 4 13:26:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980821 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=M9PolALi; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNn837shz1yXY for ; Wed, 4 Sep 2024 23:36:28 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 42920382EF3D for ; Wed, 4 Sep 2024 13:36:26 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2a.google.com (mail-oo1-xc2a.google.com [IPv6:2607:f8b0:4864:20::c2a]) by sourceware.org (Postfix) with ESMTPS id 7C3ED3860C34 for ; Wed, 4 Sep 2024 13:27:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7C3ED3860C34 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7C3ED3860C34 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456469; cv=none; b=JgoVHLh/UBlF7RZugqxfJZD4k/+gPqTn1YQj6V72Cq/CAidoIBfSaNCU+rdFXMton8LXZChzG4eZw7dwEJpbCypWKDKlXhYE1I0mSAC+hbHBDOuy8k2CikEi1XlMbNU0APm0xbGLujY5QsfBvUc2WXmbHd554ghkTrZNFehoR/s= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456469; c=relaxed/simple; bh=Um6WS6+jQV76gsY1Uk1juukOATZjEOU1rwcCQi8ZP84=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=EAD7KnK4mPmY6P02VgHelRkpPYiAxS0RIYkYCYRE3efDtEkfRqpNe8WkwaVOeX0PrUMdoYlSABdPSd3bBktAic9fE00oKtlcUJCPqrxZNWR2e3DcUP72GTXRLLLA4+lmxl3KGP1V9TWBmZyacqPl5X5Zv1ESxy3tDFk2ms/hCMo= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2a.google.com with SMTP id 006d021491bc7-5df9542f3d8so5537115eaf.0 for ; Wed, 04 Sep 2024 06:27:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456464; x=1726061264; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YCZcZg++m0i54PjvPGMAHUahV0OqlHvkYKgOKkmgm+s=; b=M9PolALivTdzrvcPAOjKmTz5UhALh9QRZs98ksUcopf4NMrZe0uVXwTwRsNg2HKhVE eSC9hZpUBsTGr62fXmMHkX97sTz1G3J9st2bWN6MwGvN6BqOb0QD3B/6RiTkJQsnWT5X l4odnOlu3kXCerWzYRNhG3qvjf3sO1FLrOSNrgLJQK9qIFC1D9deSytets2rYBkPEZpd ftAIKq+hggeG45H0bAG6ZRl+JJ2RTp5wUZpuDXvEA53lFc+u/CDsq1CEu4jQ+VbtSAvl XTYGMT21UO6xy8LRR7JWFJU+YtVBN2yYRRqo3sAP3qs+YSWjqIwT4FVOSzhsq8Azq6Gm erMw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456464; x=1726061264; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YCZcZg++m0i54PjvPGMAHUahV0OqlHvkYKgOKkmgm+s=; b=xRgrJT4RmVRYN027aOrORvaqmjO1aDmgSez4iRxa+O+U5d3x61+1w6ooNwOMGOlycd gJIzAgYyQTj2WzwpJYrNRiU7+8u9FU0WWjTFrMUzDWDN916+9F1XRjWqvaiUf40JuLRd CT/nUK9Bt6+AmxZ6Vu1DgDaxIPB31UI7sACF9SrwAnYGPR7H58WAPDFCq9UrhjBtHqTQ dOSSA/bzebjvAGD41Pxcc9YJW6Tz6fv7G1WMDQRG/N/WUgGx/HqbcpjDzXa+DmllAfvf 55Inq87zYx9cRvRLiG9Ak8xz3tb6dU2EZmSMYLsw9xOJk/89gZh9TtzMSW9lcX+izB+D 4UOg== X-Gm-Message-State: AOJu0Yyh4cvZDaZm7BqKo/3+z4un+Zd2FipjxZIY2wOxr5x3P/ef8KsL bCRqb7s4BQRlLoT0z9FwatLEyg+cxgm2WWMmN7xdjh4mp9XoGf6d3K6O4vsjI4b/CtO+AlY48Gf xgyTt4Q== X-Google-Smtp-Source: AGHT+IHlcNeJDiNIxyFqJ5Zt2dH01s56wBoCAhTi+/FFAjt2mOuS9hchikBTztrdEd4fDkbwM0k1HA== X-Received: by 2002:a05:6820:2202:b0:5d8:6769:9d85 with SMTP id 006d021491bc7-5dfad01f5bamr18284428eaf.6.1725456464214; Wed, 04 Sep 2024 06:27:44 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:43 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 26/36] arm: [MVE intrinsics] update v[id]wdup tests Date: Wed, 4 Sep 2024 13:26:40 +0000 Message-Id: <20240904132650.2720446-27-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Testing v[id]wdup overloads with '1' as argument for uint32_t* does not make sense: this patch adds a new 'unit32_t *a' parameter to foo2 in such tests. The difference with v[id]dup tests (where we removed 'foo2') is that in 'foo1' we test the overload with a variable 'wrap' parameter (b) and we need foo2 to test the overload with an immediate (1). 2024-08-28 Christophe Lyon gcc/testsuite/ * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c: Use pointer parameter in foo2. * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c: Likewise. --- .../gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c | 6 +++--- gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c | 6 +++--- gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c | 6 +++--- .../gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c | 6 +++--- 18 files changed, 54 insertions(+), 54 deletions(-) diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c index b24e7a2f5af..e6004056c2c 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c @@ -53,13 +53,13 @@ foo1 (uint16x8_t inactive, uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint16x8_t -foo2 (uint16x8_t inactive, mve_pred16_t p) +foo2 (uint16x8_t inactive, uint32_t *a, mve_pred16_t p) { - return vdwdupq_m (inactive, 1, 1, 1, p); + return vdwdupq_m (inactive, a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c index 75c41450a38..b36dbcd8585 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c @@ -53,13 +53,13 @@ foo1 (uint32x4_t inactive, uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint32x4_t -foo2 (uint32x4_t inactive, mve_pred16_t p) +foo2 (uint32x4_t inactive, uint32_t *a, mve_pred16_t p) { - return vdwdupq_m (inactive, 1, 1, 1, p); + return vdwdupq_m (inactive, a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c index 90d64671dcf..b1577065a48 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c @@ -53,13 +53,13 @@ foo1 (uint8x16_t inactive, uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint8x16_t -foo2 (uint8x16_t inactive, mve_pred16_t p) +foo2 (uint8x16_t inactive, uint32_t *a, mve_pred16_t p) { - return vdwdupq_m (inactive, 1, 1, 1, p); + return vdwdupq_m (inactive, a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c index 87af2b6817a..f1fae4b47e7 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c @@ -41,13 +41,13 @@ foo1 (uint32_t *a, uint32_t b) ** ... */ uint16x8_t -foo2 () +foo2 (uint32_t *a) { - return vdwdupq_u16 (1, 1, 1); + return vdwdupq_u16 (a, 1, 1); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c index ec136dc3222..4282826a4f4 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c @@ -41,13 +41,13 @@ foo1 (uint32_t *a, uint32_t b) ** ... */ uint32x4_t -foo2 () +foo2 (uint32_t *a) { - return vdwdupq_u32 (1, 1, 1); + return vdwdupq_u32 (a, 1, 1); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c index 3653d00bc5d..afc8eb281ae 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c @@ -41,13 +41,13 @@ foo1 (uint32_t *a, uint32_t b) ** ... */ uint8x16_t -foo2 () +foo2 (uint32_t *a) { - return vdwdupq_u8 (1, 1, 1); + return vdwdupq_u8 (a, 1, 1); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c index e9d994ccfc5..fd250c2652c 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c @@ -53,13 +53,13 @@ foo1 (uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint16x8_t -foo2 (mve_pred16_t p) +foo2 (uint32_t *a, mve_pred16_t p) { - return vdwdupq_x_u16 (1, 1, 1, p); + return vdwdupq_x_u16 (a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c index 07438b02351..dbb1961dea5 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c @@ -53,13 +53,13 @@ foo1 (uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint32x4_t -foo2 (mve_pred16_t p) +foo2 (uint32_t *a, mve_pred16_t p) { - return vdwdupq_x_u32 (1, 1, 1, p); + return vdwdupq_x_u32 (a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c index 96280225351..5f004717487 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c @@ -53,13 +53,13 @@ foo1 (uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint8x16_t -foo2 (mve_pred16_t p) +foo2 (uint32_t *a, mve_pred16_t p) { - return vdwdupq_x_u8 (1, 1, 1, p); + return vdwdupq_x_u8 (a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c index 84733f94e7c..4d7b0a194ac 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c @@ -53,13 +53,13 @@ foo1 (uint16x8_t inactive, uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint16x8_t -foo2 (uint16x8_t inactive, mve_pred16_t p) +foo2 (uint16x8_t inactive, uint32_t *a, mve_pred16_t p) { - return viwdupq_m (inactive, 1, 1, 1, p); + return viwdupq_m (inactive, a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c index a175744b654..e78f818ffb8 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c @@ -53,13 +53,13 @@ foo1 (uint32x4_t inactive, uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint32x4_t -foo2 (uint32x4_t inactive, mve_pred16_t p) +foo2 (uint32x4_t inactive, uint32_t *a, mve_pred16_t p) { - return viwdupq_m (inactive, 1, 1, 1, p); + return viwdupq_m (inactive, a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c index 7240b6e72bc..2c2f44c87ea 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c @@ -53,13 +53,13 @@ foo1 (uint8x16_t inactive, uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint8x16_t -foo2 (uint8x16_t inactive, mve_pred16_t p) +foo2 (uint8x16_t inactive, uint32_t *a, mve_pred16_t p) { - return viwdupq_m (inactive, 1, 1, 1, p); + return viwdupq_m (inactive, a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c index eaa496bb2da..ccdc3d4ad33 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c @@ -41,13 +41,13 @@ foo1 (uint32_t *a, uint32_t b) ** ... */ uint16x8_t -foo2 () +foo2 (uint32_t *a) { - return viwdupq_u16 (1, 1, 1); + return viwdupq_u16 (a, 1, 1); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c index c1912d77486..1faffff4d21 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c @@ -41,13 +41,13 @@ foo1 (uint32_t *a, uint32_t b) ** ... */ uint32x4_t -foo2 () +foo2 (uint32_t *a) { - return viwdupq_u32 (1, 1, 1); + return viwdupq_u32 (a, 1, 1); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c index f0d66a9ba29..91b0ef41bdc 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c @@ -41,13 +41,13 @@ foo1 (uint32_t *a, uint32_t b) ** ... */ uint8x16_t -foo2 () +foo2 (uint32_t *a) { - return viwdupq_u8 (1, 1, 1); + return viwdupq_u8 (a, 1, 1); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c index 265aef42c92..8b474e192c3 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c @@ -53,13 +53,13 @@ foo1 (uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint16x8_t -foo2 (mve_pred16_t p) +foo2 (uint32_t *a, mve_pred16_t p) { - return viwdupq_x_u16 (1, 1, 1, p); + return viwdupq_x_u16 (a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c index 585e41075db..30bf37c7d4f 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c @@ -53,13 +53,13 @@ foo1 (uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint32x4_t -foo2 (mve_pred16_t p) +foo2 (uint32_t *a, mve_pred16_t p) { - return viwdupq_x_u32 (1, 1, 1, p); + return viwdupq_x_u32 (a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c index ca39081dfc5..ae9dd2baa41 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c @@ -53,13 +53,13 @@ foo1 (uint32_t *a, uint32_t b, mve_pred16_t p) ** ... */ uint8x16_t -foo2 (mve_pred16_t p) +foo2 (uint32_t *a, mve_pred16_t p) { - return viwdupq_x_u8 (1, 1, 1, p); + return viwdupq_x_u8 (a, 1, 1, p); } #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ From patchwork Wed Sep 4 13:26:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980814 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=lR4kaAMV; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNk95qDXz1yg3 for ; Wed, 4 Sep 2024 23:33:53 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 85350386183B for ; Wed, 4 Sep 2024 13:33:51 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc30.google.com (mail-oo1-xc30.google.com [IPv6:2607:f8b0:4864:20::c30]) by sourceware.org (Postfix) with ESMTPS id 2D27B386182F for ; Wed, 4 Sep 2024 13:27:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2D27B386182F Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2D27B386182F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c30 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456470; cv=none; b=vgtcaGgHPT4YQu932IiLXUyVVUk/iHfyzcJUA8JBFngFJZ906c9VTJbPzErGOsuMEr8H6yRFk5bGt7Q7ixJMnm50zqOKNeYoYBbCvWm4Iv91yMiy/lPJco0fhzGJL3TsZ0ye2UpSbYpRithvDTAKVHhzScDGnejhaCsWwKykAqs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456470; c=relaxed/simple; bh=zXqVxy9OEixofpoVOjtVd8MLTBoU0LptTPvwm+NTz/w=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=VRaHjdRL4L711PMbYIQ7WaqSYO0YxdNNSJG6h7zZLKVSkkt6J1tqGPJAmE3DzsDe4wLzubQNi2QKWRt7MYNLW2WKvvltvFsJ282ehzSixehtHaWrhCzFPfDgfz1WCsRaom7sWfco1rf4RbOs3mcE6kQ5HHbZxGerJ1T+zrleeZo= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc30.google.com with SMTP id 006d021491bc7-5dfa315ccf1so530873eaf.3 for ; Wed, 04 Sep 2024 06:27:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456467; x=1726061267; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=NAAL9QVyuQQgiXiVWYnkflGw2+UlZWUBESgIi1EXSLw=; b=lR4kaAMVJd6k4WcPkvwKTpzJqQlL0SuK7ogxUYfRByy1OARZnrYOmcPS3kcPoYxH5b ++D+n8CPvbQFvnwNuqCQkj6Y4bF3tmKShuxlhkRTHdo8xvd0/ki6TJ7HMAa/ayzTxmsc hfCjK55qFs+4Hc02lUSxxvx6eB+qkgEKfnzr4jKsJXcsWuoKvJ38fIQIhKiaGtgw/C21 zviySKPHfdLNACuY/4H0TZmIMmoRKMBmGCdhgcmzwHRVACOSyei4SME6xL5JeYMXumrh 0ZcCXjBtr2DMRDPf2mRw9lpvI7GPR5zroA/TedjA7FPUglbVsBOXRGxO/BQYRjeGSnmG X4jg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456467; x=1726061267; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NAAL9QVyuQQgiXiVWYnkflGw2+UlZWUBESgIi1EXSLw=; b=sKnHD5CzdrkF8h+8nOBcczV2D5qJYDfk7/dKoLvHFgy9f/aJh4KsLtTm6YXUqb2dr8 a8Y8JG0xW+p2zSU+24Gk3NyRS1D4MK2x2r4HiifftOQ2k+1Uz58l/jDF/SDwA0eDaZYf s8EG//U7h4+bYpUSDdzMY3D4mvZhDpHcR2lph493qscJeO6U4KCoAvUvcT0En9HprcFZ P7pnvkiDrzIg4Crnm/cJJiYCAGMqrCISJ1+yy6XOdVJh/1npFZCFHVv5Pcmd7vrEycQR cDQEsjDTY0Ky0jsg0wrj+29xDjIXSU/0V/YRNr/ev9tKIsnSVBjCqBG7+tJOjmij2X0T ipvw== X-Gm-Message-State: AOJu0YwB97YATsoYzre+aFL0CB4aRrDF/fgMUSkGdO9GDugQQmlZ6afj lYfO0OkfJK4eChwG9AMp73uXtxRKDcS6XW62Q3BO+EWpLLNwzMMbsJHGuOC3WVsEePrqsIIaar9 FddoBDA== X-Google-Smtp-Source: AGHT+IGNaBNgP+UCs6PE1+v0Nx1n7JvH0NcEGYfLYq8lKff8sM8lnw7xkwhjX2A+sZiS9ZjcR/nPVw== X-Received: by 2002:a05:6820:22a9:b0:5d5:b49c:b6f7 with SMTP id 006d021491bc7-5e18c217529mr3249398eaf.7.1725456465660; Wed, 04 Sep 2024 06:27:45 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:44 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 27/36] arm: [MVE intrinsics] remove useless v[id]wdup expanders Date: Wed, 4 Sep 2024 13:26:41 +0000 Message-Id: <20240904132650.2720446-28-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Like with vddup/vidup, we use code_for_mve_q_wb_u_insn, so we can drop the expanders and their declarations as builtins, now useless. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-builtins.cc (arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers): Delete. * config/arm/arm_mve_builtins.def (viwdupq_wb_u, vdwdupq_wb_u) (viwdupq_m_wb_u, vdwdupq_m_wb_u, viwdupq_m_n_u, vdwdupq_m_n_u) (vdwdupq_n_u, viwdupq_n_u): Delete. * config/arm/mve.md (mve_vdwdupq_n_u): Delete. (mve_vdwdupq_wb_u): Delete. (mve_vdwdupq_m_n_u): Delete. (mve_vdwdupq_m_wb_u): Delete. --- gcc/config/arm/arm-builtins.cc | 7 --- gcc/config/arm/arm_mve_builtins.def | 8 --- gcc/config/arm/mve.md | 75 ----------------------------- 3 files changed, 90 deletions(-) diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc index c9d50bf8fbb..697b91911dd 100644 --- a/gcc/config/arm/arm-builtins.cc +++ b/gcc/config/arm/arm-builtins.cc @@ -755,13 +755,6 @@ arm_ldru_z_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_unsigned, qualifier_pointer, qualifier_predicate}; #define LDRU_Z_QUALIFIERS (arm_ldru_z_qualifiers) -static enum arm_type_qualifiers -arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers[SIMD_MAX_BUILTIN_ARGS] - = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned, - qualifier_unsigned, qualifier_immediate, qualifier_predicate }; -#define QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED_QUALIFIERS \ - (arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers) - static enum arm_type_qualifiers arm_ldrgbwbxu_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate}; diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index 7e88db4e4c3..f6962cd8cf5 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -799,14 +799,6 @@ VAR1 (STRSU_P, vstrdq_scatter_offset_p_u, v2di) VAR1 (STRSU_P, vstrdq_scatter_shifted_offset_p_u, v2di) VAR1 (STRSU_P, vstrwq_scatter_offset_p_u, v4si) VAR1 (STRSU_P, vstrwq_scatter_shifted_offset_p_u, v4si) -VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, viwdupq_wb_u, v16qi, v4si, v8hi) -VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vdwdupq_wb_u, v16qi, v4si, v8hi) -VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, viwdupq_m_wb_u, v16qi, v8hi, v4si) -VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, vdwdupq_m_wb_u, v16qi, v8hi, v4si) -VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, viwdupq_m_n_u, v16qi, v8hi, v4si) -VAR3 (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED, vdwdupq_m_n_u, v16qi, v8hi, v4si) -VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vdwdupq_n_u, v16qi, v4si, v8hi) -VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, viwdupq_n_u, v16qi, v4si, v8hi) VAR1 (STRSBWBU, vstrwq_scatter_base_wb_u, v4si) VAR1 (STRSBWBU, vstrdq_scatter_base_wb_u, v2di) VAR1 (STRSBWBU_P, vstrwq_scatter_base_wb_p_u, v4si) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 72a7e4dc868..0507e117f51 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -5120,41 +5120,6 @@ (define_insn "@mve_q_m_wb_u_insn" [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_u_insn")) (set_attr "length""8")]) -;; -;; [vdwdupq_n_u]) -;; -(define_expand "mve_vdwdupq_n_u" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:SI 1 "s_register_operand") - (match_operand:DI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_selective_upto_8")] - "TARGET_HAVE_MVE" -{ - rtx ignore_wb = gen_reg_rtx (SImode); - emit_insn (gen_mve_vdwdupq_wb_u_insn (operands[0], ignore_wb, - operands[1], operands[2], - operands[3])); - DONE; -}) - -;; -;; [vdwdupq_wb_u]) -;; -(define_expand "mve_vdwdupq_wb_u" - [(match_operand:SI 0 "s_register_operand") - (match_operand:SI 1 "s_register_operand") - (match_operand:DI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_selective_upto_8") - (unspec:MVE_2 [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] - "TARGET_HAVE_MVE" -{ - rtx ignore_vec = gen_reg_rtx (mode); - emit_insn (gen_mve_vdwdupq_wb_u_insn (ignore_vec, operands[0], - operands[1], operands[2], - operands[3])); - DONE; -}) - ;; ;; [vdwdupq_wb_u_insn, viwdupq_wb_u_insn] ;; @@ -5174,46 +5139,6 @@ (define_insn "@mve_q_wb_u_insn" [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_wb_u_insn")) (set_attr "type" "mve_move")]) -;; -;; [vdwdupq_m_n_u]) -;; -(define_expand "mve_vdwdupq_m_n_u" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:DI 3 "s_register_operand") - (match_operand:SI 4 "mve_imm_selective_upto_8") - (match_operand: 5 "vpr_register_operand")] - "TARGET_HAVE_MVE" -{ - rtx ignore_wb = gen_reg_rtx (SImode); - emit_insn (gen_mve_vdwdupq_m_wb_u_insn (operands[0], ignore_wb, - operands[1], operands[2], - operands[3], operands[4], - operands[5])); - DONE; -}) - -;; -;; [vdwdupq_m_wb_u]) -;; -(define_expand "mve_vdwdupq_m_wb_u" - [(match_operand:SI 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:DI 3 "s_register_operand") - (match_operand:SI 4 "mve_imm_selective_upto_8") - (match_operand: 5 "vpr_register_operand")] - "TARGET_HAVE_MVE" -{ - rtx ignore_vec = gen_reg_rtx (mode); - emit_insn (gen_mve_vdwdupq_m_wb_u_insn (ignore_vec, operands[0], - operands[1], operands[2], - operands[3], operands[4], - operands[5])); - DONE; -}) - ;; ;; [vdwdupq_m_wb_u_insn, viwdupq_m_wb_u_insn] ;; From patchwork Wed Sep 4 13:26:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980822 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=SJf9wH7d; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNnZ1jZVz1yXY for ; Wed, 4 Sep 2024 23:36:50 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D6ED83885C06 for ; Wed, 4 Sep 2024 13:36:47 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ot1-x32e.google.com (mail-ot1-x32e.google.com [IPv6:2607:f8b0:4864:20::32e]) by sourceware.org (Postfix) with ESMTPS id 6291F3861024 for ; Wed, 4 Sep 2024 13:27:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6291F3861024 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6291F3861024 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::32e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456469; cv=none; b=YTdykc+JGHHlcghaIFeptbFcybly77ubMZD8X1mMQcb+4F36WUA00lIStf7hZIEzjnOavCIFMF5J2cw3R5OHATZeoz6Sfi1T+c8xiVfUG90ng7/OOmZGNJlnY7VD1jI0j3yodDYzoakecub/0xjgBggHyFbobRrYh43qrcAq214= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456469; c=relaxed/simple; bh=gF2PJWXDmQIltTIUSOwB7VI4aN3ItMlga59dNkqgUcM=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=Aw19HevSsSIgToureVHCM7SjjtNi2zLrgYImgCoyBjHXWushv/PlNWKnf7We2BxJGbPjMfwGzyVkdjmnVrpC1BPGO1kZIX/7YwZFeumE3ocgtfugl2tUGudIxTzhMj/Yg0okbhKy4re78vir3C2/7SxfC1fU9htk2MUWj4475cE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ot1-x32e.google.com with SMTP id 46e09a7af769-7092dd03223so2182194a34.1 for ; Wed, 04 Sep 2024 06:27:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456466; x=1726061266; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=z9S0brR7p6pDp9epxeJSE8bo0puluAhN5hCweS/Qx28=; b=SJf9wH7dwhZe0ddjyiNVQCGpnXM8L4ctppqqxvVojZpMZD2vbyfq1fmIqUzYgx5A0H t2vPOMPeRzbJ1zK+kesRnaHLIAuHiKXiVmwJGdgKgZiYFHkVaX9aRbXmZLSTWY/OOCaF z42eHRulqZDa0SaWJoR84XRSzRccbmlwuNd42JlQ3zMW0qve4q+ZBDqKPp5i6jrcYdSE ljkxSG7FBV6PRpVyrVHrIU9WNGsDiN5H1DZCKqY361Xwc9gNRUGerauGsYp4/xO5q2dY yd/f4QIEGFx+4FfeU5mnBemq9DokfWeeKvbMA6/oamWCmWqylTijqdT5TD4y5pQa5/KU UL9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456466; x=1726061266; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=z9S0brR7p6pDp9epxeJSE8bo0puluAhN5hCweS/Qx28=; b=xPm1EY5C2JkOUjZaWWLKsM8fYeG/JbMBohY8rMLPM5L8gPj2rFugKkOgPpVLeYkdkG BssbJ/CTFdEUQqjVBDMzKI2XyVgY3+b89UnYFKuw422cmSnkNV+OR2l84a0ORaI/dA47 nZy8UIxbIaq/3Ju/56/vdTLBAecV0lQmjPapb7AhXIWwBSlkKp3GP6Urri5AhywYTZci rlcx8KvkJRwtd2kH60Qkug5/vvcZglmO8oCyABVtVQ/SKXFfz0K871VR6/OoFuJNngQf gR3NU96GCLuDnArRFApPx/9D5vealK5w2q4Qtcx2gee3CEhuXC5PelOfELoRXuiz15el B+UQ== X-Gm-Message-State: AOJu0YztTcw5XB/USjbtT+8C/B46//7K7+qgGPhlZvlEDmG6km5JkhZm WvlBMZj049RwGlm2KbGmljPwa/T6UKXXc+btK7bWfaU93yiw9yR97Av00PtZ+fLimpBK+yzL2N4 4Sg5MrA== X-Google-Smtp-Source: AGHT+IH48C/nO5UZmYVTMCUUN4lSjJB3iunOvhtFb2oUy5FvoEYMnNN/UXnuODs++TBYpNLfHdKb5A== X-Received: by 2002:a05:6830:44a8:b0:709:5890:691d with SMTP id 46e09a7af769-70f706e36b4mr14886810a34.4.1725456466329; Wed, 04 Sep 2024 06:27:46 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:45 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 28/36] arm: [MVE intrinsics] add vshlc shape Date: Wed, 4 Sep 2024 13:26:42 +0000 Message-Id: <20240904132650.2720446-29-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the vshlc shape description. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vshlc): New. * config/arm/arm-mve-builtins-shapes.h (vshlc): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 44 +++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 1 + 2 files changed, 45 insertions(+) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index 510f15ae73a..ee6b5b0a7b1 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -2418,6 +2418,50 @@ struct vpsel_def : public overloaded_base<0> }; SHAPE (vpsel) +/* _t vfoo[_t0](T0, uint32_t* , const int) + + Check that 'imm' is in [1..32]. + + Example: vshlcq. + uint8x16_t [__arm_]vshlcq[_u8](uint8x16_t a, uint32_t *b, const int imm) + uint8x16_t [__arm_]vshlcq_m[_u8](uint8x16_t a, uint32_t *b, const int imm, mve_pred16_t p) */ +struct vshlc_def : public overloaded_base<0> +{ + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + build_all (b, "v0,v0,as,su64", group, MODE_none, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index type; + if (!r.check_gp_argument (3, i, nargs) + || (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + /* Check that arg #2 is a pointer. */ + if (!POINTER_TYPE_P (r.get_argument_type (i - 1))) + return error_mark_node; + + if (!r.require_integer_immediate (i)) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, type); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (2, 1, 32); + } +}; +SHAPE (vshlc) + } /* end namespace arm_mve */ #undef SHAPE diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index b3d08ab3866..d73c74c8ad7 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -85,6 +85,7 @@ namespace arm_mve extern const function_shape *const viddup; extern const function_shape *const vidwdup; extern const function_shape *const vpsel; + extern const function_shape *const vshlc; } /* end namespace arm_mve::shapes */ } /* end namespace arm_mve */ From patchwork Wed Sep 4 13:26:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980824 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=v3IEitGZ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNq424YBz1yXY for ; Wed, 4 Sep 2024 23:38:08 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 607EF3864835 for ; Wed, 4 Sep 2024 13:38:06 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc36.google.com (mail-oo1-xc36.google.com [IPv6:2607:f8b0:4864:20::c36]) by sourceware.org (Postfix) with ESMTPS id 7853F3858428 for ; Wed, 4 Sep 2024 13:27:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7853F3858428 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7853F3858428 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c36 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; cv=none; b=Jwi/aRseTOX/2olivwFcYrYJQFPdQjn/idyGhWN9jIpn9wEybJHCz0Pe/DmbDxmRxywUACH8reblax9q8NnQ6TKD46i69aP2EHNqqxYILxXhz/tk/UiiuWECEBhkPkQPx5M5uxnVOkARVLjoS0Kiqq/2bIlgosMhXaCv6n1GAcA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; c=relaxed/simple; bh=jDl4+zBwgbbCgoifQ9DNbNhc3Lv7Ji9KsUJ/6/RtNsk=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=IIf1SN2i/fuYPIxTxoyoDTY7pjRHtSDdUUcX7fIdC64JA6p2aEy90O/nI8gyitrURKp3MWhN12KoVDXKDUnJW8EKHnWxlX8ZurRkN6VlLGKbeNQtiWukApHNO1bG+gR0tfJNrJGWrgw/IcvyBwJMmvSbYHl0ESxCXu23Ewd3S4A= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc36.google.com with SMTP id 006d021491bc7-5df9a9f7fe2so4084311eaf.2 for ; Wed, 04 Sep 2024 06:27:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456467; x=1726061267; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=JZLzS8rVq+IHeAOSl7bdx9rWFvQ0iYWIYr1jgBMgh3g=; b=v3IEitGZy57PSM8IqIYRQNSmMzcJT1ecGG4yXAllfxAgJYh7cTDkDGNKYFwhXuIDOa dY2JLArbNzYuFzUuWnIf4bdvNf695FRY0LLUzMplsAYTVyLuB8TtY3YdWdA3CoSHGAmy V+/q+gsSkbyy9hpE9b8JEmldA75P9jMNKvNsnsZ4nC/b+BHepvkA3wIupkUrZ7uasVnf bTHyuNlhOeLZGLefNg5BH4WRy0eD/uUhchMrFc1x/gNKLvGfHV5G48h1I+97znA7IAzY GCIAinkVFF+dOgl9GtHpWglbOE2rXBYonxv9NqGnE+QDWmRLGq34oTmr1dwHmDCmGWJ2 oIIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456467; x=1726061267; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JZLzS8rVq+IHeAOSl7bdx9rWFvQ0iYWIYr1jgBMgh3g=; b=SjOWi/k4BX/6I4cPSWp/PVAUJVzSlSV42USZoqnBwTZO0tOISFsksjmec/MyRQCLV1 HPdgTW+KM3AlXcAGrz8/suwluzigooZUTinThPvEp8srEboOw1Y0PQsf+gIz/o/JcALm drNdltVoWENJe1cXBI0PDREmNRP5s6FCoTBoEf2in8hoEKjpJFr/CdiIsoZUN98L+Ha0 eNPi6ESy7lwE3xUWTqH1Eg5wLrwplaXN3tO06IWIny2A5qa531CsOVKWCrzCPv6UUMdd cwY7RhwLiE9NONXoV9yrWgjEN72gi2Ux3g4FVssBlPbhoyCCddfj+N2pHWj0VJPtVyRt PTWg== X-Gm-Message-State: AOJu0YxwsZeQz+GSgx+G5dDF1S41v69Vg4Nejn3H+lzPpFALJOnRDpZT DaQP+Q3gMWW2YjU42zFbzOd6eG8UDTuUKh2GuwLQnmkNPzH8lbG1tuRZaM2hS+xheTMNxN7FsfG BDKxVdw== X-Google-Smtp-Source: AGHT+IH8TjZSqL8vTBgZUyt3lM7WptSHIGdm/PiHIjUL0ON6XqfT22XGIsjRpTkzOwHv2q11hHX8ew== X-Received: by 2002:a05:6820:1ca4:b0:5da:a26b:ce6e with SMTP id 006d021491bc7-5dfacf20303mr17442110eaf.3.1725456467387; Wed, 04 Sep 2024 06:27:47 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:46 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 29/36] arm: [MVE intrinsics] rework vshlcq Date: Wed, 4 Sep 2024 13:26:43 +0000 Message-Id: <20240904132650.2720446-30-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vshlc using the new MVE builtins framework. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vshlc_impl): New. (vshlc): New. * config/arm/arm-mve-builtins-base.def (vshlcq): New. * config/arm/arm-mve-builtins-base.h (vshlcq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vshlc. * config/arm/arm_mve.h (vshlcq): Delete. (vshlcq_m): Delete. (vshlcq_s8): Delete. (vshlcq_u8): Delete. (vshlcq_s16): Delete. (vshlcq_u16): Delete. (vshlcq_s32): Delete. (vshlcq_u32): Delete. (vshlcq_m_s8): Delete. (vshlcq_m_u8): Delete. (vshlcq_m_s16): Delete. (vshlcq_m_u16): Delete. (vshlcq_m_s32): Delete. (vshlcq_m_u32): Delete. (__arm_vshlcq_s8): Delete. (__arm_vshlcq_u8): Delete. (__arm_vshlcq_s16): Delete. (__arm_vshlcq_u16): Delete. (__arm_vshlcq_s32): Delete. (__arm_vshlcq_u32): Delete. (__arm_vshlcq_m_s8): Delete. (__arm_vshlcq_m_u8): Delete. (__arm_vshlcq_m_s16): Delete. (__arm_vshlcq_m_u16): Delete. (__arm_vshlcq_m_s32): Delete. (__arm_vshlcq_m_u32): Delete. (__arm_vshlcq): Delete. (__arm_vshlcq_m): Delete. * config/arm/mve.md (mve_vshlcq_): Add '@' prefix. (mve_vshlcq_m_): Likewise. --- gcc/config/arm/arm-mve-builtins-base.cc | 72 +++++++ gcc/config/arm/arm-mve-builtins-base.def | 1 + gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm-mve-builtins.cc | 1 + gcc/config/arm/arm_mve.h | 233 ----------------------- gcc/config/arm/mve.md | 4 +- 6 files changed, 77 insertions(+), 235 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index eaf054d9823..9f1f7e69c57 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -483,6 +483,77 @@ public: } }; +/* Map the vshlc function directly to CODE (UNSPEC, M) where M is the vector + mode associated with type suffix 0. We need this special case because the + intrinsics derefrence the second parameter and update its contents. */ +class vshlc_impl : public function_base +{ +public: + unsigned int + call_properties (const function_instance &) const override + { + return CP_WRITE_MEMORY | CP_READ_MEMORY; + } + + tree + memory_scalar_type (const function_instance &) const override + { + return get_typenode_from_name (UINT32_TYPE); + } + + rtx + expand (function_expander &e) const override + { + machine_mode mode = e.vector_mode (0); + insn_code code; + rtx insns, carry_ptr, carry, new_carry; + int carry_arg_no; + + if (! e.type_suffix (0).integer_p) + gcc_unreachable (); + + if (e.mode_suffix_id != MODE_none) + gcc_unreachable (); + + carry_arg_no = 1; + + carry = gen_reg_rtx (SImode); + carry_ptr = e.args[carry_arg_no]; + emit_insn (gen_rtx_SET (carry, gen_rtx_MEM (SImode, carry_ptr))); + e.args[carry_arg_no] = carry; + + new_carry = gen_reg_rtx (SImode); + e.args.quick_insert (0, new_carry); + + switch (e.pred) + { + case PRED_none: + /* No predicate. */ + code = e.type_suffix (0).unsigned_p + ? code_for_mve_vshlcq (VSHLCQ_U, mode) + : code_for_mve_vshlcq (VSHLCQ_S, mode); + insns = e.use_exact_insn (code); + break; + + case PRED_m: + /* "m" predicate. */ + code = e.type_suffix (0).unsigned_p + ? code_for_mve_vshlcq_m (VSHLCQ_M_U, mode) + : code_for_mve_vshlcq_m (VSHLCQ_M_S, mode); + insns = e.use_cond_insn (code, 0); + break; + + default: + gcc_unreachable (); + } + + /* Update carry. */ + emit_insn (gen_rtx_SET (gen_rtx_MEM (Pmode, carry_ptr), new_carry)); + + return insns; + } +}; + } /* end anonymous namespace */ namespace arm_mve { @@ -815,6 +886,7 @@ FUNCTION_WITH_M_N_NO_F (vrshlq, VRSHLQ) FUNCTION_ONLY_N_NO_F (vrshrnbq, VRSHRNBQ) FUNCTION_ONLY_N_NO_F (vrshrntq, VRSHRNTQ) FUNCTION_ONLY_N_NO_F (vrshrq, VRSHRQ) +FUNCTION (vshlcq, vshlc_impl,) FUNCTION_ONLY_N_NO_F (vshllbq, VSHLLBQ) FUNCTION_ONLY_N_NO_F (vshlltq, VSHLLTQ) FUNCTION_WITH_M_N_R (vshlq, VSHLQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index c5f1e8a197b..bd69f06d7e4 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -152,6 +152,7 @@ DEF_MVE_FUNCTION (vrshlq, binary_round_lshift, all_integer, mx_or_none) DEF_MVE_FUNCTION (vrshrnbq, binary_rshift_narrow, integer_16_32, m_or_none) DEF_MVE_FUNCTION (vrshrntq, binary_rshift_narrow, integer_16_32, m_or_none) DEF_MVE_FUNCTION (vrshrq, binary_rshift, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vshlcq, vshlc, all_integer, m_or_none) DEF_MVE_FUNCTION (vshllbq, binary_widen_n, integer_8_16, mx_or_none) DEF_MVE_FUNCTION (vshlltq, binary_widen_n, integer_8_16, mx_or_none) DEF_MVE_FUNCTION (vshlq, binary_lshift, all_integer, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index ed8761318bb..1eff50d3c6d 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -188,6 +188,7 @@ extern const function_base *const vrshlq; extern const function_base *const vrshrnbq; extern const function_base *const vrshrntq; extern const function_base *const vrshrq; +extern const function_base *const vshlcq; extern const function_base *const vshllbq; extern const function_base *const vshlltq; extern const function_base *const vshlq; diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 1180421bf0a..252744596ce 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -810,6 +810,7 @@ function_instance::has_inactive_argument () const || (base == functions::vrshlq && mode_suffix_id == MODE_n) || base == functions::vrshrnbq || base == functions::vrshrntq + || base == functions::vshlcq || base == functions::vshrnbq || base == functions::vshrntq || base == functions::vsliq diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 37b0fedc4ff..c577c373e98 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -42,7 +42,6 @@ #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE #define vst4q(__addr, __value) __arm_vst4q(__addr, __value) -#define vshlcq(__a, __b, __imm) __arm_vshlcq(__a, __b, __imm) #define vstrbq_scatter_offset(__base, __offset, __value) __arm_vstrbq_scatter_offset(__base, __offset, __value) #define vstrbq(__addr, __value) __arm_vstrbq(__addr, __value) #define vstrwq_scatter_base(__addr, __offset, __value) __arm_vstrwq_scatter_base(__addr, __offset, __value) @@ -101,7 +100,6 @@ #define vld4q(__addr) __arm_vld4q(__addr) #define vsetq_lane(__a, __b, __idx) __arm_vsetq_lane(__a, __b, __idx) #define vgetq_lane(__a, __idx) __arm_vgetq_lane(__a, __idx) -#define vshlcq_m(__a, __b, __imm, __p) __arm_vshlcq_m(__a, __b, __imm, __p) #define vst4q_s8( __addr, __value) __arm_vst4q_s8( __addr, __value) @@ -113,12 +111,6 @@ #define vst4q_f16( __addr, __value) __arm_vst4q_f16( __addr, __value) #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value) #define vpnot(__a) __arm_vpnot(__a) -#define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm) -#define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm) -#define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm) -#define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) -#define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) -#define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) #define vstrbq_s8( __addr, __value) __arm_vstrbq_s8( __addr, __value) #define vstrbq_u8( __addr, __value) __arm_vstrbq_u8( __addr, __value) #define vstrbq_u16( __addr, __value) __arm_vstrbq_u16( __addr, __value) @@ -421,12 +413,6 @@ #define urshrl(__p0, __p1) __arm_urshrl(__p0, __p1) #define lsll(__p0, __p1) __arm_lsll(__p0, __p1) #define asrl(__p0, __p1) __arm_asrl(__p0, __p1) -#define vshlcq_m_s8(__a, __b, __imm, __p) __arm_vshlcq_m_s8(__a, __b, __imm, __p) -#define vshlcq_m_u8(__a, __b, __imm, __p) __arm_vshlcq_m_u8(__a, __b, __imm, __p) -#define vshlcq_m_s16(__a, __b, __imm, __p) __arm_vshlcq_m_s16(__a, __b, __imm, __p) -#define vshlcq_m_u16(__a, __b, __imm, __p) __arm_vshlcq_m_u16(__a, __b, __imm, __p) -#define vshlcq_m_s32(__a, __b, __imm, __p) __arm_vshlcq_m_s32(__a, __b, __imm, __p) -#define vshlcq_m_u32(__a, __b, __imm, __p) __arm_vshlcq_m_u32(__a, __b, __imm, __p) #endif /* For big-endian, GCC's vector indices are reversed within each 64 bits @@ -502,60 +488,6 @@ __arm_vpnot (mve_pred16_t __a) return __builtin_mve_vpnotv16bi (__a); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_s8 (int8x16_t __a, uint32_t * __b, const int __imm) -{ - int8x16_t __res = __builtin_mve_vshlcq_vec_sv16qi (__a, *__b, __imm); - *__b = __builtin_mve_vshlcq_carry_sv16qi (__a, *__b, __imm); - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_u8 (uint8x16_t __a, uint32_t * __b, const int __imm) -{ - uint8x16_t __res = __builtin_mve_vshlcq_vec_uv16qi (__a, *__b, __imm); - *__b = __builtin_mve_vshlcq_carry_uv16qi (__a, *__b, __imm); - return __res; -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_s16 (int16x8_t __a, uint32_t * __b, const int __imm) -{ - int16x8_t __res = __builtin_mve_vshlcq_vec_sv8hi (__a, *__b, __imm); - *__b = __builtin_mve_vshlcq_carry_sv8hi (__a, *__b, __imm); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_u16 (uint16x8_t __a, uint32_t * __b, const int __imm) -{ - uint16x8_t __res = __builtin_mve_vshlcq_vec_uv8hi (__a, *__b, __imm); - *__b = __builtin_mve_vshlcq_carry_uv8hi (__a, *__b, __imm); - return __res; -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_s32 (int32x4_t __a, uint32_t * __b, const int __imm) -{ - int32x4_t __res = __builtin_mve_vshlcq_vec_sv4si (__a, *__b, __imm); - *__b = __builtin_mve_vshlcq_carry_sv4si (__a, *__b, __imm); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_u32 (uint32x4_t __a, uint32_t * __b, const int __imm) -{ - uint32x4_t __res = __builtin_mve_vshlcq_vec_uv4si (__a, *__b, __imm); - *__b = __builtin_mve_vshlcq_carry_uv4si (__a, *__b, __imm); - return __res; -} - __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vstrbq_scatter_offset_s8 (int8_t * __base, uint8x16_t __offset, int8x16_t __value) @@ -2404,60 +2336,6 @@ __arm_srshr (int32_t value, const int shift) return __builtin_mve_srshr_si (value, shift); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m_s8 (int8x16_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - int8x16_t __res = __builtin_mve_vshlcq_m_vec_sv16qi (__a, *__b, __imm, __p); - *__b = __builtin_mve_vshlcq_m_carry_sv16qi (__a, *__b, __imm, __p); - return __res; -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m_u8 (uint8x16_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - uint8x16_t __res = __builtin_mve_vshlcq_m_vec_uv16qi (__a, *__b, __imm, __p); - *__b = __builtin_mve_vshlcq_m_carry_uv16qi (__a, *__b, __imm, __p); - return __res; -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m_s16 (int16x8_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - int16x8_t __res = __builtin_mve_vshlcq_m_vec_sv8hi (__a, *__b, __imm, __p); - *__b = __builtin_mve_vshlcq_m_carry_sv8hi (__a, *__b, __imm, __p); - return __res; -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m_u16 (uint16x8_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - uint16x8_t __res = __builtin_mve_vshlcq_m_vec_uv8hi (__a, *__b, __imm, __p); - *__b = __builtin_mve_vshlcq_m_carry_uv8hi (__a, *__b, __imm, __p); - return __res; -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m_s32 (int32x4_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - int32x4_t __res = __builtin_mve_vshlcq_m_vec_sv4si (__a, *__b, __imm, __p); - *__b = __builtin_mve_vshlcq_m_carry_sv4si (__a, *__b, __imm, __p); - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m_u32 (uint32x4_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - uint32x4_t __res = __builtin_mve_vshlcq_m_vec_uv4si (__a, *__b, __imm, __p); - *__b = __builtin_mve_vshlcq_m_carry_uv4si (__a, *__b, __imm, __p); - return __res; -} - #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */ __extension__ extern __inline void @@ -2868,48 +2746,6 @@ __arm_vst4q (uint32_t * __addr, uint32x4x4_t __value) __arm_vst4q_u32 (__addr, __value); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq (int8x16_t __a, uint32_t * __b, const int __imm) -{ - return __arm_vshlcq_s8 (__a, __b, __imm); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq (uint8x16_t __a, uint32_t * __b, const int __imm) -{ - return __arm_vshlcq_u8 (__a, __b, __imm); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq (int16x8_t __a, uint32_t * __b, const int __imm) -{ - return __arm_vshlcq_s16 (__a, __b, __imm); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq (uint16x8_t __a, uint32_t * __b, const int __imm) -{ - return __arm_vshlcq_u16 (__a, __b, __imm); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq (int32x4_t __a, uint32_t * __b, const int __imm) -{ - return __arm_vshlcq_s32 (__a, __b, __imm); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq (uint32x4_t __a, uint32_t * __b, const int __imm) -{ - return __arm_vshlcq_u32 (__a, __b, __imm); -} - __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vstrbq_scatter_offset (int8_t * __base, uint8x16_t __offset, int8x16_t __value) @@ -4240,48 +4076,6 @@ __arm_vgetq_lane (uint64x2_t __a, const int __idx) return __arm_vgetq_lane_u64 (__a, __idx); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m (int8x16_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vshlcq_m_s8 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m (uint8x16_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vshlcq_m_u8 (__a, __b, __imm, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m (int16x8_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vshlcq_m_s16 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m (uint16x8_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vshlcq_m_u16 (__a, __b, __imm, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m (int32x4_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vshlcq_m_s32 (__a, __b, __imm, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vshlcq_m (uint32x4_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) -{ - return __arm_vshlcq_m_u32 (__a, __b, __imm, __p); -} - #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */ __extension__ extern __inline void @@ -4887,15 +4681,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16_t_ptr][__ARM_mve_type_float16x8x4_t]: __arm_vst4q_f16 (__ARM_mve_coerce_f16_ptr(__p0, float16_t *), __ARM_mve_coerce(__p1, float16x8x4_t)), \ int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4x4_t]: __arm_vst4q_f32 (__ARM_mve_coerce_f32_ptr(__p0, float32_t *), __ARM_mve_coerce(__p1, float32x4x4_t)));}) -#define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlcq_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \ - int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlcq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) - #define __arm_vld1q_z(p0,p1) ( \ _Generic( (int (*)[__ARM_mve_typeid(p0)])0, \ int (*)[__ARM_mve_type_int8_t_ptr]: __arm_vld1q_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), p1), \ @@ -5234,15 +5019,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16_t_ptr][__ARM_mve_type_uint16x8x4_t]: __arm_vst4q_u16 (__ARM_mve_coerce_u16_ptr(p0, uint16_t *), __ARM_mve_coerce(__p1, uint16x8x4_t)), \ int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4x4_t]: __arm_vst4q_u32 (__ARM_mve_coerce_u32_ptr(p0, uint32_t *), __ARM_mve_coerce(__p1, uint32x4x4_t)));}) -#define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlcq_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \ - int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlcq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) - #define __arm_vstrwq_scatter_base(p0,p1,p2) ({ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p2)])0, \ int (*)[__ARM_mve_type_int32x4_t]: __arm_vstrwq_scatter_base_s32(p0, p1, __ARM_mve_coerce(__p2, int32x4_t)), \ @@ -5615,15 +5391,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint16x8_t]: __arm_vldrbq_gather_offset_u16(__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint32x4_t]: __arm_vldrbq_gather_offset_u32(__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint32x4_t)));}) -#define __arm_vshlcq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2, p3), \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vshlcq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1, p2, p3), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vshlcq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1, p2, p3), \ - int (*)[__ARM_mve_type_uint8x16_t]: __arm_vshlcq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), p1, p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2, p3));}) - #define __arm_vstrbq(p0,p1) ({ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p1)])0, \ int (*)[__ARM_mve_type_int8_t_ptr][__ARM_mve_type_int8x16_t]: __arm_vstrbq_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), __ARM_mve_coerce(__p1, int8x16_t)), \ diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 0507e117f51..83a1eb48533 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -1719,7 +1719,7 @@ (define_expand "mve_vshlcq_carry_" DONE; }) -(define_insn "mve_vshlcq_" +(define_insn "@mve_vshlcq_" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "0") (match_operand:SI 3 "s_register_operand" "1") @@ -6279,7 +6279,7 @@ (define_expand "mve_vshlcq_m_carry_" DONE; }) -(define_insn "mve_vshlcq_m_" +(define_insn "@mve_vshlcq_m_" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "0") (match_operand:SI 3 "s_register_operand" "1") From patchwork Wed Sep 4 13:26:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980825 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=oYaautHM; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNqy3CYJz1yXY for ; Wed, 4 Sep 2024 23:38:54 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 44896385020D for ; Wed, 4 Sep 2024 13:38:52 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2b.google.com (mail-oo1-xc2b.google.com [IPv6:2607:f8b0:4864:20::c2b]) by sourceware.org (Postfix) with ESMTPS id 7B218385B508 for ; Wed, 4 Sep 2024 13:27:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7B218385B508 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7B218385B508 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; cv=none; b=Omfkv7drW71rTWsb1xN2B9ghPPg27dSrkJ1ITqN9y/DVp/G3UrmZ60yWEu8vuPcCkkfM2xVCIH/U3rWrOVJyPQw4is/d7Q5cnP8/ETuGTGox9mmvCnQbDnrnC7X0LpZ+B/aIDF6k1384oj+s6ABaNcnKkiFzjsU1if21rppdvqI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; c=relaxed/simple; bh=HdQlgu/qxSGS7sko2By8VyhDpaOlAz2FYPoWbotC9Mg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=h0sZKZIgHDVTCjTRsyUex5A75jM9TG4Kkg0APx1QvCAdYO6cD7wkfva80KJVivgRq3H9vAd7/7Ao3tFVS1aQV+0Ea1AUrWSP17p37e3UYB6op6PjsxzQs4fxUU+Du6Z7ObHh72Y2hON39LNBfFtuL91H0XYHc9Ri9CY8ZACF+Dk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2b.google.com with SMTP id 006d021491bc7-5e017808428so2245191eaf.1 for ; Wed, 04 Sep 2024 06:27:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456468; x=1726061268; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=egFHNrqiNVVJPy/voPdBlY0wFuD86ERJYIyYLNoUVlg=; b=oYaautHM/BD2Fhr3U5fhKba61noQsj0bs4aXQUT6O2cfhsTXbzjPlvPCoMRIJh0zAt z/ZB3kAH0WvVSr/D0P301gv/rJg7UQiSPeQED1r8uDWDzIqSXK/LLgUjtxmoYqq7eqeB 62msN4aFGewoQ/Xm5vuI5BuaflRHJYPKf6423JNbSExPdGgvcyH5FRUpM69pPsVWc1pM DXl1Uww5pUn55ix1x6+nJKbAXUY7o+H2UNHQlisasiCTy2mCqjh6GIseIYe/hP2ZjqPd A3NZuq4o/TBDT7wT0pufURbo/sH2wUaTvujQFRk2PGZgJYKApCSCwfz36ophf4WzEYGK /akQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456468; x=1726061268; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=egFHNrqiNVVJPy/voPdBlY0wFuD86ERJYIyYLNoUVlg=; b=X2dweEoGLFe6xCeuk/+1b9uV2EsuKpLTfYFYSD8CrfU2bEjTcDZPPyV7zMF5kl3Ix1 Za29+yeUntxkuMAhMYfKzEfpWRS9wBCReZH9Rfxi3kzdjEOFpubEoUBev+ttblslPUsU iT4yuj3xsiesx3EpSe2QQX2gHU57yHLRWl4vi21rhEx6Bi+hpE2T8r65zx0/QCfCb+g7 dsl1PTmVwZhPMDC/7guWyJ0x9yp/juq8xFWeLV4SYO4GcIrYM60cV7Opryk2E9aF5zjE 4tWTkRa6YrtL4XteaW64K79G7+UvcavXKTdB8eLi17mZdr6H1RCV9ceBhXsTK8Z2coZT feyw== X-Gm-Message-State: AOJu0YwBW9MtuKhn7D9PHDyxYRgruITX4L7SEGD1dv74ZmWNgRCmOK6Y GiVKFAXw3X9SCnFN0/iHiiV1J2htolM1/UvruA8jk7GseA/+sBg4TgnREtVqrY0qTucNfKpg7Aq QPiKQAw== X-Google-Smtp-Source: AGHT+IH8yocTEDUXsG35A2enx9MJDA28sRD5n3FdA6sGMXyJNZu0obxhIrmHPfVXwN243fe/FQM5/Q== X-Received: by 2002:a05:6820:2013:b0:5dc:f89e:d77f with SMTP id 006d021491bc7-5dfad00acebmr20022887eaf.7.1725456468509; Wed, 04 Sep 2024 06:27:48 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:47 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 30/36] arm: [MVE intrinsics] remove vshlcq useless expanders Date: Wed, 4 Sep 2024 13:26:44 +0000 Message-Id: <20240904132650.2720446-31-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Since we rewrote the implementation of vshlcq intrinsics, we no longer need these expanders. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-builtins.cc (arm_ternop_unone_none_unone_imm_qualifiers) (-arm_ternop_none_none_unone_imm_qualifiers): Delete. * config/arm/arm_mve_builtins.def (vshlcq_m_vec_s) (vshlcq_m_carry_s, vshlcq_m_vec_u, vshlcq_m_carry_u): Delete. * config/arm/mve.md (mve_vshlcq_vec_): Delete. (mve_vshlcq_carry_): Delete. (mve_vshlcq_m_vec_): Delete. (mve_vshlcq_m_carry_): Delete. --- gcc/config/arm/arm-builtins.cc | 13 ------- gcc/config/arm/arm_mve_builtins.def | 8 ---- gcc/config/arm/mve.md | 60 ----------------------------- 3 files changed, 81 deletions(-) diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc index 697b91911dd..621fffec6d3 100644 --- a/gcc/config/arm/arm-builtins.cc +++ b/gcc/config/arm/arm-builtins.cc @@ -476,19 +476,6 @@ arm_ternop_unone_unone_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS \ (arm_ternop_unone_unone_none_none_qualifiers) -static enum arm_type_qualifiers -arm_ternop_unone_none_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] - = { qualifier_unsigned, qualifier_none, qualifier_unsigned, - qualifier_immediate }; -#define TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS \ - (arm_ternop_unone_none_unone_imm_qualifiers) - -static enum arm_type_qualifiers -arm_ternop_none_none_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] - = { qualifier_none, qualifier_none, qualifier_unsigned, qualifier_immediate }; -#define TERNOP_NONE_NONE_UNONE_IMM_QUALIFIERS \ - (arm_ternop_none_none_unone_imm_qualifiers) - static enum arm_type_qualifiers arm_ternop_unone_unone_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_unsigned, qualifier_unsigned, qualifier_none, diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index f6962cd8cf5..9cce644858d 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -288,15 +288,11 @@ VAR1 (TERNOP_UNONE_UNONE_UNONE_UNONE, vrmlaldavhaq_u, v4si) VAR2 (TERNOP_NONE_NONE_UNONE_PRED, vcvtq_m_to_f_u, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_PRED, vcvtq_m_to_f_s, v8hf, v4sf) VAR2 (TERNOP_PRED_NONE_NONE_PRED, vcmpeqq_m_f, v8hf, v4sf) -VAR3 (TERNOP_UNONE_NONE_UNONE_IMM, vshlcq_carry_s, v16qi, v8hi, v4si) -VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vshlcq_carry_u, v16qi, v8hi, v4si) VAR2 (TERNOP_UNONE_UNONE_NONE_IMM, vqrshrunbq_n_s, v8hi, v4si) VAR3 (TERNOP_UNONE_UNONE_NONE_NONE, vabavq_s, v16qi, v8hi, v4si) VAR3 (TERNOP_UNONE_UNONE_UNONE_UNONE, vabavq_u, v16qi, v8hi, v4si) VAR2 (TERNOP_UNONE_UNONE_NONE_PRED, vcvtaq_m_u, v8hi, v4si) VAR2 (TERNOP_NONE_NONE_NONE_PRED, vcvtaq_m_s, v8hi, v4si) -VAR3 (TERNOP_UNONE_UNONE_UNONE_IMM, vshlcq_vec_u, v16qi, v8hi, v4si) -VAR3 (TERNOP_NONE_NONE_UNONE_IMM, vshlcq_vec_s, v16qi, v8hi, v4si) VAR4 (TERNOP_UNONE_UNONE_UNONE_PRED, vpselq_u, v16qi, v8hi, v4si, v2di) VAR4 (TERNOP_NONE_NONE_NONE_PRED, vpselq_s, v16qi, v8hi, v4si, v2di) VAR3 (TERNOP_UNONE_UNONE_UNONE_PRED, vrev64q_m_u, v16qi, v8hi, v4si) @@ -862,7 +858,3 @@ VAR1 (UQSHL, urshr_, si) VAR1 (UQSHL, urshrl_, di) VAR1 (UQSHL, uqshl_, si) VAR1 (UQSHL, uqshll_, di) -VAR3 (QUADOP_NONE_NONE_UNONE_IMM_PRED, vshlcq_m_vec_s, v16qi, v8hi, v4si) -VAR3 (QUADOP_NONE_NONE_UNONE_IMM_PRED, vshlcq_m_carry_s, v16qi, v8hi, v4si) -VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi, v8hi, v4si) -VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u, v16qi, v8hi, v4si) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 83a1eb48533..eb603b3d9a7 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -1691,34 +1691,6 @@ (define_insn "@mve_q_" ;; ;; [vshlcq_u vshlcq_s] ;; -(define_expand "mve_vshlcq_vec_" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_32") - (unspec:MVE_2 [(const_int 0)] VSHLCQ)] - "TARGET_HAVE_MVE" -{ - rtx ignore_wb = gen_reg_rtx (SImode); - emit_insn(gen_mve_vshlcq_(operands[0], ignore_wb, operands[1], - operands[2], operands[3])); - DONE; -}) - -(define_expand "mve_vshlcq_carry_" - [(match_operand:SI 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_32") - (unspec:MVE_2 [(const_int 0)] VSHLCQ)] - "TARGET_HAVE_MVE" -{ - rtx ignore_vec = gen_reg_rtx (mode); - emit_insn(gen_mve_vshlcq_(ignore_vec, operands[0], operands[1], - operands[2], operands[3])); - DONE; -}) - (define_insn "@mve_vshlcq_" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "0") @@ -6247,38 +6219,6 @@ (define_insn "mve_sqshll_di" ;; ;; [vshlcq_m_u vshlcq_m_s] ;; -(define_expand "mve_vshlcq_m_vec_" - [(match_operand:MVE_2 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_32") - (match_operand: 4 "vpr_register_operand") - (unspec:MVE_2 [(const_int 0)] VSHLCQ_M)] - "TARGET_HAVE_MVE" -{ - rtx ignore_wb = gen_reg_rtx (SImode); - emit_insn (gen_mve_vshlcq_m_ (operands[0], ignore_wb, operands[1], - operands[2], operands[3], - operands[4])); - DONE; -}) - -(define_expand "mve_vshlcq_m_carry_" - [(match_operand:SI 0 "s_register_operand") - (match_operand:MVE_2 1 "s_register_operand") - (match_operand:SI 2 "s_register_operand") - (match_operand:SI 3 "mve_imm_32") - (match_operand: 4 "vpr_register_operand") - (unspec:MVE_2 [(const_int 0)] VSHLCQ_M)] - "TARGET_HAVE_MVE" -{ - rtx ignore_vec = gen_reg_rtx (mode); - emit_insn (gen_mve_vshlcq_m_ (ignore_vec, operands[0], - operands[1], operands[2], - operands[3], operands[4])); - DONE; -}) - (define_insn "@mve_vshlcq_m_" [(set (match_operand:MVE_2 0 "s_register_operand" "=w") (unspec:MVE_2 [(match_operand:MVE_2 2 "s_register_operand" "0") From patchwork Wed Sep 4 13:26:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980817 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=mfBRJjXy; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNlW4Xvrz1yg3 for ; Wed, 4 Sep 2024 23:35:03 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 784FD384979B for ; Wed, 4 Sep 2024 13:35:01 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc36.google.com (mail-oo1-xc36.google.com [IPv6:2607:f8b0:4864:20::c36]) by sourceware.org (Postfix) with ESMTPS id B40CC3861029 for ; Wed, 4 Sep 2024 13:27:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B40CC3861029 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B40CC3861029 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c36 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; cv=none; b=KkYOEHQlaRWlUy3CHTwDc/OxsFZVxQiaMqNVC8ushWTcpxWbVo17KQn68r0pEAim87EB+U/uU880GzFyzjWdvOEt/vhRMQ4A4T877uVOcieScE1ikbRYsZ9BGvIODOGUNhkxPzHszNI6j7pDLFQPtNhrqyUNM0ALRtKazQk2Bhw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456473; c=relaxed/simple; bh=Nul9bwulVxdef6mKRjndZA6/BI9oTZWbqiqjdm1hBnY=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=IW8QUIwFd9WGB945YGb7E8SLFY0XkI8jM4BUE12MctLsrWNH04UE6qHj/xbuWVRGYBFzrZgFDHiH/AqDlcRZtij/c2w/DWea8HqZ75zpSbV+7Bif9oBKsUIUh+YnaszMQgX9vwJBe80cgGS2NkS57coJFrkWxl1GzmZHOE4CSkk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc36.google.com with SMTP id 006d021491bc7-5dca990cf58so4098813eaf.1 for ; Wed, 04 Sep 2024 06:27:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456469; x=1726061269; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KPOXW7YkAnZWNw+5R2LFSJp93Z6RpFAgiMCYqGU/fwg=; b=mfBRJjXyoh+Sp4RHUMt+5bfJJFYB2O+f8dUNHeQa1IPO+vTP9kiIk8IaUWGBqhOssk GDz2VZQHSZ4aizBbAOU6pNKiaH2uWchUZupiwcAcZ9JIzPV4devc9fB4bsU0pRTfSkIl CdZ4RUDP090NoDyrYbYnEMgJEKvt72DhS4xFWDiKHVeJbNEXsizWisYNoyTJl3Ndi70t Qgr+H03eDwTP6WwJfqv4peVkJQI0mAKrjVrBcPh8dheTzEz1iy+Sqb2wgos7GsXqw0vJ WxY2Sav/GgAxXthZ80Ij/h7sbeG7/PWtmvCh7d/6gu0ltSS9pc/on/yqdcwanxyRu+y2 Md5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456469; x=1726061269; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KPOXW7YkAnZWNw+5R2LFSJp93Z6RpFAgiMCYqGU/fwg=; b=J5vvR3qx45PmBQ7HEfgoZRZLgrlUwKpw3Q4y73MdjrpGDkrPPorAuQdjaI2kT8oibc FKb4ZJTM4qN0CEiIvLouj1LYNACXgqnvsPpl9mFmRuxCSWb0ESw41CZtfrJFwH0tP5TO gfsxERcRmwnlWqoLnDbYpECHM9OTh4u6iZkkWYQ++iPNnpoXxUsrlAlH07tnPoO01DDe oEIV1lNcFu0OxZa1J3KBADV+wz3yoalVKxFOKpCig+fi6PV/Pind7bMeqpMPevFV0sXf pu8rdG4gnKnsjzyF1/FR5SFzHxB4WN+NaACsUrBa0PKNBn+v1MRj/2pdkggKKbny8hjv wklQ== X-Gm-Message-State: AOJu0YwxUW/kiQQQytb9akWmOJ4REIbEvh88zkc8QBnMSRqJAGJEQuac r+mCU2dUiiKltfYdHXppx6cHlQCnO5Fu34XnXqGBjMZiiSBgwnLprHqn7SjRvMdsj5SsxVtmaKt fDyu3Eg== X-Google-Smtp-Source: AGHT+IGY5bdU3oJXJeZgr4pnfGXt+9ROJ8Qj3kgMZN5UiOAmmPPXDOAbnhKQaWyxwkFlvVpWHU128Q== X-Received: by 2002:a05:6820:2013:b0:5df:8262:bce4 with SMTP id 006d021491bc7-5dfacf8fedamr21639836eaf.7.1725456469594; Wed, 04 Sep 2024 06:27:49 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:49 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 31/36] arm: [MVE intrinsics] add vadc_vsbc shape Date: Wed, 4 Sep 2024 13:26:45 +0000 Message-Id: <20240904132650.2720446-32-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch adds the vadc_vsbc shape description. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vadc_vsbc): New. * config/arm/arm-mve-builtins-shapes.h (vadc_vsbc): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 36 +++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 1 + 2 files changed, 37 insertions(+) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index ee6b5b0a7b1..9deed178966 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -1996,6 +1996,42 @@ struct unary_widen_acc_def : public overloaded_base<0> }; SHAPE (unary_widen_acc) +/* _t vfoo[_t0](T0, T0, uint32_t*) + + Example: vadcq. + int32x4_t [__arm_]vadcq[_s32](int32x4_t a, int32x4_t b, unsigned *carry) + int32x4_t [__arm_]vadcq_m[_s32](int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned *carry, mve_pred16_t p) */ +struct vadc_vsbc_def : public overloaded_base<0> +{ + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + build_all (b, "v0,v0,v0,as", group, MODE_none, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index type; + if (!r.check_gp_argument (3, i, nargs) + || (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + if (!r.require_matching_vector_type (1, type)) + return error_mark_node; + + /* Check that last arg is a pointer. */ + if (!POINTER_TYPE_P (r.get_argument_type (i))) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, type); + } +}; +SHAPE (vadc_vsbc) + /* mve_pred16_t foo_t0(uint32_t) Example: vctp16q. diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index d73c74c8ad7..e53381d8f36 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -77,6 +77,7 @@ namespace arm_mve extern const function_shape *const unary_n; extern const function_shape *const unary_widen; extern const function_shape *const unary_widen_acc; + extern const function_shape *const vadc_vsbc; extern const function_shape *const vctp; extern const function_shape *const vcvt; extern const function_shape *const vcvt_f16_f32; From patchwork Wed Sep 4 13:26:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980827 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=zMA++r1P; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNsN2y6fz1yg3 for ; Wed, 4 Sep 2024 23:40:08 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 190483861037 for ; Wed, 4 Sep 2024 13:40:06 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by sourceware.org (Postfix) with ESMTPS id 62A61384F4B8 for ; Wed, 4 Sep 2024 13:27:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 62A61384F4B8 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 62A61384F4B8 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456477; cv=none; b=nPy8e3FZ61TIlcDmKHyYWU/R+tyQtbJ8MUtFxif4OFtLneXwTvuhQGQh+Hp//OUvVzbs903N8o1vg2iodw/5ZvNDy0NIKMQPSX/AibltpX2sGjPSXpAi9twSsPBkc7gu980nmg9Z9Zv4ePSUsNJfpPO7AF3/eYqjesWqAiWlFTs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456477; c=relaxed/simple; bh=npqcIQgnDf2rJz7IYw9hkce0tGcNf0pJ6aYRTKX1JxM=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=BMh06wTaKJX+3A19iJNsKzOphqaWM4ggxyr+PkONOVdemuhZ6QsLuJW1heLfBG/RE//GuYTfTkxLTV36x54zcW8sCKVprnfPNvngNfhf5NWdqBMQ5wKhd71luc7r1CXcU+60ZBZc3c1DGQ2Tjpuphv6xdfDzMX+qP8VOjIsKFrU= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2c.google.com with SMTP id 006d021491bc7-5d5f24d9df8so3734852eaf.2 for ; Wed, 04 Sep 2024 06:27:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456470; x=1726061270; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=foYkw3mOcaEaw1RGlOq5OF5JpaC+7G6FcEJCZuAm3gc=; b=zMA++r1PFHPp0fSm8kjG66h83A9pJ3aco2bX+hsJvj7VHGL/kBHy1w0y+6eq5pNCHn S6FTGvML1FS2iDzux/cOpZT/o3cdLLNnjvz0kReGKQFnAF+9nP4EmavV2nx/cYVg3w6q WNHu55SeC27PRv7K0zg1PoElAfOmW6wtFdDBpMZ2njFh23lswveCWwla3IvL9YVS8Ngo ZCdGc6lpJWv4jhBhemKrjHrWQP6F3sHcphZ9wUUS6NiNaYS3p4/M1CUOLF4yMYL0Uxwl satmotZaalVddmmR3GCBLuDJcV59/P6IK8Wo8FG+qNEPfuJFAjxikJR5SLRF7zYY/L45 etUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456470; x=1726061270; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=foYkw3mOcaEaw1RGlOq5OF5JpaC+7G6FcEJCZuAm3gc=; b=GsxeSaVkcy7uVm7YGZFVmjaebimbyrWkoE6gOL6nOrEFHs1lCtV6BKnclu/tX7C6o9 s3VfYRu9H8WGJfg/IHk6V9pGwpP0EZ0ZUXEhT5cC8RkneADn5X9pYp9UxihUaX4xzUOo Csqkm3zcZSu/OhO2Qpzf15NelACYi1VRAoxzDDgxPbGUgR6KKn3xw/shLRG/ovRN3+S2 ABo8M8m1QoFE7CjvEFVn929C/pE3k6ORnTQKoWDWgUyU5AluZ/q1eZbTNPHOqUssm5ux bwV9iOz+t2ajTw8teBRNMrG8BU+Wa+8C/Ju3MO0yUi8+Hp+4a8DHF6ka8/dN1U3bJqlz OK1Q== X-Gm-Message-State: AOJu0YzcyAQwILTiif6mxdxvDso1XjJubQn9RELPdNcKSSRkayRQG6gM na7+mVdi7bqXO/zyZtPnhdFsRROhFFkG8i5IFW4ibTmzxVBJTRJhmP/f5KGvj07meVPQe/qvSGi Fh/TaGw== X-Google-Smtp-Source: AGHT+IHF8TClRj2bF7p3cpy5xykdYOwhTKupiIz7ASqT4gnjHoBP69xq245pqv+DkofMM3ZE412IcQ== X-Received: by 2002:a05:6820:2216:b0:5d8:10cb:c336 with SMTP id 006d021491bc7-5dfacecb950mr22137190eaf.1.1725456470147; Wed, 04 Sep 2024 06:27:50 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:49 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 32/36] arm: [MVE intrinsics] factorize vadc vadci vsbc vsbci Date: Wed, 4 Sep 2024 13:26:46 +0000 Message-Id: <20240904132650.2720446-33-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Factorize vadc/vsbc and vadci/vsbci so that they use the same parameterized names. 2024-08-28 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VADCIQ_M_S, VADCIQ_M_U, VADCIQ_U, VADCIQ_S, VADCQ_M_S, VADCQ_M_U, VADCQ_S, VADCQ_U, VSBCIQ_M_S, VSBCIQ_M_U, VSBCIQ_S, VSBCIQ_U, VSBCQ_M_S, VSBCQ_M_U, VSBCQ_S, VSBCQ_U. (VADCIQ, VSBCIQ): Merge into ... (VxCIQ): ... this. (VADCIQ_M, VSBCIQ_M): Merge into ... (VxCIQ_M): ... this. (VSBCQ, VADCQ): Merge into ... (VxCQ): ... this. (VSBCQ_M, VADCQ_M): Merge into ... (VxCQ_M): ... this. * config/arm/mve.md (mve_vadciq_v4si, mve_vsbciq_v4si): Merge into ... (@mve_q_v4si): ... this. (mve_vadciq_m_v4si, mve_vsbciq_m_v4si): Merge into ... (@mve_q_m_v4si): ... this. (mve_vadcq_v4si, mve_vsbcq_v4si): Merge into ... (@mve_q_v4si): ... this. (mve_vadcq_m_v4si, mve_vsbcq_m_v4si): Merge into ... (@mve_q_m_v4si): ... this. --- gcc/config/arm/iterators.md | 20 +++--- gcc/config/arm/mve.md | 131 +++++++++--------------------------- 2 files changed, 42 insertions(+), 109 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index 2fb3b25040f..59e112b228c 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -941,6 +941,10 @@ (define_int_attr mve_insn [ (VABDQ_S "vabd") (VABDQ_U "vabd") (VABDQ_F "vabd") (VABSQ_M_F "vabs") (VABSQ_M_S "vabs") + (VADCIQ_M_S "vadci") (VADCIQ_M_U "vadci") + (VADCIQ_S "vadci") (VADCIQ_U "vadci") + (VADCQ_M_S "vadc") (VADCQ_M_U "vadc") + (VADCQ_S "vadc") (VADCQ_U "vadc") (VADDLVAQ_P_S "vaddlva") (VADDLVAQ_P_U "vaddlva") (VADDLVAQ_S "vaddlva") (VADDLVAQ_U "vaddlva") (VADDLVQ_P_S "vaddlv") (VADDLVQ_P_U "vaddlv") @@ -1235,6 +1239,10 @@ (define_int_attr mve_insn [ (VRSHRNTQ_N_S "vrshrnt") (VRSHRNTQ_N_U "vrshrnt") (VRSHRQ_M_N_S "vrshr") (VRSHRQ_M_N_U "vrshr") (VRSHRQ_N_S "vrshr") (VRSHRQ_N_U "vrshr") + (VSBCIQ_M_S "vsbci") (VSBCIQ_M_U "vsbci") + (VSBCIQ_S "vsbci") (VSBCIQ_U "vsbci") + (VSBCQ_M_S "vsbc") (VSBCQ_M_U "vsbc") + (VSBCQ_S "vsbc") (VSBCQ_U "vsbc") (VSHLLBQ_M_N_S "vshllb") (VSHLLBQ_M_N_U "vshllb") (VSHLLBQ_N_S "vshllb") (VSHLLBQ_N_U "vshllb") (VSHLLTQ_M_N_S "vshllt") (VSHLLTQ_M_N_U "vshllt") @@ -2949,14 +2957,10 @@ (define_int_iterator VSTRWSBWBQ [VSTRWQSBWB_S VSTRWQSBWB_U]) (define_int_iterator VLDRWGBWBQ [VLDRWQGBWB_S VLDRWQGBWB_U]) (define_int_iterator VSTRDSBWBQ [VSTRDQSBWB_S VSTRDQSBWB_U]) (define_int_iterator VLDRDGBWBQ [VLDRDQGBWB_S VLDRDQGBWB_U]) -(define_int_iterator VADCIQ [VADCIQ_U VADCIQ_S]) -(define_int_iterator VADCIQ_M [VADCIQ_M_U VADCIQ_M_S]) -(define_int_iterator VSBCQ [VSBCQ_U VSBCQ_S]) -(define_int_iterator VSBCQ_M [VSBCQ_M_U VSBCQ_M_S]) -(define_int_iterator VSBCIQ [VSBCIQ_U VSBCIQ_S]) -(define_int_iterator VSBCIQ_M [VSBCIQ_M_U VSBCIQ_M_S]) -(define_int_iterator VADCQ [VADCQ_U VADCQ_S]) -(define_int_iterator VADCQ_M [VADCQ_M_U VADCQ_M_S]) +(define_int_iterator VxCIQ [VADCIQ_U VADCIQ_S VSBCIQ_U VSBCIQ_S]) +(define_int_iterator VxCIQ_M [VADCIQ_M_U VADCIQ_M_S VSBCIQ_M_U VSBCIQ_M_S]) +(define_int_iterator VxCQ [VADCQ_U VADCQ_S VSBCQ_U VSBCQ_S]) +(define_int_iterator VxCQ_M [VADCQ_M_U VADCQ_M_S VSBCQ_M_U VSBCQ_M_S]) (define_int_iterator UQRSHLLQ [UQRSHLL_64 UQRSHLL_48]) (define_int_iterator SQRSHRLQ [SQRSHRL_64 SQRSHRL_48]) (define_int_iterator VSHLCQ_M [VSHLCQ_M_S VSHLCQ_M_U]) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index eb603b3d9a7..9c32d0e1033 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -5717,159 +5717,88 @@ (define_insn "mve_vldrdq_gather_base_wb_z_v2di_insn" } [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vldrdq_gather_base_wb_v2di_insn")) (set_attr "length" "8")]) -;; -;; [vadciq_m_s, vadciq_m_u]) -;; -(define_insn "mve_vadciq_m_v4si" - [(set (match_operand:V4SI 0 "s_register_operand" "=w") - (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "0") - (match_operand:V4SI 2 "s_register_operand" "w") - (match_operand:V4SI 3 "s_register_operand" "w") - (match_operand:V4BI 4 "vpr_register_operand" "Up")] - VADCIQ_M)) - (set (reg:SI VFPCC_REGNUM) - (unspec:SI [(const_int 0)] - VADCIQ_M)) - ] - "TARGET_HAVE_MVE" - "vpst\;vadcit.i32\t%q0, %q2, %q3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vadciq_v4si")) - (set_attr "type" "mve_move") - (set_attr "length" "8")]) ;; -;; [vadciq_u, vadciq_s]) +;; [vadciq_u, vadciq_s] +;; [vsbciq_s, vsbciq_u] ;; -(define_insn "mve_vadciq_v4si" +(define_insn "@mve_q_v4si" [(set (match_operand:V4SI 0 "s_register_operand" "=w") (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") (match_operand:V4SI 2 "s_register_operand" "w")] - VADCIQ)) + VxCIQ)) (set (reg:SI VFPCC_REGNUM) (unspec:SI [(const_int 0)] - VADCIQ)) + VxCIQ)) ] "TARGET_HAVE_MVE" - "vadci.i32\t%q0, %q1, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vadciq_v4si")) + ".i32\t%q0, %q1, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_v4si")) (set_attr "type" "mve_move") (set_attr "length" "4")]) ;; -;; [vadcq_m_s, vadcq_m_u]) +;; [vadciq_m_s, vadciq_m_u] +;; [vsbciq_m_u, vsbciq_m_s] ;; -(define_insn "mve_vadcq_m_v4si" +(define_insn "@mve_q_m_v4si" [(set (match_operand:V4SI 0 "s_register_operand" "=w") (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "0") (match_operand:V4SI 2 "s_register_operand" "w") (match_operand:V4SI 3 "s_register_operand" "w") (match_operand:V4BI 4 "vpr_register_operand" "Up")] - VADCQ_M)) + VxCIQ_M)) (set (reg:SI VFPCC_REGNUM) - (unspec:SI [(reg:SI VFPCC_REGNUM)] - VADCQ_M)) + (unspec:SI [(const_int 0)] + VxCIQ_M)) ] "TARGET_HAVE_MVE" - "vpst\;vadct.i32\t%q0, %q2, %q3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vadcq_v4si")) + "vpst\;t.i32\t%q0, %q2, %q3" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_v4si")) (set_attr "type" "mve_move") (set_attr "length" "8")]) ;; -;; [vadcq_u, vadcq_s]) +;; [vadcq_u, vadcq_s] +;; [vsbcq_s, vsbcq_u] ;; -(define_insn "mve_vadcq_v4si" +(define_insn "@mve_q_v4si" [(set (match_operand:V4SI 0 "s_register_operand" "=w") (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") (match_operand:V4SI 2 "s_register_operand" "w")] - VADCQ)) + VxCQ)) (set (reg:SI VFPCC_REGNUM) (unspec:SI [(reg:SI VFPCC_REGNUM)] - VADCQ)) + VxCQ)) ] "TARGET_HAVE_MVE" - "vadc.i32\t%q0, %q1, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vadcq_v4si")) + ".i32\t%q0, %q1, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_v4si")) (set_attr "type" "mve_move") (set_attr "length" "4") (set_attr "conds" "set")]) ;; -;; [vsbciq_m_u, vsbciq_m_s]) +;; [vadcq_m_s, vadcq_m_u] +;; [vsbcq_m_u, vsbcq_m_s] ;; -(define_insn "mve_vsbciq_m_v4si" +(define_insn "@mve_q_m_v4si" [(set (match_operand:V4SI 0 "s_register_operand" "=w") - (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") - (match_operand:V4SI 2 "s_register_operand" "w") - (match_operand:V4SI 3 "s_register_operand" "w") - (match_operand:V4BI 4 "vpr_register_operand" "Up")] - VSBCIQ_M)) - (set (reg:SI VFPCC_REGNUM) - (unspec:SI [(const_int 0)] - VSBCIQ_M)) - ] - "TARGET_HAVE_MVE" - "vpst\;vsbcit.i32\t%q0, %q2, %q3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vsbciq_v4si")) - (set_attr "type" "mve_move") - (set_attr "length" "8")]) - -;; -;; [vsbciq_s, vsbciq_u]) -;; -(define_insn "mve_vsbciq_v4si" - [(set (match_operand:V4SI 0 "s_register_operand" "=w") - (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") - (match_operand:V4SI 2 "s_register_operand" "w")] - VSBCIQ)) - (set (reg:SI VFPCC_REGNUM) - (unspec:SI [(const_int 0)] - VSBCIQ)) - ] - "TARGET_HAVE_MVE" - "vsbci.i32\t%q0, %q1, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vsbciq_v4si")) - (set_attr "type" "mve_move") - (set_attr "length" "4")]) - -;; -;; [vsbcq_m_u, vsbcq_m_s]) -;; -(define_insn "mve_vsbcq_m_v4si" - [(set (match_operand:V4SI 0 "s_register_operand" "=w") - (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "0") (match_operand:V4SI 2 "s_register_operand" "w") (match_operand:V4SI 3 "s_register_operand" "w") (match_operand:V4BI 4 "vpr_register_operand" "Up")] - VSBCQ_M)) + VxCQ_M)) (set (reg:SI VFPCC_REGNUM) (unspec:SI [(reg:SI VFPCC_REGNUM)] - VSBCQ_M)) + VxCQ_M)) ] "TARGET_HAVE_MVE" - "vpst\;vsbct.i32\t%q0, %q2, %q3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vsbcq_v4si")) + "vpst\;t.i32\t%q0, %q2, %q3" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_v4si")) (set_attr "type" "mve_move") (set_attr "length" "8")]) -;; -;; [vsbcq_s, vsbcq_u]) -;; -(define_insn "mve_vsbcq_v4si" - [(set (match_operand:V4SI 0 "s_register_operand" "=w") - (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") - (match_operand:V4SI 2 "s_register_operand" "w")] - VSBCQ)) - (set (reg:SI VFPCC_REGNUM) - (unspec:SI [(reg:SI VFPCC_REGNUM)] - VSBCQ)) - ] - "TARGET_HAVE_MVE" - "vsbc.i32\t%q0, %q1, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vsbcq_v4si")) - (set_attr "type" "mve_move") - (set_attr "length" "4")]) - ;; ;; [vst2q]) ;; From patchwork Wed Sep 4 13:26:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980816 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=fNIt8R8b; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNl40KhNz1yg3 for ; Wed, 4 Sep 2024 23:34:40 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3CC91384A440 for ; Wed, 4 Sep 2024 13:34:38 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc33.google.com (mail-oo1-xc33.google.com [IPv6:2607:f8b0:4864:20::c33]) by sourceware.org (Postfix) with ESMTPS id 942E4386183B for ; Wed, 4 Sep 2024 13:27:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 942E4386183B Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 942E4386183B Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c33 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456478; cv=none; b=BiHdKBZ9F/FX5HjCQVptZK+Y7AofnV423r9/cBrXzHuAqVaKY2y9ozoerIA7h2qo4QltmNUCYLC2ovWz+3e2VhmqyHiNznYJTaqOJxN5WKMcZSqQXBUNzWa89BNbjNLk6B7BvkgaqaA0e7vhdA6O/kfOEH3mI3pseyFXTOEuDg8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456478; c=relaxed/simple; bh=zGQHtifONdkKFwvcmTwZRHjUKKdxGHFCyAgLkiWY+zk=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=iLr71vh9O/N3tBG7w96CY9vXjpkVueU2qvlTyJ4y4cbFbpiEO/I6lS3HMnW+Jpkvq0WT5kEMgCe6VCfC38pJOn0kfp+XekyyMYFDovC1mhn8CN674NV9B1gkUPZdMdD+BRWYPm481Ns3PVEFjfOFzXENUK2XUoumTd/pcOF/l3s= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc33.google.com with SMTP id 006d021491bc7-5d5eec95a74so3789791eaf.1 for ; Wed, 04 Sep 2024 06:27:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456471; x=1726061271; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=FCov5mXVFFfwiPrFbNi7ULVG3DwQM/a9ri+z8xcKGOQ=; b=fNIt8R8b9jqGyfK/PlLwiqBU7Dse3m1PL02VoqJBrQfz8tggZhJCmthUYMUvLuL6Pa haoYWtH1nKD01KmKHKF283Hl+W9+GABMSZSMn8YPJINUhe7j+MPP12qaGGiVvvqqhNBz kQRhwcXgTXZ03MS44tRNDnjVOJZHQ0K0TTS4r/y6cYC/bGZY0OjG9SAdT3946GRt8Zag XX7LaTXvvPToooP95QskQJF+h076nc1mzir8jp+muAgWRJdMV3M8kM1HhR1Y2psBM3T0 V5KoVr5r0XPhbTO5M0LYGI+2gHyhKRImwBPkrWkjI6cVStitqNugmF0CdVJhATY9bc0U dViw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456471; x=1726061271; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FCov5mXVFFfwiPrFbNi7ULVG3DwQM/a9ri+z8xcKGOQ=; b=rW42VlmY57LAwWZ26QGyNlTZOoRD+/NXOmIR4C6RyLIFq+F+9qXwQMQXTB8AXEIX4u FMtAGlam9kLeASu67YRY29rkWLs9ZTAHGthLQ76KDk0G1tO5LYDWz6R599wjKNClrcaK Wu6mXFPPq8M0Zw/8amVu7vxFkUj/gsVh596zlnFm6dR/lZ2cLxx5FAWkK8etXWM+mz0m Pcvt69VVW9TdrryBXn+YYH3V6BYFhun14TtBwY1GbW0r9sFt3Yk576I5M2DDgXhQU2UN yk/JIKj/cHtSK+3PGpaMcHcPM0m5/gfjOBrXF9uD106t1CzHLoSWYcUtKfZN2l0IWZFw FULQ== X-Gm-Message-State: AOJu0YyXFr3czu4Ga+ehT4i4WHLClKukt2Fn4HXArfzkFcqTSsGCEZhq Dnw7yVZDPpQEtwcB2VtiN+I+KZQ21fo64zYHm9spHZMaeN9E57+5g0eN9VMi10UGQ86M376Ct95 J9RfPzQ== X-Google-Smtp-Source: AGHT+IFYAjb0DcK5YQBXU2ejxV2C3MzjxfHEVl9YempMnnTP7NLX/S3CyWFe6+mqn6JZiCSNVUf3Tg== X-Received: by 2002:a05:6820:54c:b0:5d8:e6a:236 with SMTP id 006d021491bc7-5dfacfd9ce3mr16493482eaf.3.1725456471358; Wed, 04 Sep 2024 06:27:51 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:50 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 33/36] arm: [MVE intrinsics] rework vadciq Date: Wed, 4 Sep 2024 13:26:47 +0000 Message-Id: <20240904132650.2720446-34-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vadciq using the new MVE builtins framework. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): New. (vadciq): New. * config/arm/arm-mve-builtins-base.def (vadciq): New. * config/arm/arm-mve-builtins-base.h (vadciq): New. * config/arm/arm_mve.h (vadciq): Delete. (vadciq_m): Delete. (vadciq_s32): Delete. (vadciq_u32): Delete. (vadciq_m_s32): Delete. (vadciq_m_u32): Delete. (__arm_vadciq_s32): Delete. (__arm_vadciq_u32): Delete. (__arm_vadciq_m_s32): Delete. (__arm_vadciq_m_u32): Delete. (__arm_vadciq): Delete. (__arm_vadciq_m): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 93 ++++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-base.def | 1 + gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm_mve.h | 89 ----------------------- 4 files changed, 95 insertions(+), 89 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 9f1f7e69c57..6f3b18c2915 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -554,6 +554,98 @@ public: } }; +/* Map the vadc and similar functions directly to CODE (UNSPEC, UNSPEC). Take + care of the implicit carry argument. */ +class vadc_vsbc_impl : public function_base +{ +public: + unsigned int + call_properties (const function_instance &) const override + { + unsigned int flags = CP_WRITE_MEMORY | CP_READ_FPCR; + return flags; + } + + tree + memory_scalar_type (const function_instance &) const override + { + /* carry is "unsigned int". */ + return get_typenode_from_name ("unsigned int"); + } + + rtx + expand (function_expander &e) const override + { + insn_code code; + rtx insns, carry_ptr, carry_out; + int carry_out_arg_no; + int unspec; + + if (! e.type_suffix (0).integer_p) + gcc_unreachable (); + + if (e.mode_suffix_id != MODE_none) + gcc_unreachable (); + + /* Remove carry from arguments, it is implicit for the builtin. */ + switch (e.pred) + { + case PRED_none: + carry_out_arg_no = 2; + break; + + case PRED_m: + carry_out_arg_no = 3; + break; + + default: + gcc_unreachable (); + } + + carry_ptr = e.args[carry_out_arg_no]; + e.args.ordered_remove (carry_out_arg_no); + + switch (e.pred) + { + case PRED_none: + /* No predicate. */ + unspec = e.type_suffix (0).unsigned_p + ? VADCIQ_U + : VADCIQ_S; + code = code_for_mve_q_v4si (unspec, unspec); + insns = e.use_exact_insn (code); + break; + + case PRED_m: + /* "m" predicate. */ + unspec = e.type_suffix (0).unsigned_p + ? VADCIQ_M_U + : VADCIQ_M_S; + code = code_for_mve_q_m_v4si (unspec, unspec); + insns = e.use_cond_insn (code, 0); + break; + + default: + gcc_unreachable (); + } + + /* Update carry_out. */ + carry_out = gen_reg_rtx (SImode); + emit_insn (gen_get_fpscr_nzcvqc (carry_out)); + emit_insn (gen_rtx_SET (carry_out, + gen_rtx_LSHIFTRT (SImode, + carry_out, + GEN_INT (29)))); + emit_insn (gen_rtx_SET (carry_out, + gen_rtx_AND (SImode, + carry_out, + GEN_INT (1)))); + emit_insn (gen_rtx_SET (gen_rtx_MEM (Pmode, carry_ptr), carry_out)); + + return insns; + } +}; + } /* end anonymous namespace */ namespace arm_mve { @@ -724,6 +816,7 @@ namespace arm_mve { FUNCTION_PRED_P_S_U (vabavq, VABAVQ) FUNCTION_WITHOUT_N (vabdq, VABDQ) FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, -1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1)) +FUNCTION (vadciq, vadc_vsbc_impl,) FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ) FUNCTION_PRED_P_S_U (vaddlvaq, VADDLVAQ) FUNCTION_PRED_P_S_U (vaddlvq, VADDLVQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index bd69f06d7e4..72d6461c4e4 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -21,6 +21,7 @@ DEF_MVE_FUNCTION (vabavq, binary_acca_int32, all_integer, p_or_none) DEF_MVE_FUNCTION (vabdq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vabsq, unary, all_signed, mx_or_none) +DEF_MVE_FUNCTION (vadciq, vadc_vsbc, integer_32, m_or_none) DEF_MVE_FUNCTION (vaddlvaq, unary_widen_acc, integer_32, p_or_none) DEF_MVE_FUNCTION (vaddlvq, unary_acc, integer_32, p_or_none) DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 1eff50d3c6d..2dfc2e18062 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -26,6 +26,7 @@ namespace functions { extern const function_base *const vabavq; extern const function_base *const vabdq; extern const function_base *const vabsq; +extern const function_base *const vadciq; extern const function_base *const vaddlvaq; extern const function_base *const vaddlvq; extern const function_base *const vaddq; diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index c577c373e98..3a0b3041c42 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -85,8 +85,6 @@ #define vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb(__addr, __offset, __value) -#define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out) -#define vadciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m(__inactive, __a, __b, __carry_out, __p) #define vadcq(__a, __b, __carry) __arm_vadcq(__a, __b, __carry) #define vadcq_m(__inactive, __a, __b, __carry, __p) __arm_vadcq_m(__inactive, __a, __b, __carry, __p) #define vsbciq(__a, __b, __carry_out) __arm_vsbciq(__a, __b, __carry_out) @@ -321,10 +319,6 @@ #define vstrwq_scatter_base_wb_s32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_s32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_u32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_u32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_f32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_f32(__addr, __offset, __value) -#define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) -#define vadciq_u32(__a, __b, __carry_out) __arm_vadciq_u32(__a, __b, __carry_out) -#define vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) -#define vadciq_m_u32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_u32(__inactive, __a, __b, __carry_out, __p) #define vadcq_s32(__a, __b, __carry) __arm_vadcq_s32(__a, __b, __carry) #define vadcq_u32(__a, __b, __carry) __arm_vadcq_u32(__a, __b, __carry) #define vadcq_m_s32(__inactive, __a, __b, __carry, __p) __arm_vadcq_m_s32(__inactive, __a, __b, __carry, __p) @@ -1690,42 +1684,6 @@ __arm_vstrwq_scatter_base_wb_u32 (uint32x4_t * __addr, const int __offset, uint3 *__addr = __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, __offset, __value); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) -{ - int32x4_t __res = __builtin_mve_vadciq_sv4si (__a, __b); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) -{ - uint32x4_t __res = __builtin_mve_vadciq_uv4si (__a, __b); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - int32x4_t __res = __builtin_mve_vadciq_m_sv4si (__inactive, __a, __b, __p); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - uint32x4_t __res = __builtin_mve_vadciq_m_uv4si (__inactive, __a, __b, __p); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vadcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry) @@ -3642,34 +3600,6 @@ __arm_vstrwq_scatter_base_wb (uint32x4_t * __addr, const int __offset, uint32x4_ __arm_vstrwq_scatter_base_wb_u32 (__addr, __offset, __value); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) -{ - return __arm_vadciq_s32 (__a, __b, __carry_out); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) -{ - return __arm_vadciq_u32 (__a, __b, __carry_out); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - return __arm_vadciq_m_s32 (__inactive, __a, __b, __carry_out, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadciq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - return __arm_vadciq_m_u32 (__inactive, __a, __b, __carry_out, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vadcq (int32x4_t __a, int32x4_t __b, unsigned * __carry) @@ -5289,12 +5219,6 @@ extern void *__ARM_undef; #endif /* MVE Integer. */ -#define __arm_vadciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadciq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadciq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) - #define __arm_vstrdq_scatter_base_wb_p(p0,p1,p2,p3) ({ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p2)])0, \ int (*)[__ARM_mve_type_int64x2_t]: __arm_vstrdq_scatter_base_wb_p_s64 (p0, p1, __ARM_mve_coerce(__p2, int64x2_t), p3), \ @@ -5321,19 +5245,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_int64_t_ptr]: __arm_vldrdq_gather_shifted_offset_z_s64 (__ARM_mve_coerce_s64_ptr(p0, int64_t *), p1, p2), \ int (*)[__ARM_mve_type_uint64_t_ptr]: __arm_vldrdq_gather_shifted_offset_z_u64 (__ARM_mve_coerce_u64_ptr(p0, uint64_t *), p1, p2))) -#define __arm_vadciq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadciq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadciq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) - -#define __arm_vadciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadciq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadciq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) - #define __arm_vadcq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ From patchwork Wed Sep 4 13:26:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980826 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=L1tya0dW; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNrg38gxz1yXY for ; Wed, 4 Sep 2024 23:39:31 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3FC26384AB44 for ; Wed, 4 Sep 2024 13:39:29 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc35.google.com (mail-oo1-xc35.google.com [IPv6:2607:f8b0:4864:20::c35]) by sourceware.org (Postfix) with ESMTPS id 22D18385DDCF for ; Wed, 4 Sep 2024 13:27:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 22D18385DDCF Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 22D18385DDCF Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c35 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456480; cv=none; b=xp3m0ptIY7i2b9BmkivEH5cKmiq+KZLY49NaGtSlIcSWIXJI3qpKQ5O6YXhrSUAN6oP8IDyFzezImxJiAejlZ7DWNPih1XghA4oPHyy71UTtCWtJ7OTkFlKcltVZIq8YHm3+wQ9Mo/lJDsD/sijb/3M06pa+94rGtLLvFayFNc4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456480; c=relaxed/simple; bh=h5ihID5OsjLz4Bfem+23BCtf/VEV4VGGUpXM0zWltBA=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=AnlGIlOlQHL8mWoWoFoM0hZ659AzcfTlXgDE3mFRzc9IETiRruKi1Z4ErPH5WCNe0ezauFtBjdb9U3bChJ03ZJjHoB2Viy1f1FYcXVhCtBgnvKWrcYj4UWrKf2LriqxrEvqt0cr5xiKWYKz5JNXCFM6f/xJeXupPBEFfm1Rxf5Q= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc35.google.com with SMTP id 006d021491bc7-5dcd8403656so4650955eaf.1 for ; Wed, 04 Sep 2024 06:27:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456476; x=1726061276; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZRg+ayUp0X5pN/YPpTrZ8MQmKY5XoA7J9gtDmtFPYEc=; b=L1tya0dW1c4F83grQbX8kBW4Myn/cWj9yETfrdyz2B4vKIikkN74k/FIfbSc19h9hL BXHnk4tLvBAf9KwozKkLW6aRwa14yDPtDm5E1XZs2MK+9Zb0TfVxQTF04GtnovQb1uRy zVPrZYIsKgwzt++1PJPAHnxm2lsS166q+qUjaRyLwr2frNUSMoHEh+1sEwG9kXfnFs0W lTmaJ3jXgEJzQaI3314yLFklpdXnhrSSKQr8u8D/7wOemuDaA8PveG+BB1OngWq+BK+r y45+oevkPSDY+jC9whXGI2tcq9luia+wqx18K/0Lb3q3fO8cf/QxWlxZ12Ko3R4fAOdG nfeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456476; x=1726061276; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZRg+ayUp0X5pN/YPpTrZ8MQmKY5XoA7J9gtDmtFPYEc=; b=QnayNemEY10LLAz44DdovPq8WJz7v6fp+thqH9XU3gZ4vXCUZBa+OO2M8saCL7r47G 03vTHMGLCLRgfXCxsX6HBnoD8z9xXGbyDSABmyXoIXrneYIqBLumvdgfIWZEXCt7RC9B lB2Cj+45pGOSoCGkKJd70ZHVHuqgrNlrcbHmhJJX15o9E+wdK1iizH2T2LR7eT8vjPa3 bikXFvzlwvFeofU7iD1Gp2PVOGyJjmNAENnb3PiAc8zEgpbiP1Oun2n0LaPFCnzgsbPw Rfp3OWvwyXNLfAq0dXbEbkGvKThDVvMxM6Dgg5a+M7PfybizUOH6CnKYBVWFeTapPCZc Es7w== X-Gm-Message-State: AOJu0YyPwRs7KvuXRNu2K1qCZNScAnbED3qL+Nq6kIagd6EK4gye9LKO voF/E92EVaSe2cvExMkEHJ0leaeNbS37Rqz83mAh3LE6NL4I0mnjMoGdPsmLD+GHBs3N2YzNqzS fl/9nCQ== X-Google-Smtp-Source: AGHT+IErCv/XCuorHBkItIySmiLZRKPjjm3IA3RDPRHSQGkFifJMyGQLXU2uzmxVM5YNJoSvDR9lhg== X-Received: by 2002:a05:6820:2216:b0:5d6:ae6:a852 with SMTP id 006d021491bc7-5dfacf28289mr22295167eaf.6.1725456471990; Wed, 04 Sep 2024 06:27:51 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:51 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 34/36] arm: [MVE intrinsics] rework vadcq Date: Wed, 4 Sep 2024 13:26:48 +0000 Message-Id: <20240904132650.2720446-35-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vadcq using the new MVE builtins framework. We re-use most of the code introduced by the previous patch to support vadciq: we just need to initialize carry from the input parameter. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vadcq_vsbc): Add support for vadcq. * config/arm/arm-mve-builtins-base.def (vadcq): New. * config/arm/arm-mve-builtins-base.h (vadcq): New. * config/arm/arm_mve.h (vadcq): Delete. (vadcq_m): Delete. (vadcq_s32): Delete. (vadcq_u32): Delete. (vadcq_m_s32): Delete. (vadcq_m_u32): Delete. (__arm_vadcq_s32): Delete. (__arm_vadcq_u32): Delete. (__arm_vadcq_m_s32): Delete. (__arm_vadcq_m_u32): Delete. (__arm_vadcq): Delete. (__arm_vadcq_m): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 61 +++++++++++++++-- gcc/config/arm/arm-mve-builtins-base.def | 1 + gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm_mve.h | 87 ------------------------ 4 files changed, 56 insertions(+), 94 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 6f3b18c2915..9c2e11356ef 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -559,10 +559,19 @@ public: class vadc_vsbc_impl : public function_base { public: + CONSTEXPR vadc_vsbc_impl (bool init_carry) + : m_init_carry (init_carry) + {} + + /* Initialize carry with 0 (vadci). */ + bool m_init_carry; + unsigned int call_properties (const function_instance &) const override { unsigned int flags = CP_WRITE_MEMORY | CP_READ_FPCR; + if (!m_init_carry) + flags |= CP_READ_MEMORY; return flags; } @@ -605,22 +614,59 @@ public: carry_ptr = e.args[carry_out_arg_no]; e.args.ordered_remove (carry_out_arg_no); + if (!m_init_carry) + { + /* Prepare carry in: + set_fpscr ( (fpscr & ~0x20000000u) + | ((*carry & 1u) << 29) ) */ + rtx carry_in = gen_reg_rtx (SImode); + rtx fpscr = gen_reg_rtx (SImode); + emit_insn (gen_get_fpscr_nzcvqc (fpscr)); + emit_insn (gen_rtx_SET (carry_in, gen_rtx_MEM (SImode, carry_ptr))); + + emit_insn (gen_rtx_SET (carry_in, + gen_rtx_ASHIFT (SImode, + carry_in, + GEN_INT (29)))); + emit_insn (gen_rtx_SET (carry_in, + gen_rtx_AND (SImode, + carry_in, + GEN_INT (0x20000000)))); + emit_insn (gen_rtx_SET (fpscr, + gen_rtx_AND (SImode, + fpscr, + GEN_INT (~0x20000000)))); + emit_insn (gen_rtx_SET (carry_in, + gen_rtx_IOR (SImode, + carry_in, + fpscr))); + emit_insn (gen_set_fpscr_nzcvqc (carry_in)); + } + switch (e.pred) { case PRED_none: /* No predicate. */ - unspec = e.type_suffix (0).unsigned_p - ? VADCIQ_U - : VADCIQ_S; + unspec = m_init_carry + ? (e.type_suffix (0).unsigned_p + ? VADCIQ_U + : VADCIQ_S) + : (e.type_suffix (0).unsigned_p + ? VADCQ_U + : VADCQ_S); code = code_for_mve_q_v4si (unspec, unspec); insns = e.use_exact_insn (code); break; case PRED_m: /* "m" predicate. */ - unspec = e.type_suffix (0).unsigned_p - ? VADCIQ_M_U - : VADCIQ_M_S; + unspec = m_init_carry + ? (e.type_suffix (0).unsigned_p + ? VADCIQ_M_U + : VADCIQ_M_S) + : (e.type_suffix (0).unsigned_p + ? VADCQ_M_U + : VADCQ_M_S); code = code_for_mve_q_m_v4si (unspec, unspec); insns = e.use_cond_insn (code, 0); break; @@ -816,7 +862,8 @@ namespace arm_mve { FUNCTION_PRED_P_S_U (vabavq, VABAVQ) FUNCTION_WITHOUT_N (vabdq, VABDQ) FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, -1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1)) -FUNCTION (vadciq, vadc_vsbc_impl,) +FUNCTION (vadciq, vadc_vsbc_impl, (true)) +FUNCTION (vadcq, vadc_vsbc_impl, (false)) FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ) FUNCTION_PRED_P_S_U (vaddlvaq, VADDLVAQ) FUNCTION_PRED_P_S_U (vaddlvq, VADDLVQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 72d6461c4e4..37efa6bf13e 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -22,6 +22,7 @@ DEF_MVE_FUNCTION (vabavq, binary_acca_int32, all_integer, p_or_none) DEF_MVE_FUNCTION (vabdq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vabsq, unary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vadciq, vadc_vsbc, integer_32, m_or_none) +DEF_MVE_FUNCTION (vadcq, vadc_vsbc, integer_32, m_or_none) DEF_MVE_FUNCTION (vaddlvaq, unary_widen_acc, integer_32, p_or_none) DEF_MVE_FUNCTION (vaddlvq, unary_acc, integer_32, p_or_none) DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 2dfc2e18062..eb8423c3fe2 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -27,6 +27,7 @@ extern const function_base *const vabavq; extern const function_base *const vabdq; extern const function_base *const vabsq; extern const function_base *const vadciq; +extern const function_base *const vadcq; extern const function_base *const vaddlvaq; extern const function_base *const vaddlvq; extern const function_base *const vaddq; diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 3a0b3041c42..dd7b6f5cdab 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -85,8 +85,6 @@ #define vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb(__addr, __offset, __value) -#define vadcq(__a, __b, __carry) __arm_vadcq(__a, __b, __carry) -#define vadcq_m(__inactive, __a, __b, __carry, __p) __arm_vadcq_m(__inactive, __a, __b, __carry, __p) #define vsbciq(__a, __b, __carry_out) __arm_vsbciq(__a, __b, __carry_out) #define vsbciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m(__inactive, __a, __b, __carry_out, __p) #define vsbcq(__a, __b, __carry) __arm_vsbcq(__a, __b, __carry) @@ -319,10 +317,6 @@ #define vstrwq_scatter_base_wb_s32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_s32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_u32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_u32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_f32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_f32(__addr, __offset, __value) -#define vadcq_s32(__a, __b, __carry) __arm_vadcq_s32(__a, __b, __carry) -#define vadcq_u32(__a, __b, __carry) __arm_vadcq_u32(__a, __b, __carry) -#define vadcq_m_s32(__inactive, __a, __b, __carry, __p) __arm_vadcq_m_s32(__inactive, __a, __b, __carry, __p) -#define vadcq_m_u32(__inactive, __a, __b, __carry, __p) __arm_vadcq_m_u32(__inactive, __a, __b, __carry, __p) #define vsbciq_s32(__a, __b, __carry_out) __arm_vsbciq_s32(__a, __b, __carry_out) #define vsbciq_u32(__a, __b, __carry_out) __arm_vsbciq_u32(__a, __b, __carry_out) #define vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) @@ -1684,46 +1678,6 @@ __arm_vstrwq_scatter_base_wb_u32 (uint32x4_t * __addr, const int __offset, uint3 *__addr = __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, __offset, __value); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - int32x4_t __res = __builtin_mve_vadcq_sv4si (__a, __b); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - uint32x4_t __res = __builtin_mve_vadcq_uv4si (__a, __b); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - int32x4_t __res = __builtin_mve_vadcq_m_sv4si (__inactive, __a, __b, __p); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - uint32x4_t __res = __builtin_mve_vadcq_m_uv4si (__inactive, __a, __b, __p); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vsbciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) @@ -3600,34 +3554,6 @@ __arm_vstrwq_scatter_base_wb (uint32x4_t * __addr, const int __offset, uint32x4_ __arm_vstrwq_scatter_base_wb_u32 (__addr, __offset, __value); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq (int32x4_t __a, int32x4_t __b, unsigned * __carry) -{ - return __arm_vadcq_s32 (__a, __b, __carry); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) -{ - return __arm_vadcq_u32 (__a, __b, __carry); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - return __arm_vadcq_m_s32 (__inactive, __a, __b, __carry, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vadcq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - return __arm_vadcq_m_u32 (__inactive, __a, __b, __carry, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vsbciq (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) @@ -5245,19 +5171,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_int64_t_ptr]: __arm_vldrdq_gather_shifted_offset_z_s64 (__ARM_mve_coerce_s64_ptr(p0, int64_t *), p1, p2), \ int (*)[__ARM_mve_type_uint64_t_ptr]: __arm_vldrdq_gather_shifted_offset_z_u64 (__ARM_mve_coerce_u64_ptr(p0, uint64_t *), p1, p2))) -#define __arm_vadcq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadcq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadcq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) - -#define __arm_vadcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) - #define __arm_vsbciq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ From patchwork Wed Sep 4 13:26:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980828 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=P3vz2Wbb; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNtb5vQLz1yfv for ; Wed, 4 Sep 2024 23:41:11 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 12F2C3860C3B for ; Wed, 4 Sep 2024 13:41:10 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2b.google.com (mail-oo1-xc2b.google.com [IPv6:2607:f8b0:4864:20::c2b]) by sourceware.org (Postfix) with ESMTPS id 1727E384F4B9 for ; Wed, 4 Sep 2024 13:27:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1727E384F4B9 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1727E384F4B9 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456480; cv=none; b=JQ6TvfFNrjLf5xhg0SXFxyhoX/L2oCIT2NoYAJJFXlaWbuGS5zUH20424QDsUeHRnwDvvZ0qRUsDqXBP4wVB406QeSeVtjEZ6j1BDtqyHRpjF24LCENycUXIOnXpeOPTrFxqMOiDY3Gn/pdA+3wBkqeRjxoKBbRKYJXNN4VsEX0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456480; c=relaxed/simple; bh=k73j28G+K4KukRyOolxzy3Cj6KdDXcKi9xgVftSG/TQ=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=jg17K81Vxcvs4xMBCDBuabbxfdlmt6CcgcsKZhCpXnPqKkEWDin6cv3oOIgtK7aYBqwu93omeKYNiL+VDDSOXOw/vCRYsxfqhgKDl2yEJGW5jKCZ+QfP6dgP8mfYO7vZYHWHgOzX1CpYDtxBtt/kGiJoq2Di5tEEX/F8vFEC4Bc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2b.google.com with SMTP id 006d021491bc7-5dfaff47600so374656eaf.0 for ; Wed, 04 Sep 2024 06:27:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456473; x=1726061273; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=oPq0U+6gbMjNFVr01EIh4n9Po0TZbWBqRh0BmaqvumQ=; b=P3vz2WbbsL/DcOJtJwG2jwYiXEqoTb3LMXuhvv0ztOQX0rYBwvRnE562tTghp+FHD0 +KXvIPi1R3OsgIG2T/h4MlHX5pMs5dL/sVugH5DjCxtHRKrP+UdUinzqor61mkoVmMGL K5tsUkrOoQdz7KYrFeMUWK/2fv7oO0IpoIL6I30JtEJfXfTSrIYMLgr1da0CLpw3bkKj UmYZzMbovyfAwc1c42eNERFp2gVJVuf2kUDKfhxaxzY+ncF8+WRlc587fgPb75kD+KEJ peHibujRzYald4OnIOqDDb3lwCYanRDCQiZgCOspLTljax9POg5QNvCad/vmDE4nEkYw 86lw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456473; x=1726061273; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=oPq0U+6gbMjNFVr01EIh4n9Po0TZbWBqRh0BmaqvumQ=; b=TSyrlOhhEKoLHCZLnwk22mx+Ht4jOrGTJGdz2wDXubDD+zKQbNvky/yFy9E/BkXmCF uyuve6MaEIttA3sbVbsEoHKaEGZ/9tMQ2TxNY0z+mUOu1MptVkluo0rwpeK456KfI7tH +hfSWogS5wvaIZsIVdqSKleU9cONUHzFi67axs/ch0LDc92izRLPvvEQb7Au6aQlEd0M xYtjSkqjD7grGRYrJw1KvEeVj/yMk1WHXwlCSyWfH1MR28TiAclTw2SM4INVCQPCw+5V qOn9Gbjgxm9Q+RYQJpNRa4XImLG6Qp0mZ+5U3YwRmx/9YAGVMbfUNb8uoOP4SaQrXMRW +83Q== X-Gm-Message-State: AOJu0YyjNdPQTy6YWGcQezS0+IIrlwMT30HqckO9kEXDvDslkySGqQ6u jPQ4G/CIWWGQSPtmaKPS0bcasmkKCGF2UjbzD3zo1xgLRsBHI6Aliz/ULc3zZkQD26bw9qwzAWY tzcmpyw== X-Google-Smtp-Source: AGHT+IHSUDsJ7pOJBNZ6fBrPbyecDOgiFGxF82hf1K4LfHNiVaIf6EFILCbw3fBAXIELeI2rtSikHQ== X-Received: by 2002:a05:6820:160f:b0:5dc:a8ee:6ba5 with SMTP id 006d021491bc7-5e18eb031d0mr1156292eaf.1.1725456472695; Wed, 04 Sep 2024 06:27:52 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:52 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 35/36] arm: [MVE intrinsics] rework vsbcq vsbciq Date: Wed, 4 Sep 2024 13:26:49 +0000 Message-Id: <20240904132650.2720446-36-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vsbcq vsbciq using the new MVE builtins framework. We re-use most of the code introduced by the previous patches. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): Add support for vsbciq and vsbcq. (vadciq, vadcq): Add new parameter. (vsbciq): New. (vsbcq): New. * config/arm/arm-mve-builtins-base.def (vsbciq): New. (vsbcq): New. * config/arm/arm-mve-builtins-base.h (vsbciq): New. (vsbcq): New. * config/arm/arm_mve.h (vsbciq): Delete. (vsbciq_m): Delete. (vsbcq): Delete. (vsbcq_m): Delete. (vsbciq_s32): Delete. (vsbciq_u32): Delete. (vsbciq_m_s32): Delete. (vsbciq_m_u32): Delete. (vsbcq_s32): Delete. (vsbcq_u32): Delete. (vsbcq_m_s32): Delete. (vsbcq_m_u32): Delete. (__arm_vsbciq_s32): Delete. (__arm_vsbciq_u32): Delete. (__arm_vsbciq_m_s32): Delete. (__arm_vsbciq_m_u32): Delete. (__arm_vsbcq_s32): Delete. (__arm_vsbcq_u32): Delete. (__arm_vsbcq_m_s32): Delete. (__arm_vsbcq_m_u32): Delete. (__arm_vsbciq): Delete. (__arm_vsbciq_m): Delete. (__arm_vsbcq): Delete. (__arm_vsbcq_m): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 56 +++++--- gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 2 + gcc/config/arm/arm_mve.h | 170 ----------------------- 4 files changed, 42 insertions(+), 188 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 9c2e11356ef..02fccdcb71f 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -559,12 +559,14 @@ public: class vadc_vsbc_impl : public function_base { public: - CONSTEXPR vadc_vsbc_impl (bool init_carry) - : m_init_carry (init_carry) + CONSTEXPR vadc_vsbc_impl (bool init_carry, bool add) + : m_init_carry (init_carry), m_add (add) {} /* Initialize carry with 0 (vadci). */ bool m_init_carry; + /* Add (true) or Sub (false). */ + bool m_add; unsigned int call_properties (const function_instance &) const override @@ -647,26 +649,42 @@ public: { case PRED_none: /* No predicate. */ - unspec = m_init_carry - ? (e.type_suffix (0).unsigned_p - ? VADCIQ_U - : VADCIQ_S) - : (e.type_suffix (0).unsigned_p - ? VADCQ_U - : VADCQ_S); + unspec = m_add + ? (m_init_carry + ? (e.type_suffix (0).unsigned_p + ? VADCIQ_U + : VADCIQ_S) + : (e.type_suffix (0).unsigned_p + ? VADCQ_U + : VADCQ_S)) + : (m_init_carry + ? (e.type_suffix (0).unsigned_p + ? VSBCIQ_U + : VSBCIQ_S) + : (e.type_suffix (0).unsigned_p + ? VSBCQ_U + : VSBCQ_S)); code = code_for_mve_q_v4si (unspec, unspec); insns = e.use_exact_insn (code); break; case PRED_m: /* "m" predicate. */ - unspec = m_init_carry - ? (e.type_suffix (0).unsigned_p - ? VADCIQ_M_U - : VADCIQ_M_S) - : (e.type_suffix (0).unsigned_p - ? VADCQ_M_U - : VADCQ_M_S); + unspec = m_add + ? (m_init_carry + ? (e.type_suffix (0).unsigned_p + ? VADCIQ_M_U + : VADCIQ_M_S) + : (e.type_suffix (0).unsigned_p + ? VADCQ_M_U + : VADCQ_M_S)) + : (m_init_carry + ? (e.type_suffix (0).unsigned_p + ? VSBCIQ_M_U + : VSBCIQ_M_S) + : (e.type_suffix (0).unsigned_p + ? VSBCQ_M_U + : VSBCQ_M_S)); code = code_for_mve_q_m_v4si (unspec, unspec); insns = e.use_cond_insn (code, 0); break; @@ -862,8 +880,8 @@ namespace arm_mve { FUNCTION_PRED_P_S_U (vabavq, VABAVQ) FUNCTION_WITHOUT_N (vabdq, VABDQ) FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, -1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1)) -FUNCTION (vadciq, vadc_vsbc_impl, (true)) -FUNCTION (vadcq, vadc_vsbc_impl, (false)) +FUNCTION (vadciq, vadc_vsbc_impl, (true, true)) +FUNCTION (vadcq, vadc_vsbc_impl, (false, true)) FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ) FUNCTION_PRED_P_S_U (vaddlvaq, VADDLVAQ) FUNCTION_PRED_P_S_U (vaddlvq, VADDLVQ) @@ -1026,6 +1044,8 @@ FUNCTION_WITH_M_N_NO_F (vrshlq, VRSHLQ) FUNCTION_ONLY_N_NO_F (vrshrnbq, VRSHRNBQ) FUNCTION_ONLY_N_NO_F (vrshrntq, VRSHRNTQ) FUNCTION_ONLY_N_NO_F (vrshrq, VRSHRQ) +FUNCTION (vsbciq, vadc_vsbc_impl, (true, false)) +FUNCTION (vsbcq, vadc_vsbc_impl, (false, false)) FUNCTION (vshlcq, vshlc_impl,) FUNCTION_ONLY_N_NO_F (vshllbq, VSHLLBQ) FUNCTION_ONLY_N_NO_F (vshlltq, VSHLLTQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 37efa6bf13e..b8a8cf2c555 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -154,6 +154,8 @@ DEF_MVE_FUNCTION (vrshlq, binary_round_lshift, all_integer, mx_or_none) DEF_MVE_FUNCTION (vrshrnbq, binary_rshift_narrow, integer_16_32, m_or_none) DEF_MVE_FUNCTION (vrshrntq, binary_rshift_narrow, integer_16_32, m_or_none) DEF_MVE_FUNCTION (vrshrq, binary_rshift, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vsbciq, vadc_vsbc, integer_32, m_or_none) +DEF_MVE_FUNCTION (vsbcq, vadc_vsbc, integer_32, m_or_none) DEF_MVE_FUNCTION (vshlcq, vshlc, all_integer, m_or_none) DEF_MVE_FUNCTION (vshllbq, binary_widen_n, integer_8_16, mx_or_none) DEF_MVE_FUNCTION (vshlltq, binary_widen_n, integer_8_16, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index eb8423c3fe2..da630d48e11 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -190,6 +190,8 @@ extern const function_base *const vrshlq; extern const function_base *const vrshrnbq; extern const function_base *const vrshrntq; extern const function_base *const vrshrq; +extern const function_base *const vsbciq; +extern const function_base *const vsbcq; extern const function_base *const vshlcq; extern const function_base *const vshllbq; extern const function_base *const vshlltq; diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index dd7b6f5cdab..34f024b29f4 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -85,10 +85,6 @@ #define vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrdq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) __arm_vstrwq_scatter_base_wb_p(__addr, __offset, __value, __p) #define vstrwq_scatter_base_wb(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb(__addr, __offset, __value) -#define vsbciq(__a, __b, __carry_out) __arm_vsbciq(__a, __b, __carry_out) -#define vsbciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m(__inactive, __a, __b, __carry_out, __p) -#define vsbcq(__a, __b, __carry) __arm_vsbcq(__a, __b, __carry) -#define vsbcq_m(__inactive, __a, __b, __carry, __p) __arm_vsbcq_m(__inactive, __a, __b, __carry, __p) #define vst1q_p(__addr, __value, __p) __arm_vst1q_p(__addr, __value, __p) #define vst2q(__addr, __value) __arm_vst2q(__addr, __value) #define vld1q_z(__base, __p) __arm_vld1q_z(__base, __p) @@ -317,14 +313,6 @@ #define vstrwq_scatter_base_wb_s32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_s32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_u32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_u32(__addr, __offset, __value) #define vstrwq_scatter_base_wb_f32(__addr, __offset, __value) __arm_vstrwq_scatter_base_wb_f32(__addr, __offset, __value) -#define vsbciq_s32(__a, __b, __carry_out) __arm_vsbciq_s32(__a, __b, __carry_out) -#define vsbciq_u32(__a, __b, __carry_out) __arm_vsbciq_u32(__a, __b, __carry_out) -#define vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) -#define vsbciq_m_u32(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m_u32(__inactive, __a, __b, __carry_out, __p) -#define vsbcq_s32(__a, __b, __carry) __arm_vsbcq_s32(__a, __b, __carry) -#define vsbcq_u32(__a, __b, __carry) __arm_vsbcq_u32(__a, __b, __carry) -#define vsbcq_m_s32(__inactive, __a, __b, __carry, __p) __arm_vsbcq_m_s32(__inactive, __a, __b, __carry, __p) -#define vsbcq_m_u32(__inactive, __a, __b, __carry, __p) __arm_vsbcq_m_u32(__inactive, __a, __b, __carry, __p) #define vst1q_p_u8(__addr, __value, __p) __arm_vst1q_p_u8(__addr, __value, __p) #define vst1q_p_s8(__addr, __value, __p) __arm_vst1q_p_s8(__addr, __value, __p) #define vst2q_s8(__addr, __value) __arm_vst2q_s8(__addr, __value) @@ -1678,82 +1666,6 @@ __arm_vstrwq_scatter_base_wb_u32 (uint32x4_t * __addr, const int __offset, uint3 *__addr = __builtin_mve_vstrwq_scatter_base_wb_uv4si (*__addr, __offset, __value); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) -{ - int32x4_t __res = __builtin_mve_vsbciq_sv4si (__a, __b); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) -{ - uint32x4_t __res = __builtin_mve_vsbciq_uv4si (__a, __b); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - int32x4_t __res = __builtin_mve_vsbciq_m_sv4si (__inactive, __a, __b, __p); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - uint32x4_t __res = __builtin_mve_vsbciq_m_uv4si (__inactive, __a, __b, __p); - *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - int32x4_t __res = __builtin_mve_vsbcq_sv4si (__a, __b); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - uint32x4_t __res = __builtin_mve_vsbcq_uv4si (__a, __b); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - int32x4_t __res = __builtin_mve_vsbcq_m_sv4si (__inactive, __a, __b, __p); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29)); - uint32x4_t __res = __builtin_mve_vsbcq_m_uv4si (__inactive, __a, __b, __p); - *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u; - return __res; -} - __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vst1q_p_u8 (uint8_t * __addr, uint8x16_t __value, mve_pred16_t __p) @@ -3554,62 +3466,6 @@ __arm_vstrwq_scatter_base_wb (uint32x4_t * __addr, const int __offset, uint32x4_ __arm_vstrwq_scatter_base_wb_u32 (__addr, __offset, __value); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) -{ - return __arm_vsbciq_s32 (__a, __b, __carry_out); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) -{ - return __arm_vsbciq_u32 (__a, __b, __carry_out); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - return __arm_vsbciq_m_s32 (__inactive, __a, __b, __carry_out, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbciq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) -{ - return __arm_vsbciq_m_u32 (__inactive, __a, __b, __carry_out, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq (int32x4_t __a, int32x4_t __b, unsigned * __carry) -{ - return __arm_vsbcq_s32 (__a, __b, __carry); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) -{ - return __arm_vsbcq_u32 (__a, __b, __carry); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - return __arm_vsbcq_m_s32 (__inactive, __a, __b, __carry, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vsbcq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) -{ - return __arm_vsbcq_m_u32 (__inactive, __a, __b, __carry, __p); -} - __extension__ extern __inline void __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vst1q_p (uint8_t * __addr, uint8x16_t __value, mve_pred16_t __p) @@ -5171,32 +5027,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_int64_t_ptr]: __arm_vldrdq_gather_shifted_offset_z_s64 (__ARM_mve_coerce_s64_ptr(p0, int64_t *), p1, p2), \ int (*)[__ARM_mve_type_uint64_t_ptr]: __arm_vldrdq_gather_shifted_offset_z_u64 (__ARM_mve_coerce_u64_ptr(p0, uint64_t *), p1, p2))) -#define __arm_vsbciq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbciq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbciq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) - -#define __arm_vsbciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbciq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbciq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) - -#define __arm_vsbcq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbcq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbcq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) - -#define __arm_vsbcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) - #define __arm_vldrbq_gather_offset_z(p0,p1,p2) ({ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p1)])0, \ int (*)[__ARM_mve_type_int8_t_ptr][__ARM_mve_type_uint8x16_t]: __arm_vldrbq_gather_offset_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), __ARM_mve_coerce(__p1, uint8x16_t), p2), \ From patchwork Wed Sep 4 13:26:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1980819 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=A4uIo/7E; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WzNm80tMzz1yXY for ; Wed, 4 Sep 2024 23:35:36 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 50FEF38708AD for ; Wed, 4 Sep 2024 13:35:34 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oo1-xc2f.google.com (mail-oo1-xc2f.google.com [IPv6:2607:f8b0:4864:20::c2f]) by sourceware.org (Postfix) with ESMTPS id 3ECB73861027 for ; Wed, 4 Sep 2024 13:27:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3ECB73861027 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 3ECB73861027 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2f ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456480; cv=none; b=phlmCzHHisUY7pQTMSDiqaHSPvXFGWm5H7T9VeM+FYEgYCrHFUVCrK7oi+hH67Kw8WuTWJDcgYML/qvX3eTfPK+L8xRj0dz1OVLSAVkkd92BStO66k0hvez/8odemWrWVVfzFqRwoJzq1FDwrjfKNKiFzDB/+7DtJjNwI38014A= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725456480; c=relaxed/simple; bh=WhNdcH8BrhG5Bbdlt2Kl/i9LAOmm2UwZ/vlDCb65qik=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=u5EpWG9R434gANQMLdQhEqEvo+qXSpLA4PvYUuC9D0jXk03N/t81m6xnuoZ2k6/DDAnlpjXNXE0+HWU5T8klIHoodfmd2Wr3HwqVyuu7yeqWOqGHEjAmQCYi0Mw4LVDGnGxdMNTxbuOelxgCLt2Mal+GWssF+4EdLeNMO+jtQ6k= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2f.google.com with SMTP id 006d021491bc7-5df9433ac0cso4415613eaf.3 for ; Wed, 04 Sep 2024 06:27:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725456474; x=1726061274; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fnwhyqZyijtHQp2JzsgiqUA2kLkDNYK+xDVZh42rGhc=; b=A4uIo/7EK9ubbGECrvuKszZEG9V6VJIQVXjRNj8kdEsS9UQz5NgJykITDuM/XiE3Za D36j2z2nuCH4ggk48LODXPDQvMz14ahk/N2983014mI/Cr48DpUzhgOSy0tjKzTx4av3 cAcv2ZvoRmij2VrQgNhr+uYpEzG1IHKK+azPSLUh4B+/bVQe0W617SRjmY3jCsCS43fY BziTeX53dIa4OwdIFDfeKKEUEegdBUgyLe65bLYgvuWRv+8XsQTR8cwDHjp78nzrd49T DVKzqbkt3fIrpUI+EsyYiSIOcPXY4fhl2z0J/fOSi6nZyA9b765IhpHx/kpJ8zmld2Tg r1qQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725456474; x=1726061274; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fnwhyqZyijtHQp2JzsgiqUA2kLkDNYK+xDVZh42rGhc=; b=ZtVQlNt61b5ZAQW1a5e6eqC3OAS+QhOywNtgVk6FMDksBgJjRq1mQSU+RFDA1mTfeT LWDlDMrCdiRhJLHRawaki5RCYKjx/Rw+sO5M/49+/5IlnJx4P/7krlBXoi586PReMb2Y kSD4Iu6gydDGnkY/DHMcP22UgHiT5M06dkNA8gPXArMiJ+aiu4WL4PYZOMdu6Gll8z0U v5n5sKq42Q13UKgMuzOH+asvSbQvRDhKzfuyWzl+73rLmX6m2Fc69mVnAe8fbsZ+9Uxv 7ccnM0FsdWWKIMLU8b0PlKzKzF1HNnErF10D92yAZE6g1xFN2YFzRlwK4ZuR2EpicUd6 /tCw== X-Gm-Message-State: AOJu0Yydgwu/RI6LFDIco99RWvxbsDJLVQHIzTQ+jt4KQBmAzCKQq33Z C1xW7Wo8WxFyB0C48RsszhHA55Pc05jfzCvsNstEVMB+ZZHi3Da8frbSGK8OOxs3fcJUdCciOM8 tieTTzg== X-Google-Smtp-Source: AGHT+IGResFOFAQcbEuWZv99iHhzgZlnRSocWBbZKBOwezmulLMswPfCKl/FO81H48J2GcsEkn458w== X-Received: by 2002:a05:6820:1ad5:b0:5da:9b98:e208 with SMTP id 006d021491bc7-5dfad0203e6mr19650502eaf.5.1725456474224; Wed, 04 Sep 2024 06:27:54 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5dfa0580692sm2308062eaf.46.2024.09.04.06.27.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Sep 2024 06:27:53 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH v2 36/36] arm: [MVE intrinsics] use long_type_suffix / half_type_suffix helpers Date: Wed, 4 Sep 2024 13:26:50 +0000 Message-Id: <20240904132650.2720446-37-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> <20240904132650.2720446-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org In several places we are looking for a type twice or half as large as the type suffix: this patch introduces helper functions to avoid code duplication. long_type_suffix is similar to the SVE counterpart, but adds an 'expected_tclass' parameter. half_type_suffix is similar to it, but does not exist in SVE. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (long_type_suffix): New. (half_type_suffix): New. (struct binary_move_narrow_def): Use new helper. (struct binary_move_narrow_unsigned_def): Likewise. (struct binary_rshift_narrow_def): Likewise. (struct binary_rshift_narrow_unsigned_def): Likewise. (struct binary_widen_def): Likewise. (struct binary_widen_n_def): Likewise. (struct binary_widen_opt_n_def): Likewise. (struct unary_widen_def): Likewise. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 114 +++++++++++++--------- 1 file changed, 68 insertions(+), 46 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index 9deed178966..0a108cf0127 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -320,6 +320,45 @@ build_16_32 (function_builder &b, const char *signature, } } +/* TYPE is the largest type suffix associated with the arguments of R, but the + result is twice as wide. Return the associated type suffix of + EXPECTED_TCLASS if it exists, otherwise report an appropriate error and + return NUM_TYPE_SUFFIXES. */ +static type_suffix_index +long_type_suffix (function_resolver &r, + type_suffix_index type, + type_class_index expected_tclass) +{ + unsigned int element_bits = type_suffixes[type].element_bits; + if (expected_tclass == function_resolver::SAME_TYPE_CLASS) + expected_tclass = type_suffixes[type].tclass; + + if (type_suffixes[type].integer_p && element_bits < 64) + return find_type_suffix (expected_tclass, element_bits * 2); + + r.report_no_such_form (type); + return NUM_TYPE_SUFFIXES; +} + +/* Return the type suffix half as wide as TYPE with EXPECTED_TCLASS if it + exists, otherwise report an appropriate error and return + NUM_TYPE_SUFFIXES. */ +static type_suffix_index +half_type_suffix (function_resolver &r, + type_suffix_index type, + type_class_index expected_tclass) +{ + unsigned int element_bits = type_suffixes[type].element_bits; + if (expected_tclass == function_resolver::SAME_TYPE_CLASS) + expected_tclass = type_suffixes[type].tclass; + + if (type_suffixes[type].integer_p && element_bits > 8) + return find_type_suffix (expected_tclass, element_bits / 2); + + r.report_no_such_form (type); + return NUM_TYPE_SUFFIXES; +} + /* Declare the function shape NAME, pointing it to an instance of class _def. */ #define SHAPE(NAME) \ @@ -779,16 +818,13 @@ struct binary_move_narrow_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, narrow_suffix; if (!r.check_gp_argument (2, i, nargs) - || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES) + || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES + || ((narrow_suffix = half_type_suffix (r, type, r.SAME_TYPE_CLASS)) + == NUM_TYPE_SUFFIXES)) return error_mark_node; - type_suffix_index narrow_suffix - = find_type_suffix (type_suffixes[type].tclass, - type_suffixes[type].element_bits / 2); - - if (!r.require_matching_vector_type (0, narrow_suffix)) return error_mark_node; @@ -816,15 +852,13 @@ struct binary_move_narrow_unsigned_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, narrow_suffix; if (!r.check_gp_argument (2, i, nargs) - || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES) + || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES + || ((narrow_suffix = half_type_suffix (r, type, TYPE_unsigned)) + == NUM_TYPE_SUFFIXES)) return error_mark_node; - type_suffix_index narrow_suffix - = find_type_suffix (TYPE_unsigned, - type_suffixes[type].element_bits / 2); - if (!r.require_matching_vector_type (0, narrow_suffix)) return error_mark_node; @@ -1112,16 +1146,14 @@ struct binary_rshift_narrow_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, narrow_suffix; if (!r.check_gp_argument (3, i, nargs) || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES + || ((narrow_suffix = half_type_suffix (r, type, r.SAME_TYPE_CLASS)) + == NUM_TYPE_SUFFIXES) || !r.require_integer_immediate (i)) return error_mark_node; - type_suffix_index narrow_suffix - = find_type_suffix (type_suffixes[type].tclass, - type_suffixes[type].element_bits / 2); - if (!r.require_matching_vector_type (0, narrow_suffix)) return error_mark_node; @@ -1159,16 +1191,14 @@ struct binary_rshift_narrow_unsigned_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, narrow_suffix; if (!r.check_gp_argument (3, i, nargs) || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES + || ((narrow_suffix = half_type_suffix (r, type, TYPE_unsigned)) + == NUM_TYPE_SUFFIXES) || !r.require_integer_immediate (i)) return error_mark_node; - type_suffix_index narrow_suffix - = find_type_suffix (TYPE_unsigned, - type_suffixes[type].element_bits / 2); - if (!r.require_matching_vector_type (0, narrow_suffix)) return error_mark_node; @@ -1205,15 +1235,13 @@ struct binary_widen_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, wide_suffix; if (!r.check_gp_argument (2, i, nargs) - || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES) + || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES + || ((wide_suffix = long_type_suffix (r, type, r.SAME_TYPE_CLASS)) + == NUM_TYPE_SUFFIXES)) return error_mark_node; - type_suffix_index wide_suffix - = find_type_suffix (type_suffixes[type].tclass, - type_suffixes[type].element_bits * 2); - if (!r.require_matching_vector_type (i, type)) return error_mark_node; @@ -1298,17 +1326,15 @@ struct binary_widen_n_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, wide_suffix; tree res; if (!r.check_gp_argument (2, i, nargs) || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES + || ((wide_suffix = long_type_suffix (r, type, r.SAME_TYPE_CLASS)) + == NUM_TYPE_SUFFIXES) || !r.require_integer_immediate (i)) return error_mark_node; - type_suffix_index wide_suffix - = find_type_suffix (type_suffixes[type].tclass, - type_suffixes[type].element_bits * 2); - /* Check the inactive argument has the wide type. */ if (((r.pred == PRED_m) && (r.infer_vector_type (0) == wide_suffix)) || r.pred == PRED_none @@ -1352,15 +1378,13 @@ struct binary_widen_opt_n_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, wide_suffix; if (!r.check_gp_argument (2, i, nargs) - || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES) + || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES + || ((wide_suffix = long_type_suffix (r, type, r.SAME_TYPE_CLASS)) + == NUM_TYPE_SUFFIXES)) return error_mark_node; - type_suffix_index wide_suffix - = find_type_suffix (type_suffixes[type].tclass, - type_suffixes[type].element_bits * 2); - /* Skip last argument, may be scalar, will be checked below by finish_opt_n_resolution. */ unsigned int last_arg = i--; @@ -1939,16 +1963,14 @@ struct unary_widen_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + type_suffix_index type, wide_suffix; tree res; if (!r.check_gp_argument (1, i, nargs) - || (type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES) + || (type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES + || ((wide_suffix = long_type_suffix (r, type, r.SAME_TYPE_CLASS)) + == NUM_TYPE_SUFFIXES)) return error_mark_node; - type_suffix_index wide_suffix - = find_type_suffix (type_suffixes[type].tclass, - type_suffixes[type].element_bits * 2); - /* Check the inactive argument has the wide type. */ if ((r.pred == PRED_m) && (r.infer_vector_type (0) != wide_suffix))