From patchwork Thu Jul 11 21:42:51 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959544
List-Id: Gcc-patches mailing list
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 01/15] arm: [MVE intrinsics] improve comment for orrq shape
Date: Thu, 11 Jul 2024 21:42:51 +0000
Message-Id: <20240711214305.3193022-1-christophe.lyon@linaro.org>

Add a comment about the lack of "n" forms for floating-point and
8-bit integers, to make it clearer why we use build_16_32 for
MODE_n.

2024-07-11  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (binary_orrq_def): Improve
	comment.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index ba20c6a8f73..e01939469e3 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -865,7 +865,12 @@ SHAPE (binary_opt_n)
    int16x8_t [__arm_]vorrq_m[_s16](int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
    int16x8_t [__arm_]vorrq_x[_s16](int16x8_t a, int16x8_t b, mve_pred16_t p)
    int16x8_t [__arm_]vorrq[_n_s16](int16x8_t a, const int16_t imm)
-   int16x8_t [__arm_]vorrq_m_n[_s16](int16x8_t a, const int16_t imm, mve_pred16_t p)  */
+   int16x8_t [__arm_]vorrq_m_n[_s16](int16x8_t a, const int16_t imm, mve_pred16_t p)
+
+   No "_n" forms for floating-point, nor 8-bit integers:
+   float16x8_t [__arm_]vorrq[_f16](float16x8_t a, float16x8_t b)
+   float16x8_t [__arm_]vorrq_m[_f16](float16x8_t inactive, float16x8_t a, float16x8_t b, mve_pred16_t p)
+   float16x8_t [__arm_]vorrq_x[_f16](float16x8_t a, float16x8_t b, mve_pred16_t p)  */
 struct binary_orrq_def : public overloaded_base<0>
 {
   bool

From patchwork Thu Jul 11 21:42:52 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959543
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 02/15] arm: [MVE intrinsics] remove useless resolve from create shape
Date: Thu, 11 Jul 2024 21:42:52 +0000
Message-Id: <20240711214305.3193022-2-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

vcreateq has no overloaded forms, so there is no need for resolve ().

2024-07-11  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (create_def): Remove
	resolve.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index e01939469e3..0520a8331db 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1408,12 +1408,6 @@ struct create_def : public nonoverloaded_base
   {
     build_all (b, "v0,su64,su64", group, MODE_none, preserve_user_namespace);
   }
-
-  tree
-  resolve (function_resolver &r) const override
-  {
-    return r.resolve_uniform (0, 2);
-  }
 };
 SHAPE (create)
 
From patchwork Thu Jul 11 21:42:53 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959548
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 03/15] arm: [MVE intrinsics] Cleanup arm-mve-builtins-functions.h
Date: Thu, 11 Jul 2024 21:42:53 +0000
Message-Id: <20240711214305.3193022-3-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

This patch brings no functional change but removes some code
duplication in arm-mve-builtins-functions.h and makes it easier to
read and maintain.

It introduces a new expand_unspec () member of
unspec_based_mve_function_base and makes a few classes inherit from it
instead of function_base.  This adds 3 new members containing the
unspec codes for signed-int, unsigned-int and floating-point
intrinsics (no mode, no predicate).  Depending on the derived class,
these will be used instead of the 3 similar RTX codes.

The new expand_unspec () handles all the possible unspecs, some of
which may not be supported by a given intrinsics family: such code
paths won't be used in that case.  Similarly, codes specific to a
family (RTX, or PRED_p for instance) should be handled by the caller
of expand_unspec ().

Thanks to this, expand () for unspec_based_mve_function_exact_insn,
unspec_mve_function_exact_insn, unspec_mve_function_exact_insn_pred_p
and unspec_mve_function_exact_insn_vshl no longer duplicates a lot of
code.

The patch also makes most of PRED_m and PRED_x handling use the same
code.

2024-07-11  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-functions.h
	(unspec_based_mve_function_base): Add m_unspec_for_sint,
	m_unspec_for_uint, m_unspec_for_fp and expand_unspec members.
	(unspec_based_mve_function_exact_insn): Inherit from
	unspec_based_mve_function_base and use expand_unspec.
	(unspec_mve_function_exact_insn): Likewise.
	(unspec_mve_function_exact_insn_pred_p): Likewise.
	(unspec_mve_function_exact_insn_vshl): Likewise.
	(unspec_based_mve_function_exact_insn_vcmp): Initialize new
	inherited members.
	(unspec_mve_function_exact_insn_rot): Merge PRED_m and PRED_x
	handling.
	(unspec_mve_function_exact_insn_vmull): Likewise.
	(unspec_mve_function_exact_insn_vmull_poly): Likewise.
---
 gcc/config/arm/arm-mve-builtins-functions.h | 607 +++++++-------
 1 file changed, 211 insertions(+), 396 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h
index ac2a731bff4..43c4aaffeb1 100644
--- a/gcc/config/arm/arm-mve-builtins-functions.h
+++ b/gcc/config/arm/arm-mve-builtins-functions.h
@@ -40,17 +40,23 @@ public:
 };
 
 /* An incomplete function_base for functions that have an associated
-   rtx_code for signed integers, unsigned integers and floating-point
-   values for the non-predicated, non-suffixed intrinsic, and unspec
-   codes, with separate codes for signed integers, unsigned integers
-   and floating-point values.  The class simply records information
-   about the mapping for derived classes to use.  */
+   rtx_code or an unspec for signed integers, unsigned integers and
+   floating-point values for the non-predicated, non-suffixed
+   intrinsics, and unspec codes, with separate codes for signed
+   integers, unsigned integers and floating-point values for
+   predicated and/or suffixed intrinsics.  The class simply records
+   information about the mapping for derived classes to use and
+   provides a generic expand_unspec () to avoid duplicating expansion
+   code in derived classes.  */
 class unspec_based_mve_function_base : public function_base
 {
 public:
   CONSTEXPR unspec_based_mve_function_base (rtx_code code_for_sint,
                                             rtx_code code_for_uint,
                                             rtx_code code_for_fp,
+                                            int unspec_for_sint,
+                                            int unspec_for_uint,
+                                            int unspec_for_fp,
                                             int unspec_for_n_sint,
                                             int unspec_for_n_uint,
                                             int unspec_for_n_fp,
@@ -63,6 +69,9 @@ public:
     : m_code_for_sint (code_for_sint),
       m_code_for_uint (code_for_uint),
       m_code_for_fp (code_for_fp),
+      m_unspec_for_sint (unspec_for_sint),
+      m_unspec_for_uint (unspec_for_uint),
+      m_unspec_for_fp (unspec_for_fp),
       m_unspec_for_n_sint (unspec_for_n_sint),
       m_unspec_for_n_uint (unspec_for_n_uint),
       m_unspec_for_n_fp (unspec_for_n_fp),
@@ -83,6 +92,9 @@ public:
   /* The unspec code associated with signed-integer, unsigned-integer
      and floating-point operations respectively.  It covers the cases
      with the _n suffix, and/or the _m predicate.  */
+  int m_unspec_for_sint;
+  int m_unspec_for_uint;
+  int m_unspec_for_fp;
   int m_unspec_for_n_sint;
   int m_unspec_for_n_uint;
   int m_unspec_for_n_fp;
@@ -92,142 +104,146 @@ public:
   int m_unspec_for_m_n_sint;
   int m_unspec_for_m_n_uint;
   int m_unspec_for_m_n_fp;
+
+  rtx expand_unspec (function_expander &e) const;
 };
 
-/* Map the function directly to CODE (UNSPEC, M) where M is the vector
-   mode associated with type suffix 0, except when there is no
-   predicate and no _n suffix, in which case we use the appropriate
-   rtx_code.  This is useful when the basic operation is mapped to a
-   standard RTX code and all other versions use different unspecs.  */
-class unspec_based_mve_function_exact_insn : public unspec_based_mve_function_base
+/* Expand the unspecs, which is common to all intrinsics using
+   unspec_based_mve_function_base.  If some combinations are not
+   supported for an intrinsics family, they should be handled by the
+   caller (and not crash here).  */
+rtx
+unspec_based_mve_function_base::expand_unspec (function_expander &e) const
 {
-public:
-  CONSTEXPR unspec_based_mve_function_exact_insn (rtx_code code_for_sint,
-                                                  rtx_code code_for_uint,
-                                                  rtx_code code_for_fp,
-                                                  int unspec_for_n_sint,
-                                                  int unspec_for_n_uint,
-                                                  int unspec_for_n_fp,
-                                                  int unspec_for_m_sint,
-                                                  int unspec_for_m_uint,
-                                                  int unspec_for_m_fp,
-                                                  int unspec_for_m_n_sint,
-                                                  int unspec_for_m_n_uint,
-                                                  int unspec_for_m_n_fp)
-    : unspec_based_mve_function_base (code_for_sint,
-                                      code_for_uint,
-                                      code_for_fp,
-                                      unspec_for_n_sint,
-                                      unspec_for_n_uint,
-                                      unspec_for_n_fp,
-                                      unspec_for_m_sint,
-                                      unspec_for_m_uint,
-                                      unspec_for_m_fp,
-                                      unspec_for_m_n_sint,
-                                      unspec_for_m_n_uint,
-                                      unspec_for_m_n_fp)
-  {}
-
-  rtx
-  expand (function_expander &e) const override
-  {
-    /* No suffix, no predicate, use the right RTX code.  */
-    if ((e.mode_suffix_id != MODE_n)
-	&& (e.pred == PRED_none))
-      return e.map_to_rtx_codes (m_code_for_sint, m_code_for_uint,
-				 m_code_for_fp);
-
+  machine_mode mode = e.vector_mode (0);
   insn_code code;
+
   switch (e.pred)
     {
    case PRED_none:
-      if (e.mode_suffix_id == MODE_n)
-	/* No predicate, _n suffix.  */
-	{
-	  if (e.type_suffix (0).integer_p)
-	    if (e.type_suffix (0).unsigned_p)
-	      code = code_for_mve_q_n (m_unspec_for_n_uint, m_unspec_for_n_uint, e.vector_mode (0));
-	    else
-	      code = code_for_mve_q_n (m_unspec_for_n_sint, m_unspec_for_n_sint, e.vector_mode (0));
-	  else
-	    code = code_for_mve_q_n_f (m_unspec_for_n_fp, e.vector_mode (0));
-
-	  return e.use_exact_insn (code);
-	}
-      gcc_unreachable ();
-      break;
-
-    case PRED_m:
      switch (e.mode_suffix_id)
	{
	case MODE_none:
-	  /* No suffix, "m" predicate.  */
+	  /* No predicate, no suffix.  */
	  if (e.type_suffix (0).integer_p)
	    if (e.type_suffix (0).unsigned_p)
-	      code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0));
+	      code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, mode);
	    else
-	      code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0));
+	      code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, mode);
	  else
-	    code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0));
+	    code = code_for_mve_q_f (m_unspec_for_fp, mode);
	  break;
 
	case MODE_n:
-	  /* _n suffix, "m" predicate.  */
+	  /* No predicate, _n suffix.  */
	  if (e.type_suffix (0).integer_p)
	    if (e.type_suffix (0).unsigned_p)
-	      code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0));
+	      code = code_for_mve_q_n (m_unspec_for_n_uint, m_unspec_for_n_uint, mode);
	    else
-	      code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0));
+	      code = code_for_mve_q_n (m_unspec_for_n_sint, m_unspec_for_n_sint, mode);
	  else
-	    code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0));
+	    code = code_for_mve_q_n_f (m_unspec_for_n_fp, mode);
	  break;
 
	default:
	  gcc_unreachable ();
	}
-      return e.use_cond_insn (code, 0);
+      return e.use_exact_insn (code);
 
+    case PRED_m:
    case PRED_x:
      switch (e.mode_suffix_id)
	{
	case MODE_none:
-	  /* No suffix, "x" predicate.  */
+	  /* No suffix, "m" or "x" predicate.  */
	  if (e.type_suffix (0).integer_p)
	    if (e.type_suffix (0).unsigned_p)
-	      code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0));
+	      code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, mode);
	    else
-	      code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0));
+	      code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, mode);
	  else
-	    code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0));
+	    code = code_for_mve_q_m_f (m_unspec_for_m_fp, mode);
	  break;
 
	case MODE_n:
-	  /* _n suffix, "x" predicate.  */
+	  /* _n suffix, "m" or "x" predicate.  */
	  if (e.type_suffix (0).integer_p)
	    if (e.type_suffix (0).unsigned_p)
-	      code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0));
+	      code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, mode);
	    else
-	      code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0));
+	      code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, mode);
	  else
-	    code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0));
+	    code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, mode);
	  break;
 
	default:
	  gcc_unreachable ();
	}
-      return e.use_pred_x_insn (code);
+
+      if (e.pred == PRED_m)
+	return e.use_cond_insn (code, 0);
+      else
+	return e.use_pred_x_insn (code);
+      break;
 
    default:
      gcc_unreachable ();
    }
+}
 
-  gcc_unreachable ();
+/* Map the function directly to CODE (UNSPEC, M) where M is the vector
+   mode associated with type suffix 0, except when there is no
+   predicate and no _n suffix, in which case we use the appropriate
+   rtx_code.  This is useful when the basic operation is mapped to a
+   standard RTX code and all other versions use different unspecs.  */
+class unspec_based_mve_function_exact_insn : public unspec_based_mve_function_base
+{
+public:
+  CONSTEXPR unspec_based_mve_function_exact_insn (rtx_code code_for_sint,
+                                                  rtx_code code_for_uint,
+                                                  rtx_code code_for_fp,
+                                                  int unspec_for_n_sint,
+                                                  int unspec_for_n_uint,
+                                                  int unspec_for_n_fp,
+                                                  int unspec_for_m_sint,
+                                                  int unspec_for_m_uint,
+                                                  int unspec_for_m_fp,
+                                                  int unspec_for_m_n_sint,
+                                                  int unspec_for_m_n_uint,
+                                                  int unspec_for_m_n_fp)
+    : unspec_based_mve_function_base (code_for_sint,
+                                      code_for_uint,
+                                      code_for_fp,
+                                      -1,
+                                      -1,
+                                      -1,
+                                      unspec_for_n_sint,
+                                      unspec_for_n_uint,
+                                      unspec_for_n_fp,
+                                      unspec_for_m_sint,
+                                      unspec_for_m_uint,
+                                      unspec_for_m_fp,
+                                      unspec_for_m_n_sint,
+                                      unspec_for_m_n_uint,
+                                      unspec_for_m_n_fp)
+  {}
+
+  rtx
+  expand (function_expander &e) const override
+  {
+    /* No suffix, no predicate, use the right RTX code.  */
+    if ((e.mode_suffix_id != MODE_n)
+	&& (e.pred == PRED_none))
+      return e.map_to_rtx_codes (m_code_for_sint, m_code_for_uint,
+				 m_code_for_fp);
+
+    return expand_unspec (e);
  }
 };
 
 /* Map the function directly to CODE (UNSPEC, M) where M is the vector
    mode associated with type suffix 0.  */
-class unspec_mve_function_exact_insn : public function_base
+class unspec_mve_function_exact_insn : public unspec_based_mve_function_base
 {
 public:
   CONSTEXPR unspec_mve_function_exact_insn (int unspec_for_sint,
@@ -242,143 +258,33 @@ public:
                                             int unspec_for_m_n_sint,
                                             int unspec_for_m_n_uint,
                                             int unspec_for_m_n_fp)
-    : m_unspec_for_sint (unspec_for_sint),
-      m_unspec_for_uint (unspec_for_uint),
-      m_unspec_for_fp (unspec_for_fp),
-      m_unspec_for_n_sint (unspec_for_n_sint),
-      m_unspec_for_n_uint (unspec_for_n_uint),
-      m_unspec_for_n_fp (unspec_for_n_fp),
-      m_unspec_for_m_sint (unspec_for_m_sint),
-      m_unspec_for_m_uint (unspec_for_m_uint),
-      m_unspec_for_m_fp (unspec_for_m_fp),
-      m_unspec_for_m_n_sint (unspec_for_m_n_sint),
-      m_unspec_for_m_n_uint (unspec_for_m_n_uint),
-      m_unspec_for_m_n_fp (unspec_for_m_n_fp)
+    : unspec_based_mve_function_base (UNKNOWN,
+                                      UNKNOWN,
+                                      UNKNOWN,
+                                      unspec_for_sint,
+                                      unspec_for_uint,
+                                      unspec_for_fp,
+                                      unspec_for_n_sint,
+                                      unspec_for_n_uint,
+                                      unspec_for_n_fp,
+                                      unspec_for_m_sint,
+                                      unspec_for_m_uint,
+                                      unspec_for_m_fp,
+                                      unspec_for_m_n_sint,
+                                      unspec_for_m_n_uint,
+                                      unspec_for_m_n_fp)
  {}
 
-  /* The unspec code associated with signed-integer, unsigned-integer
-     and floating-point operations respectively.  It covers the cases
-     with the _n suffix, and/or the _m predicate.  */
-  int m_unspec_for_sint;
-  int m_unspec_for_uint;
-  int m_unspec_for_fp;
-  int m_unspec_for_n_sint;
-  int m_unspec_for_n_uint;
-  int m_unspec_for_n_fp;
-  int m_unspec_for_m_sint;
-  int m_unspec_for_m_uint;
-  int m_unspec_for_m_fp;
-  int m_unspec_for_m_n_sint;
-  int m_unspec_for_m_n_uint;
-  int m_unspec_for_m_n_fp;
-
  rtx
  expand (function_expander &e) const override
  {
-    insn_code code;
-    switch (e.pred)
-      {
-      case PRED_none:
-	switch (e.mode_suffix_id)
-	  {
-	  case MODE_none:
-	    /* No predicate, no suffix.  */
-	    if (e.type_suffix (0).integer_p)
-	      if (e.type_suffix (0).unsigned_p)
-		code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0));
-	      else
-		code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0));
-	    else
-	      code = code_for_mve_q_f (m_unspec_for_fp, e.vector_mode (0));
-	    break;
-
-	  case MODE_n:
-	    /* No predicate, _n suffix.  */
-	    if (e.type_suffix (0).integer_p)
-	      if (e.type_suffix (0).unsigned_p)
-		code = code_for_mve_q_n (m_unspec_for_n_uint, m_unspec_for_n_uint, e.vector_mode (0));
-	      else
-		code = code_for_mve_q_n (m_unspec_for_n_sint, m_unspec_for_n_sint, e.vector_mode (0));
-	    else
-	      code = code_for_mve_q_n_f (m_unspec_for_n_fp, e.vector_mode (0));
-	    break;
-
-	  default:
-	    gcc_unreachable ();
-	  }
-	return e.use_exact_insn (code);
-
-      case PRED_m:
-	switch (e.mode_suffix_id)
-	  {
-	  case MODE_none:
-	    /* No suffix, "m" predicate.  */
-	    if (e.type_suffix (0).integer_p)
-	      if (e.type_suffix (0).unsigned_p)
-		code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0));
-	      else
-		code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0));
-	    else
-	      code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0));
-	    break;
-
-	  case MODE_n:
-	    /* _n suffix, "m" predicate.  */
-	    if (e.type_suffix (0).integer_p)
-	      if (e.type_suffix (0).unsigned_p)
-		code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0));
-	      else
-		code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0));
-	    else
-	      code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0));
-	    break;
-
-	  default:
-	    gcc_unreachable ();
-	  }
-	return e.use_cond_insn (code, 0);
-
-      case PRED_x:
-	switch (e.mode_suffix_id)
-	  {
-	  case MODE_none:
-	    /* No suffix, "x" predicate.  */
-	    if (e.type_suffix (0).integer_p)
-	      if (e.type_suffix (0).unsigned_p)
-		code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0));
-	      else
-		code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0));
-	    else
-	      code = code_for_mve_q_m_f (m_unspec_for_m_fp, e.vector_mode (0));
-	    break;
-
-	  case MODE_n:
-	    /* _n suffix, "x" predicate.  */
-	    if (e.type_suffix (0).integer_p)
-	      if (e.type_suffix (0).unsigned_p)
-		code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0));
-	      else
-		code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0));
-	    else
-	      code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, e.vector_mode (0));
-	    break;
-
-	  default:
-	    gcc_unreachable ();
-	  }
-	return e.use_pred_x_insn (code);
-
-      default:
-	gcc_unreachable ();
-      }
-
-    gcc_unreachable ();
+    return expand_unspec (e);
  }
 };
 
 /* Map the function directly to CODE (UNSPEC), when there is a
    non-predicated version and one with the "_p" predicate.  */
-class unspec_mve_function_exact_insn_pred_p : public function_base
+class unspec_mve_function_exact_insn_pred_p : public unspec_based_mve_function_base
 {
 public:
   CONSTEXPR unspec_mve_function_exact_insn_pred_p (int unspec_for_sint,
@@ -387,19 +293,23 @@ public:
                                                    int unspec_for_p_sint,
                                                    int unspec_for_p_uint,
                                                    int unspec_for_p_fp)
-    : m_unspec_for_sint (unspec_for_sint),
-      m_unspec_for_uint (unspec_for_uint),
-      m_unspec_for_fp (unspec_for_fp),
+    : unspec_based_mve_function_base (UNKNOWN, /* No RTX code.  */
+                                      UNKNOWN,
+                                      UNKNOWN,
+                                      unspec_for_sint,
+                                      unspec_for_uint,
+                                      unspec_for_fp,
+                                      -1, -1, -1, /* No _n intrinsics.  */
+                                      -1, -1, -1, /* No _m intrinsics.  */
+                                      -1, -1, -1), /* No _m_n intrinsics.  */
      m_unspec_for_p_sint (unspec_for_p_sint),
      m_unspec_for_p_uint (unspec_for_p_uint),
      m_unspec_for_p_fp (unspec_for_p_fp)
  {}
 
-  /* The unspec code associated with signed-integer and unsigned-integer
-     operations, with no predicate, or with "_p" predicate.
*/ - int m_unspec_for_sint; - int m_unspec_for_uint; - int m_unspec_for_fp; + /* The unspec code associated with signed-integer and + unsigned-integer or floating-point operations with "_p" + predicate. */ int m_unspec_for_p_sint; int m_unspec_for_p_uint; int m_unspec_for_p_fp; @@ -440,45 +350,30 @@ public: gcc_unreachable (); } } - else - { - switch (e.pred) - { - case PRED_none: - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); - else - code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); - else - code = code_for_mve_q_f (m_unspec_for_fp, e.vector_mode (0)); - - return e.use_exact_insn (code); - case PRED_p: - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_p (m_unspec_for_p_uint, m_unspec_for_p_uint, e.vector_mode (0)); - else - code = code_for_mve_q_p (m_unspec_for_p_sint, m_unspec_for_p_sint, e.vector_mode (0)); - else - code = code_for_mve_q_p_f (m_unspec_for_p_fp, e.vector_mode (0)); + if (e.pred == PRED_p) + { + machine_mode mode = e.vector_mode (0); - return e.use_exact_insn (code); + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_q_p (m_unspec_for_p_uint, m_unspec_for_p_uint, mode); + else + code = code_for_mve_q_p (m_unspec_for_p_sint, m_unspec_for_p_sint, mode); + else + code = code_for_mve_q_p_f (m_unspec_for_p_fp, mode); - default: - gcc_unreachable (); - } + return e.use_exact_insn (code); } - gcc_unreachable (); + return expand_unspec (e); } }; /* Map the function directly to CODE (UNSPEC, M) for vshl-like builtins. The difference with unspec_mve_function_exact_insn is that this function handles MODE_r and the related unspecs.. 
*/ -class unspec_mve_function_exact_insn_vshl : public function_base +class unspec_mve_function_exact_insn_vshl : public unspec_based_mve_function_base { public: CONSTEXPR unspec_mve_function_exact_insn_vshl (int unspec_for_sint, @@ -493,31 +388,29 @@ public: int unspec_for_m_r_uint, int unspec_for_r_sint, int unspec_for_r_uint) - : m_unspec_for_sint (unspec_for_sint), - m_unspec_for_uint (unspec_for_uint), - m_unspec_for_n_sint (unspec_for_n_sint), - m_unspec_for_n_uint (unspec_for_n_uint), - m_unspec_for_m_sint (unspec_for_m_sint), - m_unspec_for_m_uint (unspec_for_m_uint), - m_unspec_for_m_n_sint (unspec_for_m_n_sint), - m_unspec_for_m_n_uint (unspec_for_m_n_uint), + : unspec_based_mve_function_base (UNKNOWN, + UNKNOWN, + UNKNOWN, + unspec_for_sint, + unspec_for_uint, + -1, + unspec_for_n_sint, + unspec_for_n_uint, + -1, + unspec_for_m_sint, + unspec_for_m_uint, + -1, + unspec_for_m_n_sint, + unspec_for_m_n_uint, + -1), m_unspec_for_m_r_sint (unspec_for_m_r_sint), m_unspec_for_m_r_uint (unspec_for_m_r_uint), m_unspec_for_r_sint (unspec_for_r_sint), m_unspec_for_r_uint (unspec_for_r_uint) {} - /* The unspec code associated with signed-integer, unsigned-integer - and floating-point operations respectively. It covers the cases - with the _n suffix, and/or the _m predicate. */ - int m_unspec_for_sint; - int m_unspec_for_uint; - int m_unspec_for_n_sint; - int m_unspec_for_n_uint; - int m_unspec_for_m_sint; - int m_unspec_for_m_uint; - int m_unspec_for_m_n_sint; - int m_unspec_for_m_n_uint; + /* The unspec code associated with signed-integer and unsigned-integer + operations with MODE_r with or without PRED_m. 
*/ int m_unspec_for_m_r_sint; int m_unspec_for_m_r_uint; int m_unspec_for_r_sint; @@ -527,101 +420,38 @@ public: expand (function_expander &e) const override { insn_code code; - switch (e.pred) + if (e.mode_suffix_id == MODE_r) { - case PRED_none: - switch (e.mode_suffix_id) + machine_mode mode = e.vector_mode (0); + switch (e.pred) { - case MODE_none: - /* No predicate, no suffix. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); - else - code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); - break; - - case MODE_n: - /* No predicate, _n suffix. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_n (m_unspec_for_n_uint, m_unspec_for_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_n (m_unspec_for_n_sint, m_unspec_for_n_sint, e.vector_mode (0)); - break; - - case MODE_r: + case PRED_none: /* No predicate, _r suffix. */ if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_r (m_unspec_for_r_uint, m_unspec_for_r_uint, e.vector_mode (0)); - else - code = code_for_mve_q_r (m_unspec_for_r_sint, m_unspec_for_r_sint, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_exact_insn (code); - - case PRED_m: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "m" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - break; - - case MODE_n: - /* _n suffix, "m" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - break; - - case MODE_r: - /* _r suffix, "m" predicate. 
*/ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_r (m_unspec_for_m_r_uint, m_unspec_for_m_r_uint, e.vector_mode (0)); + code = code_for_mve_q_r (m_unspec_for_r_uint, m_unspec_for_r_uint, mode); else - code = code_for_mve_q_m_r (m_unspec_for_m_r_sint, m_unspec_for_m_r_sint, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_cond_insn (code, 0); + code = code_for_mve_q_r (m_unspec_for_r_sint, m_unspec_for_r_sint, mode); + return e.use_exact_insn (code); - case PRED_x: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "x" predicate. */ + case PRED_m: + case PRED_x: + /* _r suffix, "m" or "x" predicate. */ if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); + code = code_for_mve_q_m_r (m_unspec_for_m_r_uint, m_unspec_for_m_r_uint, mode); else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - break; + code = code_for_mve_q_m_r (m_unspec_for_m_r_sint, m_unspec_for_m_r_sint, mode); - case MODE_n: - /* _n suffix, "x" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, m_unspec_for_m_n_uint, e.vector_mode (0)); + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); else - code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, m_unspec_for_m_n_sint, e.vector_mode (0)); - break; + return e.use_pred_x_insn (code); default: gcc_unreachable (); } - return e.use_pred_x_insn (code); - - default: - gcc_unreachable (); } - gcc_unreachable (); + return expand_unspec (e); } }; @@ -641,9 +471,8 @@ public: : unspec_based_mve_function_base (code_for_sint, code_for_uint, code_for_fp, - -1, - -1, - -1, + -1, -1, -1, /* No non-predicated, no mode intrinsics. */ + -1, -1, -1, /* No _n intrinsics. 
*/ unspec_for_m_sint, unspec_for_m_uint, unspec_for_m_fp, @@ -738,7 +567,9 @@ public: /* Map the function directly to CODE (UNSPEC, UNSPEC, UNSPEC, M) where M is the vector mode associated with type suffix 0. USed for the operations where there is a "rot90" or "rot270" suffix, depending - on the UNSPEC. */ + on the UNSPEC. We cannot use + unspec_based_mve_function_base::expand_unspec () because we call + code_for_mve_q with one more parameter. */ class unspec_mve_function_exact_insn_rot : public function_base { public: @@ -769,6 +600,7 @@ public: rtx expand (function_expander &e) const override { + machine_mode mode = e.vector_mode (0); insn_code code; switch (e.pred) @@ -780,11 +612,11 @@ public: /* No predicate, no suffix. */ if (e.type_suffix (0).integer_p) if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); + code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, m_unspec_for_uint, mode); else - code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); + code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, m_unspec_for_sint, mode); else - code = code_for_mve_q_f (m_unspec_for_fp, m_unspec_for_fp, e.vector_mode (0)); + code = code_for_mve_q_f (m_unspec_for_fp, m_unspec_for_fp, mode); break; default: @@ -793,42 +625,28 @@ public: return e.use_exact_insn (code); case PRED_m: + case PRED_x: switch (e.mode_suffix_id) { case MODE_none: - /* No suffix, "m" predicate. */ + /* No suffix, "m" or "x" predicate. 
*/ if (e.type_suffix (0).integer_p) if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); + code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, m_unspec_for_m_uint, mode); else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); + code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, m_unspec_for_m_sint, mode); else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, e.vector_mode (0)); - break; - - default: - gcc_unreachable (); - } - return e.use_cond_insn (code, 0); + code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, mode); - case PRED_x: - switch (e.mode_suffix_id) - { - case MODE_none: - /* No suffix, "x" predicate. */ - if (e.type_suffix (0).integer_p) - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); else - code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, e.vector_mode (0)); + return e.use_pred_x_insn (code); break; default: gcc_unreachable (); } - return e.use_pred_x_insn (code); default: gcc_unreachable (); @@ -866,6 +684,7 @@ public: rtx expand (function_expander &e) const override { + machine_mode mode = e.vector_mode (0); insn_code code; if (! e.type_suffix (0).integer_p) @@ -879,29 +698,24 @@ public: case PRED_none: /* No predicate, no suffix. 
*/ if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_int (m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); + code = code_for_mve_q_int (m_unspec_for_uint, m_unspec_for_uint, mode); else - code = code_for_mve_q_int (m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); + code = code_for_mve_q_int (m_unspec_for_sint, m_unspec_for_sint, mode); return e.use_exact_insn (code); case PRED_m: - /* No suffix, "m" predicate. */ - if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_int_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); - else - code = code_for_mve_q_int_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); - - return e.use_cond_insn (code, 0); - case PRED_x: - /* No suffix, "x" predicate. */ + /* No suffix, "m" or "x" predicate. */ if (e.type_suffix (0).unsigned_p) - code = code_for_mve_q_int_m (m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); + code = code_for_mve_q_int_m (m_unspec_for_m_uint, m_unspec_for_m_uint, mode); else - code = code_for_mve_q_int_m (m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); + code = code_for_mve_q_int_m (m_unspec_for_m_sint, m_unspec_for_m_sint, mode); - return e.use_pred_x_insn (code); + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); default: gcc_unreachable (); @@ -933,6 +747,7 @@ public: rtx expand (function_expander &e) const override { + machine_mode mode = e.vector_mode (0); insn_code code; if (e.mode_suffix_id != MODE_none) @@ -945,18 +760,18 @@ public: { case PRED_none: /* No predicate, no suffix. */ - code = code_for_mve_q_poly (m_unspec_for_poly, m_unspec_for_poly, e.vector_mode (0)); + code = code_for_mve_q_poly (m_unspec_for_poly, m_unspec_for_poly, mode); return e.use_exact_insn (code); case PRED_m: - /* No suffix, "m" predicate. 
*/ - code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, e.vector_mode (0)); - return e.use_cond_insn (code, 0); - case PRED_x: - /* No suffix, "x" predicate. */ - code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, e.vector_mode (0)); - return e.use_pred_x_insn (code); + /* No suffix, "m" or "x" predicate. */ + code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, mode); + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); default: gcc_unreachable ();

From patchwork Thu Jul 11 21:42:54 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959545
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 04/15] arm: [MVE intrinsics] factorize vcvtq
Date: Thu, 11 Jul 2024 21:42:54 +0000
Message-Id: <20240711214305.3193022-4-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

Factorize vcvtq so that it uses the parameterized names.

2024-07-11  Christophe Lyon

gcc/
	* config/arm/iterators.md (mve_insn): Add VCVTQ_FROM_F_S,
	VCVTQ_FROM_F_U, VCVTQ_M_FROM_F_S, VCVTQ_M_FROM_F_U,
	VCVTQ_M_N_FROM_F_S, VCVTQ_M_N_FROM_F_U, VCVTQ_M_N_TO_F_S,
	VCVTQ_M_N_TO_F_U, VCVTQ_M_TO_F_S, VCVTQ_M_TO_F_U,
	VCVTQ_N_FROM_F_S, VCVTQ_N_FROM_F_U, VCVTQ_N_TO_F_S,
	VCVTQ_N_TO_F_U, VCVTQ_TO_F_S, VCVTQ_TO_F_U.
	* config/arm/mve.md (mve_vcvtq_to_f_): Rename into @mve_q_to_f_.
	(mve_vcvtq_from_f_): Rename into @mve_q_from_f_.
	(mve_vcvtq_n_to_f_): Rename into @mve_q_n_to_f_.
	(mve_vcvtq_n_from_f_): Rename into @mve_q_n_from_f_.
	(mve_vcvtq_m_to_f_): Rename into @mve_q_m_to_f_.
	(mve_vcvtq_m_n_from_f_): Rename into @mve_q_m_n_from_f_.
	(mve_vcvtq_m_from_f_): Rename into @mve_q_m_from_f_.
	(mve_vcvtq_m_n_to_f_): Rename into @mve_q_m_n_to_f_.
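The point of the renames above is that a parameterized pattern name such as `@mve_<mve_insn>q_to_f_<mode>` makes genemit produce a single `code_for_mve_q_to_f ()` lookup keyed on the iterator values, so the C++ expander no longer needs one `CODE_FOR_mve_vcvtq_to_f_*` reference per intrinsic variant. The following self-contained C++ sketch models that lookup idea only; the enum values, table contents and insn-code numbers are illustrative stand-ins, not GCC's generated code:

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-ins for the int-iterator values and modes that
// parameterize the pattern name (not GCC's real identifiers).
enum unspec { VCVTQ_TO_F_S, VCVTQ_TO_F_U };
enum machine_mode_t { V8HF, V4SF };
typedef int insn_code;

// Stand-in for the mapping genemit derives from the iterators of
// "@mve_<mve_insn>q_to_f_<mode>": one entry per expanded variant.
static const std::map<std::pair<unspec, machine_mode_t>, insn_code>
  generated_table = {
    { { VCVTQ_TO_F_S, V8HF }, 100 },
    { { VCVTQ_TO_F_U, V8HF }, 101 },
    { { VCVTQ_TO_F_S, V4SF }, 102 },
    { { VCVTQ_TO_F_U, V4SF }, 103 },
  };

// Single entry point replacing per-variant CODE_FOR_* macros: the
// caller picks the unspec by signedness and passes the vector mode.
insn_code
code_for_mve_q_to_f (unspec u, machine_mode_t mode)
{
  return generated_table.at ({ u, mode });
}
```

This mirrors how the refactored expanders in the earlier hunks select an unspec based on `e.type_suffix (0)` and then make one `code_for_mve_q_* (…, e.vector_mode (0))` call for every variant.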
--- gcc/config/arm/iterators.md | 8 +++++++ gcc/config/arm/mve.md | 48 ++++++++++++++++++------------------- 2 files changed, 32 insertions(+), 24 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index b9ff01cb104..bf800625fac 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -964,6 +964,14 @@ (define_int_attr mve_insn [ (VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla") (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul") (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate") + (VCVTQ_FROM_F_S "vcvt") (VCVTQ_FROM_F_U "vcvt") + (VCVTQ_M_FROM_F_S "vcvt") (VCVTQ_M_FROM_F_U "vcvt") + (VCVTQ_M_N_FROM_F_S "vcvt") (VCVTQ_M_N_FROM_F_U "vcvt") + (VCVTQ_M_N_TO_F_S "vcvt") (VCVTQ_M_N_TO_F_U "vcvt") + (VCVTQ_M_TO_F_S "vcvt") (VCVTQ_M_TO_F_U "vcvt") + (VCVTQ_N_FROM_F_S "vcvt") (VCVTQ_N_FROM_F_U "vcvt") + (VCVTQ_N_TO_F_S "vcvt") (VCVTQ_N_TO_F_U "vcvt") + (VCVTQ_TO_F_S "vcvt") (VCVTQ_TO_F_U "vcvt") (VDUPQ_M_N_S "vdup") (VDUPQ_M_N_U "vdup") (VDUPQ_M_N_F "vdup") (VDUPQ_N_S "vdup") (VDUPQ_N_U "vdup") (VDUPQ_N_F "vdup") (VEORQ_M_S "veor") (VEORQ_M_U "veor") (VEORQ_M_F "veor") diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 4b4d6298ffb..b339d0ccdf6 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -250,15 +250,15 @@ (define_insn "mve_vcvtbq_f32_f16v4sf" ;; ;; [vcvtq_to_f_s, vcvtq_to_f_u]) ;; -(define_insn "mve_vcvtq_to_f_" +(define_insn "@mve_q_to_f_" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand: 1 "s_register_operand" "w")] VCVTQ_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt.f%#.%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_to_f_")) + ".f%#.%#\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_to_f_")) (set_attr "type" "mve_move") ]) @@ -280,15 +280,15 
@@ (define_insn "@mve_q_" ;; ;; [vcvtq_from_f_s, vcvtq_from_f_u]) ;; -(define_insn "mve_vcvtq_from_f_" +(define_insn "@mve_q_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] VCVTQ_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_from_f_")) + ".%#.f%#\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_from_f_")) (set_attr "type" "mve_move") ]) @@ -583,7 +583,7 @@ (define_insn "@mve_q_n_f" ;; ;; [vcvtq_n_to_f_s, vcvtq_n_to_f_u]) ;; -(define_insn "mve_vcvtq_n_to_f_" +(define_insn "@mve_q_n_to_f_" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand: 1 "s_register_operand" "w") @@ -591,8 +591,8 @@ (define_insn "mve_vcvtq_n_to_f_" VCVTQ_N_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt.f.\t%q0, %q1, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_to_f_")) + ".f.\t%q0, %q1, %2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_to_f_")) (set_attr "type" "mve_move") ]) @@ -681,7 +681,7 @@ (define_insn "mve_vshrq_n_u_imm" ;; ;; [vcvtq_n_from_f_s, vcvtq_n_from_f_u]) ;; -(define_insn "mve_vcvtq_n_from_f_" +(define_insn "@mve_q_n_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w") @@ -689,8 +689,8 @@ (define_insn "mve_vcvtq_n_from_f_" VCVTQ_N_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvt..f\t%q0, %q1, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_from_f_")) + "..f\t%q0, %q1, %2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_from_f_")) (set_attr "type" "mve_move") ]) @@ -1674,7 +1674,7 @@ (define_insn "mve_vcvtaq_m_" ;; ;; [vcvtq_m_to_f_s, vcvtq_m_to_f_u]) ;; -(define_insn "mve_vcvtq_m_to_f_" +(define_insn "@mve_q_m_to_f_" [ (set 
(match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") @@ -1683,8 +1683,8 @@ (define_insn "mve_vcvtq_m_to_f_" VCVTQ_M_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.f%#.%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_to_f_")) + "vpst\;t.f%#.%#\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_to_f_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2653,7 +2653,7 @@ (define_insn "mve_vcvtnq_m_" ;; ;; [vcvtq_m_n_from_f_s, vcvtq_m_n_from_f_u]) ;; -(define_insn "mve_vcvtq_m_n_from_f_" +(define_insn "@mve_q_m_n_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") @@ -2663,8 +2663,8 @@ (define_insn "mve_vcvtq_m_n_from_f_" VCVTQ_M_N_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.%#.f%#\t%q0, %q2, %3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_from_f_")) + "vpst\;t.%#.f%#\t%q0, %q2, %3" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_from_f_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2688,7 +2688,7 @@ (define_insn "@mve_q_m_" ;; ;; [vcvtq_m_from_f_u, vcvtq_m_from_f_s]) ;; -(define_insn "mve_vcvtq_m_from_f_" +(define_insn "@mve_q_m_from_f_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") @@ -2697,8 +2697,8 @@ (define_insn "mve_vcvtq_m_from_f_" VCVTQ_M_FROM_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_from_f_")) + "vpst\;t.%#.f%#\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_from_f_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2759,7 +2759,7 @@ (define_insn "@mve_q_m_n_" ;; ;; [vcvtq_m_n_to_f_u, vcvtq_m_n_to_f_s]) ;; -(define_insn 
"mve_vcvtq_m_n_to_f_" +(define_insn "@mve_q_m_n_to_f_" [ (set (match_operand:MVE_0 0 "s_register_operand" "=w") (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") @@ -2769,8 +2769,8 @@ (define_insn "mve_vcvtq_m_n_to_f_" VCVTQ_M_N_TO_F)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtt.f%#.%#\t%q0, %q2, %3" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtq_n_to_f_")) + "vpst\;t.f%#.%#\t%q0, %q2, %3" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_to_f_")) (set_attr "type" "mve_move") (set_attr "length""8")])

From patchwork Thu Jul 11 21:42:55 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959547
(mail-oo1-xc2f.google.com [IPv6:2607:f8b0:4864:20::c2f]) by sourceware.org (Postfix) with ESMTPS id 3EAFE3875DE0 for ; Thu, 11 Jul 2024 21:43:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3EAFE3875DE0 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 3EAFE3875DE0 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2f ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1720734206; cv=none; b=KO53Tvw0JjCEUSLwOKbEzqiuEaOkeJo59ECYDCl4anPxsuSRUgDoU2V0BGVaVSGHeE5TdHD+7qFS3KNJ3xT0+AGaf/KI+wrn4WgDsyUhGNYkzpcMU9WEB2T6wEJiGq1Q3Oe4CG/DEtKG+DK8xiz7GZI7Y3F2TsNSsn1qEbGQBxw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1720734206; c=relaxed/simple; bh=hAjXmVYDS18DePKKyU3gOaCGKEGNXDbbxN8wrJRzYbE=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=e0GcVQhHajpJjoCEZZ4xWbNWQRnudSDai0KzIvG/B4191/PQuxFqqp0iyfZWacOH1B5t0ZoSkHE+y/sN0RdJlQLXkvw73CxNhsBCjH2UHdeUwssGa2V7gbAzjw+QXRg1CDJ0QhiznvvHqaaJQSNpWScXzEExQKfv5GhcmZGPMo4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2f.google.com with SMTP id 006d021491bc7-5b9a35a0901so507361eaf.0 for ; Thu, 11 Jul 2024 14:43:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1720734203; x=1721339003; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/4tRVZIYZI3eydJAvtmeCfL6hTmE4LER9EuuA6sTiQE=; b=hjk45L328lObSYS+sPaBvapQpcr5uslLviEzwcsnDyu3wIOIk8IPjXiFg3JoNou3cp QmaV9YWFbH9V6QXYCerQ50LrhLXIllqbabAVrhqifliURu1VZTLBrGwhNb3e+/RrGHSr 7R7zRBjnxgctMkCd9MLcSTofzqjhm80riyp4Tk5uUyhe2svi1/m+bFCel59aH+8k/jT4 9VSrSNSFteK5Ilu9agcVAISHYCySdPjcfbzGNiohqB97hIhm3kkx0D+eTBkNdtDzotjl 
AuFccq9JSGg3EhbpOsdG/V4fcSucMj+t8EzCnCR2Id1GxVDnwxuoPCgKvBfmjIrw+Fn0 z1Hw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720734203; x=1721339003; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/4tRVZIYZI3eydJAvtmeCfL6hTmE4LER9EuuA6sTiQE=; b=FogFOOse0PCEjne0MXrAZWHWd6AqZkx8hpKYM8tAJ+fIxchpWaU2jQzfbQQMQmwl6U M8uESeLmVujYFDAxPdstoYSDtWG5kZ4hZg+DTRanB0GrAlNqO1pcYMidmk8mJN04MPS8 prKGQKSZGb1vH42MPh0jtS3b3VuGkJcxcZh8u6Ovz21lc2KvRx0N1YTuERkLYMPrhyXi bQ7tJ2R3Rx4Za0njPFiJ4lpRgm8MXcEjWNuiUb1yESKbTqOzeY7iZvFMatKBCiINQYk0 S9Kb1TQqivDvESLgm0Z4ehT/9ofJ+8FDVzHeZYwr+Mkpb6XFptmryAN44T2mqc4aPcCX GPcQ== X-Gm-Message-State: AOJu0YzyvnmAaafloJ1OSbWmrOU9UZ34C48tlLkgCJH7+XTR7alltdQe 1iPS7HbUE5bMAfXC/MXFm3z8ntI+nIn8Or9HU9mlDqojxtlhzSs8QsmVSoC9fqrHdtRd1Ggn6ik PAAPieg== X-Google-Smtp-Source: AGHT+IETftvwYd1PKMFhFkjzvnwOTQFMMbcvRs0r4NgluSUOW2KcgHgGkXUt/RU6N7FecfHiwdCbEQ== X-Received: by 2002:a05:6820:1792:b0:5c6:60d9:b0e1 with SMTP id 006d021491bc7-5cce3e1cfedmr268171eaf.2.1720734202803; Thu, 11 Jul 2024 14:43:22 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5c7b606db30sm540950eaf.8.2024.07.11.14.43.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Jul 2024 14:43:22 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH 05/15] arm: [MVE intrinsics] add vcvt shape Date: Thu, 11 Jul 2024 21:42:55 +0000 Message-Id: <20240711214305.3193022-5-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, 
This patch adds the vcvt shape description. It also adds a new type_suffix_info parameter to explicit_type_suffix_p (), because vcvt overloads the type suffixes for integer-to-floating-point conversions, but not for floating-point-to-integer ones.

2024-07-11  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-shapes.cc
	(nonoverloaded_base::explicit_type_suffix_p): Add unused
	type_suffix_info parameter.
	(overloaded_base::explicit_type_suffix_p): Likewise.
	(unary_n_def::explicit_type_suffix_p): Likewise.
	(vcvt): New.
	* config/arm/arm-mve-builtins-shapes.h (vcvt): New.
	* config/arm/arm-mve-builtins.cc (function_builder::get_name): Add
	new type_suffix parameter.
	(function_builder::add_overloaded_functions): Likewise.
	* config/arm/arm-mve-builtins.h
	(function_shape::explicit_type_suffix_p): Likewise.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 108 +++++++++++++++++++++-
 gcc/config/arm/arm-mve-builtins-shapes.h  |   1 +
 gcc/config/arm/arm-mve-builtins.cc        |   9 +-
 gcc/config/arm/arm-mve-builtins.h         |  10 +-
 4 files changed, 119 insertions(+), 9 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 0520a8331db..e1c5dd2c0f2 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -330,7 +330,8 @@ build_16_32 (function_builder &b, const char *signature,
 struct nonoverloaded_base : public function_shape
 {
   bool
-  explicit_type_suffix_p (unsigned int, enum predication_index, enum mode_suffix_index) const override
+  explicit_type_suffix_p (unsigned int, enum predication_index,
+			  enum mode_suffix_index, type_suffix_info) const override
   {
     return true;
   }
@@ -360,7 +361,8 @@ template
 struct overloaded_base : public function_shape
 {
   bool
-  explicit_type_suffix_p (unsigned int i, enum predication_index, enum mode_suffix_index) const override
+  explicit_type_suffix_p (unsigned int i, enum predication_index,
+			  enum mode_suffix_index, type_suffix_info) const override
   {
     return (EXPLICIT_MASK >> i) & 1;
   }
@@ -1856,7 +1858,7 @@ struct unary_n_def : public overloaded_base<0>
 {
   bool
   explicit_type_suffix_p (unsigned int, enum predication_index pred,
-			  enum mode_suffix_index) const override
+			  enum mode_suffix_index, type_suffix_info) const override
   {
     return pred != PRED_m;
   }
@@ -1979,6 +1981,106 @@ struct unary_widen_acc_def : public overloaded_base<0>
 };
 SHAPE (unary_widen_acc)

+/* _t foo_t0[_t1](_t)
+   _t foo_t0_n[_t1](_t, const int)
+
+   Example: vcvtq.
+   float32x4_t [__arm_]vcvtq[_f32_s32](int32x4_t a)
+   float32x4_t [__arm_]vcvtq_m[_f32_s32](float32x4_t inactive, int32x4_t a, mve_pred16_t p)
+   float32x4_t [__arm_]vcvtq_x[_f32_s32](int32x4_t a, mve_pred16_t p)
+   float32x4_t [__arm_]vcvtq_n[_f32_s32](int32x4_t a, const int imm6)
+   float32x4_t [__arm_]vcvtq_m_n[_f32_s32](float32x4_t inactive, int32x4_t a, const int imm6, mve_pred16_t p)
+   float32x4_t [__arm_]vcvtq_x_n[_f32_s32](int32x4_t a, const int imm6, mve_pred16_t p)
+   int32x4_t [__arm_]vcvtq_s32_f32(float32x4_t a)
+   int32x4_t [__arm_]vcvtq_m[_s32_f32](int32x4_t inactive, float32x4_t a, mve_pred16_t p)
+   int32x4_t [__arm_]vcvtq_x_s32_f32(float32x4_t a, mve_pred16_t p)
+   int32x4_t [__arm_]vcvtq_n_s32_f32(float32x4_t a, const int imm6)
+   int32x4_t [__arm_]vcvtq_m_n[_s32_f32](int32x4_t inactive, float32x4_t a, const int imm6, mve_pred16_t p)
+   int32x4_t [__arm_]vcvtq_x_n_s32_f32(float32x4_t a, const int imm6, mve_pred16_t p)  */
+struct vcvt_def : public overloaded_base<0>
+{
+  bool
+  explicit_type_suffix_p (unsigned int i, enum predication_index pred,
+			  enum mode_suffix_index,
+			  type_suffix_info type_info) const override
+  {
+    if (pred != PRED_m
+	&& ((i == 0 && type_info.integer_p)
+	    || (i == 1 && type_info.float_p)))
+      return true;
+    return false;
+  }
+
+  bool
+  explicit_mode_suffix_p (enum predication_index,
+			  enum mode_suffix_index) const override
+  {
+    return true;
+  }
+
+  void
+  build (function_builder &b, const function_group_info &group,
+	 bool preserve_user_namespace) const override
+  {
+    b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+    b.add_overloaded_functions (group, MODE_n, preserve_user_namespace);
+    build_all (b, "v0,v1", group, MODE_none, preserve_user_namespace);
+    build_all (b, "v0,v1,ss8", group, MODE_n, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    unsigned int i, nargs;
+    type_suffix_index from_type;
+    tree res;
+    unsigned int nimm = (r.mode_suffix_id == MODE_none) ? 0 : 1;
+
+    if (!r.check_gp_argument (1 + nimm, i, nargs)
+	|| (from_type
+	    = r.infer_vector_type (i - nimm)) == NUM_TYPE_SUFFIXES)
+      return error_mark_node;
+
+    if (nimm > 0
+	&& !r.require_integer_immediate (i))
+      return error_mark_node;
+
+    type_suffix_index to_type;
+
+    if (type_suffixes[from_type].integer_p)
+      {
+	to_type = find_type_suffix (TYPE_float,
+				    type_suffixes[from_type].element_bits);
+      }
+    else
+      {
+	/* This should not happen: when 'from_type' is float, the type
+	   suffixes are not overloaded (except for "m" predication,
+	   handled above).  */
+	gcc_assert (r.pred == PRED_m);

+	/* Get the return type from the 'inactive' argument.  */
+	to_type = r.infer_vector_type (0);
+      }
+
+    if ((res = r.lookup_form (r.mode_suffix_id, to_type, from_type)))
+      return res;
+
+    return r.report_no_such_form (from_type);
+  }
+
+  bool
+  check (function_checker &c) const override
+  {
+    if (c.mode_suffix_id == MODE_none)
+      return true;
+
+    unsigned int bits = c.type_suffix (0).element_bits;
+    return c.require_immediate_range (1, 1, bits);
+  }
+};
+SHAPE (vcvt)

+/* _t vfoo[_t0](_t, _t, mve_pred16_t)
+
+   i.e.
a version of the standard ternary shape in which

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 61aa4fa73b3..9a112ceeb29 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -77,6 +77,7 @@ namespace arm_mve
   extern const function_shape *const unary_n;
   extern const function_shape *const unary_widen;
   extern const function_shape *const unary_widen_acc;
+  extern const function_shape *const vcvt;
   extern const function_shape *const vpsel;
 } /* end namespace arm_mve::shapes */

diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc
index 7e8217666fe..ea44f463dd8 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -823,7 +823,8 @@ function_builder::get_name (const function_instance &instance,
   for (unsigned int i = 0; i < 2; ++i)
     if (!overloaded_p
	 || instance.shape->explicit_type_suffix_p (i, instance.pred,
-						    instance.mode_suffix_id))
+						    instance.mode_suffix_id,
+						    instance.type_suffix (i)))
       append_name (instance.type_suffix (i).string);
   return finish_name ();
 }
@@ -1001,9 +1002,11 @@ function_builder::add_overloaded_functions (const function_group_info &group,
   for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
     {
       unsigned int explicit_type0
-	= (*group.shape)->explicit_type_suffix_p (0, group.preds[pi], mode);
+	= (*group.shape)->explicit_type_suffix_p (0, group.preds[pi], mode,
+						  type_suffixes[NUM_TYPE_SUFFIXES]);
       unsigned int explicit_type1
-	= (*group.shape)->explicit_type_suffix_p (1, group.preds[pi], mode);
+	= (*group.shape)->explicit_type_suffix_p (1, group.preds[pi], mode,
+						  type_suffixes[NUM_TYPE_SUFFIXES]);

       if ((*group.shape)->skip_overload_p (group.preds[pi], mode))
	 continue;

diff --git a/gcc/config/arm/arm-mve-builtins.h b/gcc/config/arm/arm-mve-builtins.h
index f282236a843..3306736bff0 100644
--- a/gcc/config/arm/arm-mve-builtins.h
+++ b/gcc/config/arm/arm-mve-builtins.h
@@ -571,9
+571,13 @@ public:
 class function_shape
 {
 public:
-  virtual bool explicit_type_suffix_p (unsigned int, enum predication_index, enum mode_suffix_index) const = 0;
-  virtual bool explicit_mode_suffix_p (enum predication_index, enum mode_suffix_index) const = 0;
-  virtual bool skip_overload_p (enum predication_index, enum mode_suffix_index) const = 0;
+  virtual bool explicit_type_suffix_p (unsigned int, enum predication_index,
+				       enum mode_suffix_index,
+				       type_suffix_info) const = 0;
+  virtual bool explicit_mode_suffix_p (enum predication_index,
+				       enum mode_suffix_index) const = 0;
+  virtual bool skip_overload_p (enum predication_index,
+				enum mode_suffix_index) const = 0;

   /* Define all functions associated with the given group.  */
   virtual void build (function_builder &,

From patchwork Thu Jul 11 21:42:56 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959558
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 06/15] arm: [MVE intrinsics] rework vcvtq
Date: Thu, 11 Jul 2024 21:42:56 +0000
Message-Id: <20240711214305.3193022-6-christophe.lyon@linaro.org>
In-Reply-To:
<20240711214305.3193022-1-christophe.lyon@linaro.org>

Implement vcvtq using the new MVE builtins framework.

2024-07-11  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-base.cc (class vcvtq_impl): New.
	(vcvtq): New.
	* config/arm/arm-mve-builtins-base.def (vcvtq): New.
	* config/arm/arm-mve-builtins-base.h (vcvtq): New.
	* config/arm/arm-mve-builtins.cc (cvt): New type.
	* config/arm/arm_mve.h (vcvtq): Delete.
	(vcvtq_n): Delete.
	(vcvtq_m): Delete.
	(vcvtq_m_n): Delete.
	(vcvtq_x): Delete.
	(vcvtq_x_n): Delete.
	(vcvtq_f16_s16): Delete.
	(vcvtq_f32_s32): Delete.
	(vcvtq_f16_u16): Delete.
	(vcvtq_f32_u32): Delete.
	(vcvtq_s16_f16): Delete.
	(vcvtq_s32_f32): Delete.
	(vcvtq_u16_f16): Delete.
	(vcvtq_u32_f32): Delete.
	(vcvtq_n_f16_s16): Delete.
	(vcvtq_n_f32_s32): Delete.
	(vcvtq_n_f16_u16): Delete.
	(vcvtq_n_f32_u32): Delete.
	(vcvtq_n_s16_f16): Delete.
	(vcvtq_n_s32_f32): Delete.
	(vcvtq_n_u16_f16): Delete.
	(vcvtq_n_u32_f32): Delete.
	(vcvtq_m_f16_s16): Delete.
	(vcvtq_m_f16_u16): Delete.
	(vcvtq_m_f32_s32): Delete.
	(vcvtq_m_f32_u32): Delete.
	(vcvtq_m_s16_f16): Delete.
	(vcvtq_m_u16_f16): Delete.
	(vcvtq_m_s32_f32): Delete.
	(vcvtq_m_u32_f32): Delete.
	(vcvtq_m_n_f16_u16): Delete.
	(vcvtq_m_n_f16_s16): Delete.
	(vcvtq_m_n_f32_u32): Delete.
	(vcvtq_m_n_f32_s32): Delete.
	(vcvtq_m_n_s32_f32): Delete.
	(vcvtq_m_n_s16_f16): Delete.
	(vcvtq_m_n_u32_f32): Delete.
	(vcvtq_m_n_u16_f16): Delete.
	(vcvtq_x_f16_u16): Delete.
	(vcvtq_x_f16_s16): Delete.
	(vcvtq_x_f32_s32): Delete.
	(vcvtq_x_f32_u32): Delete.
	(vcvtq_x_n_f16_s16): Delete.
	(vcvtq_x_n_f16_u16): Delete.
	(vcvtq_x_n_f32_s32): Delete.
	(vcvtq_x_n_f32_u32): Delete.
	(vcvtq_x_s16_f16): Delete.
	(vcvtq_x_s32_f32): Delete.
	(vcvtq_x_u16_f16): Delete.
	(vcvtq_x_u32_f32): Delete.
	(vcvtq_x_n_s16_f16): Delete.
	(vcvtq_x_n_s32_f32): Delete.
	(vcvtq_x_n_u16_f16): Delete.
	(vcvtq_x_n_u32_f32): Delete.
	(__arm_vcvtq_f16_s16): Delete.
	(__arm_vcvtq_f32_s32): Delete.
	(__arm_vcvtq_f16_u16): Delete.
	(__arm_vcvtq_f32_u32): Delete.
	(__arm_vcvtq_s16_f16): Delete.
	(__arm_vcvtq_s32_f32): Delete.
	(__arm_vcvtq_u16_f16): Delete.
	(__arm_vcvtq_u32_f32): Delete.
	(__arm_vcvtq_n_f16_s16): Delete.
	(__arm_vcvtq_n_f32_s32): Delete.
	(__arm_vcvtq_n_f16_u16): Delete.
	(__arm_vcvtq_n_f32_u32): Delete.
	(__arm_vcvtq_n_s16_f16): Delete.
	(__arm_vcvtq_n_s32_f32): Delete.
	(__arm_vcvtq_n_u16_f16): Delete.
	(__arm_vcvtq_n_u32_f32): Delete.
	(__arm_vcvtq_m_f16_s16): Delete.
	(__arm_vcvtq_m_f16_u16): Delete.
	(__arm_vcvtq_m_f32_s32): Delete.
	(__arm_vcvtq_m_f32_u32): Delete.
	(__arm_vcvtq_m_s16_f16): Delete.
	(__arm_vcvtq_m_u16_f16): Delete.
	(__arm_vcvtq_m_s32_f32): Delete.
	(__arm_vcvtq_m_u32_f32): Delete.
	(__arm_vcvtq_m_n_f16_u16): Delete.
	(__arm_vcvtq_m_n_f16_s16): Delete.
	(__arm_vcvtq_m_n_f32_u32): Delete.
	(__arm_vcvtq_m_n_f32_s32): Delete.
	(__arm_vcvtq_m_n_s32_f32): Delete.
	(__arm_vcvtq_m_n_s16_f16): Delete.
	(__arm_vcvtq_m_n_u32_f32): Delete.
	(__arm_vcvtq_m_n_u16_f16): Delete.
	(__arm_vcvtq_x_f16_u16): Delete.
	(__arm_vcvtq_x_f16_s16): Delete.
	(__arm_vcvtq_x_f32_s32): Delete.
	(__arm_vcvtq_x_f32_u32): Delete.
	(__arm_vcvtq_x_n_f16_s16): Delete.
	(__arm_vcvtq_x_n_f16_u16): Delete.
	(__arm_vcvtq_x_n_f32_s32): Delete.
	(__arm_vcvtq_x_n_f32_u32): Delete.
	(__arm_vcvtq_x_s16_f16): Delete.
	(__arm_vcvtq_x_s32_f32): Delete.
	(__arm_vcvtq_x_u16_f16): Delete.
	(__arm_vcvtq_x_u32_f32): Delete.
	(__arm_vcvtq_x_n_s16_f16): Delete.
	(__arm_vcvtq_x_n_s32_f32): Delete.
	(__arm_vcvtq_x_n_u16_f16): Delete.
	(__arm_vcvtq_x_n_u32_f32): Delete.
	(__arm_vcvtq): Delete.
	(__arm_vcvtq_n): Delete.
	(__arm_vcvtq_m): Delete.
	(__arm_vcvtq_m_n): Delete.
	(__arm_vcvtq_x): Delete.
	(__arm_vcvtq_x_n): Delete.
---
 gcc/config/arm/arm-mve-builtins-base.cc  | 113 ++++
 gcc/config/arm/arm-mve-builtins-base.def |   1 +
 gcc/config/arm/arm-mve-builtins-base.h   |   1 +
 gcc/config/arm/arm-mve-builtins.cc       |  15 +
 gcc/config/arm/arm_mve.h                 | 666 -----------------------
 5 files changed, 130 insertions(+), 666 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index e0ae593a6c0..a780d686eb1 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -139,6 +139,118 @@ public:
   }
 };

+/* Implements vcvtq intrinsics.  */
+class vcvtq_impl : public function_base
+{
+public:
+  rtx
+  expand (function_expander &e) const override
+  {
+    insn_code code;
+    machine_mode target_mode = e.vector_mode (0);
+    int unspec;
+    switch (e.pred)
+      {
+      case PRED_none:
+	switch (e.mode_suffix_id)
+	  {
+	  case MODE_none:
+	    /* No predicate, no suffix.  */
+	    if (e.type_suffix (0).integer_p)
+	      {
+		unspec = e.type_suffix (0).unsigned_p
+		  ? VCVTQ_FROM_F_U
+		  : VCVTQ_FROM_F_S;
+		code = code_for_mve_q_from_f (unspec, unspec, target_mode);
+	      }
+	    else
+	      {
+		unspec = e.type_suffix (1).unsigned_p
+		  ? VCVTQ_TO_F_U
+		  : VCVTQ_TO_F_S;
+		code = code_for_mve_q_to_f (unspec, unspec, target_mode);
+	      }
+	    break;

+	  case MODE_n:
+	    /* No predicate, _n suffix.  */
+	    if (e.type_suffix (0).integer_p)
+	      {
+		unspec = e.type_suffix (0).unsigned_p
+		  ? VCVTQ_N_FROM_F_U
+		  : VCVTQ_N_FROM_F_S;
+		code = code_for_mve_q_n_from_f (unspec, unspec, target_mode);
+	      }
+	    else
+	      {
+		unspec = e.type_suffix (1).unsigned_p
+		  ?
VCVTQ_N_TO_F_U
+		  : VCVTQ_N_TO_F_S;
+		code = code_for_mve_q_n_to_f (unspec, unspec, target_mode);
+	      }
+	    break;

+	  default:
+	    gcc_unreachable ();
+	  }
+	return e.use_exact_insn (code);

+      case PRED_m:
+      case PRED_x:
+	switch (e.mode_suffix_id)
+	  {
+	  case MODE_none:
+	    /* No suffix, "m" or "x" predicate.  */
+	    if (e.type_suffix (0).integer_p)
+	      {
+		unspec = e.type_suffix (0).unsigned_p
+		  ? VCVTQ_M_FROM_F_U
+		  : VCVTQ_M_FROM_F_S;
+		code = code_for_mve_q_m_from_f (unspec, unspec, target_mode);
+	      }
+	    else
+	      {
+		unspec = e.type_suffix (1).unsigned_p
+		  ? VCVTQ_M_TO_F_U
+		  : VCVTQ_M_TO_F_S;
+		code = code_for_mve_q_m_to_f (unspec, unspec, target_mode);
+	      }
+	    break;

+	  case MODE_n:
+	    /* _n suffix, "m" or "x" predicate.  */
+	    if (e.type_suffix (0).integer_p)
+	      {
+		unspec = e.type_suffix (0).unsigned_p
+		  ? VCVTQ_M_N_FROM_F_U
+		  : VCVTQ_M_N_FROM_F_S;
+		code = code_for_mve_q_m_n_from_f (unspec, unspec, target_mode);
+	      }
+	    else
+	      {
+		unspec = e.type_suffix (1).unsigned_p
+		  ? VCVTQ_M_N_TO_F_U
+		  : VCVTQ_M_N_TO_F_S;
+		code = code_for_mve_q_m_n_to_f (unspec, unspec, target_mode);
+	      }
+	    break;

+	  default:
+	    gcc_unreachable ();
+	  }
+	if (e.pred == PRED_m)
+	  return e.use_cond_insn (code, 0);
+	else
+	  return e.use_pred_x_insn (code);

+      default:
+	gcc_unreachable ();
+      }

+    gcc_unreachable ();
+  }
+};

 } /* end anonymous namespace */

 namespace arm_mve {

@@ -339,6 +451,7 @@ FUNCTION (vcmpltq, unspec_based_mve_function_exact_insn_vcmp, (LT, UNKNOWN, LT,
 FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNKNOWN, UNKNOWN, VCMPCSQ_M_U, UNKNOWN, UNKNOWN, VCMPCSQ_M_N_U, UNKNOWN))
 FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN))
 FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ)
+FUNCTION (vcvtq, vcvtq_impl,)
 FUNCTION_ONLY_N (vdupq, VDUPQ)
 FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ)
 FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1,
VFMAQ_M_F, -1, -1, VFMAQ_M_N_F))

diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index 90d031eebec..4aaf02eab7b 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -179,6 +179,7 @@ DEF_MVE_FUNCTION (vcmpleq, cmp, all_float, m_or_none)
 DEF_MVE_FUNCTION (vcmpltq, cmp, all_float, m_or_none)
 DEF_MVE_FUNCTION (vcmpneq, cmp, all_float, m_or_none)
 DEF_MVE_FUNCTION (vcreateq, create, all_float, none)
+DEF_MVE_FUNCTION (vcvtq, vcvt, cvt, mx_or_none)
 DEF_MVE_FUNCTION (vdupq, unary_n, all_float, mx_or_none)
 DEF_MVE_FUNCTION (veorq, binary, all_float, mx_or_none)
 DEF_MVE_FUNCTION (vfmaq, ternary_opt_n, all_float, m_or_none)

diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index c9b52a81c5e..dee73d9c457 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -54,6 +54,7 @@ extern const function_base *const vcmulq_rot180;
 extern const function_base *const vcmulq_rot270;
 extern const function_base *const vcmulq_rot90;
 extern const function_base *const vcreateq;
+extern const function_base *const vcvtq;
 extern const function_base *const vdupq;
 extern const function_base *const veorq;
 extern const function_base *const vfmaq;

diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc
index ea44f463dd8..3c5b54dade1 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -205,6 +205,20 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = {
 #define TYPES_signed_32(S, D) \
   S (s32)

+/* All the type combinations allowed by vcvtq.
*/
+#define TYPES_cvt(S, D) \
+  D (f16, s16), \
+  D (f16, u16), \
+  \
+  D (f32, s32), \
+  D (f32, u32), \
+  \
+  D (s16, f16), \
+  D (s32, f32), \
+  \
+  D (u16, f16), \
+  D (u32, f32)

 #define TYPES_reinterpret_signed1(D, A) \
   D (A, s8), D (A, s16), D (A, s32), D (A, s64)
@@ -284,6 +298,7 @@ DEF_MVE_TYPES_ARRAY (integer_32);
 DEF_MVE_TYPES_ARRAY (poly_8_16);
 DEF_MVE_TYPES_ARRAY (signed_16_32);
 DEF_MVE_TYPES_ARRAY (signed_32);
+DEF_MVE_TYPES_ARRAY (cvt);
 DEF_MVE_TYPES_ARRAY (reinterpret_integer);
 DEF_MVE_TYPES_ARRAY (reinterpret_float);

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index ae1b5438797..07897f510f5 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -139,18 +139,12 @@
 #define vshlcq_m(__a, __b, __imm, __p) __arm_vshlcq_m(__a, __b, __imm, __p)
 #define vcvttq_f32(__a) __arm_vcvttq_f32(__a)
 #define vcvtbq_f32(__a) __arm_vcvtbq_f32(__a)
-#define vcvtq(__a) __arm_vcvtq(__a)
-#define vcvtq_n(__a, __imm6) __arm_vcvtq_n(__a, __imm6)
 #define vcvtaq_m(__inactive, __a, __p) __arm_vcvtaq_m(__inactive, __a, __p)
-#define vcvtq_m(__inactive, __a, __p) __arm_vcvtq_m(__inactive, __a, __p)
 #define vcvtbq_m(__a, __b, __p) __arm_vcvtbq_m(__a, __b, __p)
 #define vcvttq_m(__a, __b, __p) __arm_vcvttq_m(__a, __b, __p)
 #define vcvtmq_m(__inactive, __a, __p) __arm_vcvtmq_m(__inactive, __a, __p)
 #define vcvtnq_m(__inactive, __a, __p) __arm_vcvtnq_m(__inactive, __a, __p)
 #define vcvtpq_m(__inactive, __a, __p) __arm_vcvtpq_m(__inactive, __a, __p)
-#define vcvtq_m_n(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n(__inactive, __a, __imm6, __p)
-#define vcvtq_x(__a, __p) __arm_vcvtq_x(__a, __p)
-#define vcvtq_x_n(__a, __imm6, __p) __arm_vcvtq_x_n(__a, __imm6, __p)
 #define vst4q_s8( __addr, __value) __arm_vst4q_s8( __addr, __value)
@@ -163,10 +157,6 @@
 #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value)
 #define vcvttq_f32_f16(__a) __arm_vcvttq_f32_f16(__a)
 #define vcvtbq_f32_f16(__a) __arm_vcvtbq_f32_f16(__a)
-#define
vcvtq_f16_s16(__a) __arm_vcvtq_f16_s16(__a)
-#define vcvtq_f32_s32(__a) __arm_vcvtq_f32_s32(__a)
-#define vcvtq_f16_u16(__a) __arm_vcvtq_f16_u16(__a)
-#define vcvtq_f32_u32(__a) __arm_vcvtq_f32_u32(__a)
 #define vcvtaq_s16_f16(__a) __arm_vcvtaq_s16_f16(__a)
 #define vcvtaq_s32_f32(__a) __arm_vcvtaq_s32_f32(__a)
 #define vcvtnq_s16_f16(__a) __arm_vcvtnq_s16_f16(__a)
@@ -175,10 +165,6 @@
 #define vcvtpq_s32_f32(__a) __arm_vcvtpq_s32_f32(__a)
 #define vcvtmq_s16_f16(__a) __arm_vcvtmq_s16_f16(__a)
 #define vcvtmq_s32_f32(__a) __arm_vcvtmq_s32_f32(__a)
-#define vcvtq_s16_f16(__a) __arm_vcvtq_s16_f16(__a)
-#define vcvtq_s32_f32(__a) __arm_vcvtq_s32_f32(__a)
-#define vcvtq_u16_f16(__a) __arm_vcvtq_u16_f16(__a)
-#define vcvtq_u32_f32(__a) __arm_vcvtq_u32_f32(__a)
 #define vcvtpq_u16_f16(__a) __arm_vcvtpq_u16_f16(__a)
 #define vcvtpq_u32_f32(__a) __arm_vcvtpq_u32_f32(__a)
 #define vcvtnq_u16_f16(__a) __arm_vcvtnq_u16_f16(__a)
@@ -192,14 +178,6 @@
 #define vctp64q(__a) __arm_vctp64q(__a)
 #define vctp8q(__a) __arm_vctp8q(__a)
 #define vpnot(__a) __arm_vpnot(__a)
-#define vcvtq_n_f16_s16(__a, __imm6) __arm_vcvtq_n_f16_s16(__a, __imm6)
-#define vcvtq_n_f32_s32(__a, __imm6) __arm_vcvtq_n_f32_s32(__a, __imm6)
-#define vcvtq_n_f16_u16(__a, __imm6) __arm_vcvtq_n_f16_u16(__a, __imm6)
-#define vcvtq_n_f32_u32(__a, __imm6) __arm_vcvtq_n_f32_u32(__a, __imm6)
-#define vcvtq_n_s16_f16(__a, __imm6) __arm_vcvtq_n_s16_f16(__a, __imm6)
-#define vcvtq_n_s32_f32(__a, __imm6) __arm_vcvtq_n_s32_f32(__a, __imm6)
-#define vcvtq_n_u16_f16(__a, __imm6) __arm_vcvtq_n_u16_f16(__a, __imm6)
-#define vcvtq_n_u32_f32(__a, __imm6) __arm_vcvtq_n_u32_f32(__a, __imm6)
 #define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b)
 #define vbicq_u8(__a, __b) __arm_vbicq_u8(__a, __b)
 #define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b)
@@ -234,10 +212,6 @@
 #define vcvtaq_m_u16_f16(__inactive, __a, __p) __arm_vcvtaq_m_u16_f16(__inactive, __a, __p)
 #define vcvtaq_m_s32_f32(__inactive, __a, __p) __arm_vcvtaq_m_s32_f32(__inactive, __a, __p)
 #define vcvtaq_m_u32_f32(__inactive, __a, __p) __arm_vcvtaq_m_u32_f32(__inactive, __a, __p)
-#define vcvtq_m_f16_s16(__inactive, __a, __p) __arm_vcvtq_m_f16_s16(__inactive, __a, __p)
-#define vcvtq_m_f16_u16(__inactive, __a, __p) __arm_vcvtq_m_f16_u16(__inactive, __a, __p)
-#define vcvtq_m_f32_s32(__inactive, __a, __p) __arm_vcvtq_m_f32_s32(__inactive, __a, __p)
-#define vcvtq_m_f32_u32(__inactive, __a, __p) __arm_vcvtq_m_f32_u32(__inactive, __a, __p)
 #define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm)
 #define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm)
 #define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm)
@@ -251,23 +225,15 @@
 #define vcvtmq_m_s16_f16(__inactive, __a, __p) __arm_vcvtmq_m_s16_f16(__inactive, __a, __p)
 #define vcvtnq_m_s16_f16(__inactive, __a, __p) __arm_vcvtnq_m_s16_f16(__inactive, __a, __p)
 #define vcvtpq_m_s16_f16(__inactive, __a, __p) __arm_vcvtpq_m_s16_f16(__inactive, __a, __p)
-#define vcvtq_m_s16_f16(__inactive, __a, __p) __arm_vcvtq_m_s16_f16(__inactive, __a, __p)
 #define vcvtmq_m_u16_f16(__inactive, __a, __p) __arm_vcvtmq_m_u16_f16(__inactive, __a, __p)
 #define vcvtnq_m_u16_f16(__inactive, __a, __p) __arm_vcvtnq_m_u16_f16(__inactive, __a, __p)
 #define vcvtpq_m_u16_f16(__inactive, __a, __p) __arm_vcvtpq_m_u16_f16(__inactive, __a, __p)
-#define vcvtq_m_u16_f16(__inactive, __a, __p) __arm_vcvtq_m_u16_f16(__inactive, __a, __p)
 #define vcvtmq_m_s32_f32(__inactive, __a, __p) __arm_vcvtmq_m_s32_f32(__inactive, __a, __p)
 #define vcvtnq_m_s32_f32(__inactive, __a, __p) __arm_vcvtnq_m_s32_f32(__inactive, __a, __p)
 #define vcvtpq_m_s32_f32(__inactive, __a, __p) __arm_vcvtpq_m_s32_f32(__inactive, __a, __p)
-#define vcvtq_m_s32_f32(__inactive, __a, __p) __arm_vcvtq_m_s32_f32(__inactive, __a, __p)
 #define vcvtmq_m_u32_f32(__inactive, __a, __p) __arm_vcvtmq_m_u32_f32(__inactive, __a, __p)
 #define vcvtnq_m_u32_f32(__inactive, __a, __p) __arm_vcvtnq_m_u32_f32(__inactive, __a, __p)
 #define vcvtpq_m_u32_f32(__inactive, __a, __p) __arm_vcvtpq_m_u32_f32(__inactive, __a, __p)
-#define vcvtq_m_u32_f32(__inactive, __a, __p) __arm_vcvtq_m_u32_f32(__inactive, __a, __p)
-#define vcvtq_m_n_f16_u16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f16_u16(__inactive, __a, __imm6, __p)
-#define vcvtq_m_n_f16_s16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f16_s16(__inactive, __a, __imm6, __p)
-#define vcvtq_m_n_f32_u32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f32_u32(__inactive, __a, __imm6, __p)
-#define vcvtq_m_n_f32_s32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_f32_s32(__inactive, __a, __imm6, __p)
 #define vbicq_m_s8(__inactive, __a, __b, __p) __arm_vbicq_m_s8(__inactive, __a, __b, __p)
 #define vbicq_m_s32(__inactive, __a, __b, __p) __arm_vbicq_m_s32(__inactive, __a, __b, __p)
 #define vbicq_m_s16(__inactive, __a, __b, __p) __arm_vbicq_m_s16(__inactive, __a, __b, __p)
@@ -282,10 +248,6 @@
 #define vornq_m_u16(__inactive, __a, __b, __p) __arm_vornq_m_u16(__inactive, __a, __b, __p)
 #define vbicq_m_f32(__inactive, __a, __b, __p) __arm_vbicq_m_f32(__inactive, __a, __b, __p)
 #define vbicq_m_f16(__inactive, __a, __b, __p) __arm_vbicq_m_f16(__inactive, __a, __b, __p)
-#define vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p)
-#define vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p)
-#define vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p)
-#define vcvtq_m_n_u16_f16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_u16_f16(__inactive, __a, __imm6, __p)
 #define vornq_m_f32(__inactive, __a, __b, __p) __arm_vornq_m_f32(__inactive, __a, __b, __p)
 #define vornq_m_f16(__inactive, __a, __b, __p) __arm_vornq_m_f16(__inactive, __a, __b, __p)
 #define vstrbq_s8( __addr, __value) __arm_vstrbq_s8( __addr, __value)
@@ -600,22 +562,6 @@
 #define vcvtmq_x_u32_f32(__a, __p) __arm_vcvtmq_x_u32_f32(__a, __p)
 #define vcvtbq_x_f32_f16(__a, __p) __arm_vcvtbq_x_f32_f16(__a, __p)
 #define vcvttq_x_f32_f16(__a, __p) __arm_vcvttq_x_f32_f16(__a, __p)
-#define vcvtq_x_f16_u16(__a, __p) __arm_vcvtq_x_f16_u16(__a, __p)
-#define vcvtq_x_f16_s16(__a, __p) __arm_vcvtq_x_f16_s16(__a, __p)
-#define vcvtq_x_f32_s32(__a, __p) __arm_vcvtq_x_f32_s32(__a, __p)
-#define vcvtq_x_f32_u32(__a, __p) __arm_vcvtq_x_f32_u32(__a, __p)
-#define vcvtq_x_n_f16_s16(__a, __imm6, __p) __arm_vcvtq_x_n_f16_s16(__a, __imm6, __p)
-#define vcvtq_x_n_f16_u16(__a, __imm6, __p) __arm_vcvtq_x_n_f16_u16(__a, __imm6, __p)
-#define vcvtq_x_n_f32_s32(__a, __imm6, __p) __arm_vcvtq_x_n_f32_s32(__a, __imm6, __p)
-#define vcvtq_x_n_f32_u32(__a, __imm6, __p) __arm_vcvtq_x_n_f32_u32(__a, __imm6, __p)
-#define vcvtq_x_s16_f16(__a, __p) __arm_vcvtq_x_s16_f16(__a, __p)
-#define vcvtq_x_s32_f32(__a, __p) __arm_vcvtq_x_s32_f32(__a, __p)
-#define vcvtq_x_u16_f16(__a, __p) __arm_vcvtq_x_u16_f16(__a, __p)
-#define vcvtq_x_u32_f32(__a, __p) __arm_vcvtq_x_u32_f32(__a, __p)
-#define vcvtq_x_n_s16_f16(__a, __imm6, __p) __arm_vcvtq_x_n_s16_f16(__a, __imm6, __p)
-#define vcvtq_x_n_s32_f32(__a, __imm6, __p) __arm_vcvtq_x_n_s32_f32(__a, __imm6, __p)
-#define vcvtq_x_n_u16_f16(__a, __imm6, __p) __arm_vcvtq_x_n_u16_f16(__a, __imm6, __p)
-#define vcvtq_x_n_u32_f32(__a, __imm6, __p) __arm_vcvtq_x_n_u32_f32(__a, __imm6, __p)
 #define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p)
 #define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p)
 #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p)
@@ -3772,62 +3718,6 @@ __arm_vcvtbq_f32_f16 (float16x8_t __a)
 {
   return __builtin_mve_vcvtbq_f32_f16v4sf (__a);
 }

-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_f16_s16 (int16x8_t __a)
-{
-  return __builtin_mve_vcvtq_to_f_sv8hf (__a);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_f32_s32 (int32x4_t __a)
-{
-  return __builtin_mve_vcvtq_to_f_sv4sf (__a);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_f16_u16 (uint16x8_t __a)
-{
-  return __builtin_mve_vcvtq_to_f_uv8hf (__a);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_f32_u32 (uint32x4_t __a)
-{
-  return __builtin_mve_vcvtq_to_f_uv4sf (__a);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_s16_f16 (float16x8_t __a)
-{
-  return __builtin_mve_vcvtq_from_f_sv8hi (__a);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_s32_f32 (float32x4_t __a)
-{
-  return __builtin_mve_vcvtq_from_f_sv4si (__a);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_u16_f16 (float16x8_t __a)
-{
-  return __builtin_mve_vcvtq_from_f_uv8hi (__a);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_u32_f32 (float32x4_t __a)
-{
-  return __builtin_mve_vcvtq_from_f_uv4si (__a);
-}
-
 __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vcvtpq_u16_f16 (float16x8_t __a)
@@ -3940,62 +3830,6 @@ __arm_vcvtmq_s32_f32 (float32x4_t __a)
 {
   return __builtin_mve_vcvtmq_sv4si (__a);
 }

-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_n_f16_s16 (int16x8_t __a, const int __imm6)
-{
-  return __builtin_mve_vcvtq_n_to_f_sv8hf (__a, __imm6);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcvtq_n_f32_s32 (int32x4_t __a, const int __imm6)
-{
-  return
__builtin_mve_vcvtq_n_to_f_sv4sf (__a, __imm6); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_f16_u16 (uint16x8_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_to_f_uv8hf (__a, __imm6); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_f32_u32 (uint32x4_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_to_f_uv4sf (__a, __imm6); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_s16_f16 (float16x8_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_sv8hi (__a, __imm6); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_s32_f32 (float32x4_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_sv4si (__a, __imm6); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_u16_f16 (float16x8_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_uv8hi (__a, __imm6); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n_u32_f32 (float32x4_t __a, const int __imm6) -{ - return __builtin_mve_vcvtq_n_from_f_uv4si (__a, __imm6); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_f16 (float16x8_t __a, float16x8_t __b) @@ -4066,34 +3900,6 @@ __arm_vcvtaq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p return __builtin_mve_vcvtaq_m_uv4si (__inactive, __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f16_s16 (float16x8_t __inactive, int16x8_t __a, mve_pred16_t __p) -{ - 
return __builtin_mve_vcvtq_m_to_f_sv8hf (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f16_u16 (float16x8_t __inactive, uint16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv8hf (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f32_s32 (float32x4_t __inactive, int32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_sv4sf (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_f32_u32 (float32x4_t __inactive, uint32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv4sf (__inactive, __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) @@ -4144,13 +3950,6 @@ __arm_vcvtpq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) return __builtin_mve_vcvtpq_m_sv8hi (__inactive, __a, __p); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv8hi (__inactive, __a, __p); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -4172,13 +3971,6 @@ __arm_vcvtpq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p return __builtin_mve_vcvtpq_m_uv8hi (__inactive, __a, __p); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return 
__builtin_mve_vcvtq_m_from_f_uv8hi (__inactive, __a, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -4200,13 +3992,6 @@ __arm_vcvtpq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __builtin_mve_vcvtpq_m_sv4si (__inactive, __a, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv4si (__inactive, __a, __p); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -4228,41 +4013,6 @@ __arm_vcvtpq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p return __builtin_mve_vcvtpq_m_uv4si (__inactive, __a, __p); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_uv4si (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f16_u16 (float16x8_t __inactive, uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_uv8hf (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f16_s16 (float16x8_t __inactive, int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv8hf (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, 
__gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f32_u32 (float32x4_t __inactive, uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_uv4sf (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_f32_s32 (float32x4_t __inactive, int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv4sf (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -4277,34 +4027,6 @@ __arm_vbicq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve return __builtin_mve_vbicq_m_fv8hf (__inactive, __a, __b, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_s32_f32 (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv4si (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_s16_f16 (int16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv8hi (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_u32_f32 (uint32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_uv4si (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n_u16_f16 (uint16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ 
- return __builtin_mve_vcvtq_m_n_from_f_uv8hi (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -4675,118 +4397,6 @@ __arm_vcvttq_x_f32_f16 (float16x8_t __a, mve_pred16_t __p) return __builtin_mve_vcvttq_m_f32_f16v4sf (__arm_vuninitializedq_f32 (), __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f16_u16 (uint16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv8hf (__arm_vuninitializedq_f16 (), __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f16_s16 (int16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_sv8hf (__arm_vuninitializedq_f16 (), __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f32_s32 (int32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_sv4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_f32_u32 (uint32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_to_f_uv4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f16_s16 (int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv8hf (__arm_vuninitializedq_f16 (), __a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f16_u16 (uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return 
__builtin_mve_vcvtq_m_n_to_f_uv8hf (__arm_vuninitializedq_f16 (), __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f32_s32 (int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_sv4sf (__arm_vuninitializedq_f32 (), __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_f32_u32 (uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_to_f_uv4sf (__arm_vuninitializedq_f32 (), __a, __imm6, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_from_f_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_s16_f16 (float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv8hi (__arm_vuninitializedq_s16 (), 
__a, __imm6, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_s32_f32 (float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_sv4si (__arm_vuninitializedq_s32 (), __a, __imm6, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_u16_f16 (float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_uv8hi (__arm_vuninitializedq_u16 (), __a, __imm6, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n_u32_f32 (float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __builtin_mve_vcvtq_m_n_from_f_uv4si (__arm_vuninitializedq_u32 (), __a, __imm6, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -7231,62 +6841,6 @@ __arm_vcvtbq_f32 (float16x8_t __a) return __arm_vcvtbq_f32_f16 (__a); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (int16x8_t __a) -{ - return __arm_vcvtq_f16_s16 (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (int32x4_t __a) -{ - return __arm_vcvtq_f32_s32 (__a); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (uint16x8_t __a) -{ - return __arm_vcvtq_f16_u16 (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq (uint32x4_t __a) -{ - return __arm_vcvtq_f32_u32 (__a); -} - -__extension__ extern __inline float16x8_t -__attribute__ 
((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (int16x8_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f16_s16 (__a, __imm6); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (int32x4_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f32_s32 (__a, __imm6); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (uint16x8_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f16_u16 (__a, __imm6); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_n (uint32x4_t __a, const int __imm6) -{ - return __arm_vcvtq_n_f32_u32 (__a, __imm6); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (float16x8_t __a, float16x8_t __b) @@ -7343,34 +6897,6 @@ __arm_vcvtaq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtaq_m_u32_f32 (__inactive, __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float16x8_t __inactive, int16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f16_s16 (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float16x8_t __inactive, uint16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f16_u16 (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float32x4_t __inactive, int32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f32_s32 (__inactive, __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (float32x4_t 
__inactive, uint32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_f32_u32 (__inactive, __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtbq_m (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7420,13 +6946,6 @@ __arm_vcvtpq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_s16_f16 (__inactive, __a, __p); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_s16_f16 (__inactive, __a, __p); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -7448,13 +6967,6 @@ __arm_vcvtpq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_u16_f16 (__inactive, __a, __p); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_u16_f16 (__inactive, __a, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -7476,13 +6988,6 @@ __arm_vcvtpq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_s32_f32 (__inactive, __a, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_s32_f32 (__inactive, __a, __p); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (uint32x4_t __inactive, 
float32x4_t __a, mve_pred16_t __p) @@ -7504,41 +7009,6 @@ __arm_vcvtpq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtpq_m_u32_f32 (__inactive, __a, __p); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float16x8_t __inactive, uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f16_u16 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float16x8_t __inactive, int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f16_s16 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float32x4_t __inactive, uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f32_u32 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (float32x4_t __inactive, int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_f32_s32 (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7553,34 +7023,6 @@ __arm_vbicq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pre return __arm_vbicq_m_f16 (__inactive, __a, __b, __p); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, 
__gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_s32_f32 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (int16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_s16_f16 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (uint32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_u32_f32 (__inactive, __a, __imm6, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_m_n (uint16x8_t __inactive, float16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_m_n_u16_f16 (__inactive, __a, __imm6, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7763,62 +7205,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32 __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (uint16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_x_f16_u16 (__a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (int16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_x_f16_s16 (__a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (int32x4_t __a, mve_pred16_t __p) -{ - 
return __arm_vcvtq_x_f32_s32 (__a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x (uint32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtq_x_f32_u32 (__a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (int16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f16_s16 (__a, __imm6, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (uint16x8_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f16_u16 (__a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (int32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f32_s32 (__a, __imm6, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtq_x_n (uint32x4_t __a, const int __imm6, mve_pred16_t __p) -{ - return __arm_vcvtq_x_n_f32_u32 (__a, __imm6, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -8276,20 +7662,6 @@ extern void *__ARM_undef; _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_float16x8_t]: __arm_vcvttq_f32_f16 (__ARM_mve_coerce(__p0, float16x8_t)));}) -#define __arm_vcvtq(p0) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_f16_s16 (__ARM_mve_coerce(__p0, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_f32_s32 (__ARM_mve_coerce(__p0, int32x4_t)), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_f16_u16 (__ARM_mve_coerce(__p0, uint16x8_t)), \ 
- int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_f32_u32 (__ARM_mve_coerce(__p0, uint32x4_t)));}) - -#define __arm_vcvtq_n(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_n_f16_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_n_f32_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_n_f16_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_n_f32_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1));}) - #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -8342,30 +7714,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) -#define __arm_vcvtq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcvtq_m_f16_s16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, int16x8_t), p2), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcvtq_m_f32_s32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcvtq_m_f16_u16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), p2), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcvtq_m_f32_u32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, 
uint32x4_t), p2), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtq_m_n(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_n_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_n_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtq_m_n_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtq_m_n_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcvtq_m_n_f16_s16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, int16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcvtq_m_n_f32_s32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcvtq_m_n_f16_u16 
(__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcvtq_m_n_f32_u32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2, p3));}) - #define __arm_vcvtbq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -8730,20 +8078,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vcvtq_x(p1,p2) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_x_f16_s16 (__ARM_mve_coerce(__p1, int16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_x_f32_s32 (__ARM_mve_coerce(__p1, int32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_x_f16_u16 (__ARM_mve_coerce(__p1, uint16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_x_f32_u32 (__ARM_mve_coerce(__p1, uint32x4_t), p2));}) - -#define __arm_vcvtq_x_n(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_x_n_f16_s16 (__ARM_mve_coerce(__p1, int16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vcvtq_x_n_f32_s32 (__ARM_mve_coerce(__p1, int32x4_t), p2, p3), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vcvtq_x_n_f16_u16 (__ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vcvtq_x_n_f32_u32 (__ARM_mve_coerce(__p1, uint32x4_t), p2, p3));}) - #define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ _Generic( (int 
(*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \

From patchwork Thu Jul 11 21:42:57 2024
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 07/15] arm: [MVE intrinsics] factorize vcvtbq vcvttq
Date: Thu, 11 Jul 2024 21:42:57 +0000
Message-Id: <20240711214305.3193022-7-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>
Factorize vcvtbq, vcvttq so that they use the same parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VCVTBQ_F16_F32, VCVTTQ_F16_F32, VCVTBQ_F32_F16, VCVTTQ_F32_F16, VCVTBQ_M_F16_F32, VCVTTQ_M_F16_F32, VCVTBQ_M_F32_F16, VCVTTQ_M_F32_F16. (VCVTxQ_F16_F32): New iterator. (VCVTxQ_F32_F16): Likewise. (VCVTxQ_M_F16_F32): Likewise. (VCVTxQ_M_F32_F16): Likewise. * config/arm/mve.md (mve_vcvttq_f32_f16v4sf) (mve_vcvtbq_f32_f16v4sf): Merge into ... (@mve_q_f32_f16v4sf): ... this. (mve_vcvtbq_f16_f32v8hf, mve_vcvttq_f16_f32v8hf): Merge into ... (@mve_q_f16_f32v8hf): ... this. (mve_vcvtbq_m_f16_f32v8hf, mve_vcvttq_m_f16_f32v8hf): Merge into ... (@mve_q_m_f16_f32v8hf): ... this. (mve_vcvtbq_m_f32_f16v4sf, mve_vcvttq_m_f32_f16v4sf): Merge into ... (@mve_q_m_f32_f16v4sf): ... this. --- gcc/config/arm/iterators.md | 8 +++ gcc/config/arm/mve.md | 102 ++++++++---------------------------- 2 files changed, 29 insertions(+), 81 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index bf800625fac..b9c39a98ca2 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -964,6 +964,10 @@ (define_int_attr mve_insn [ (VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla") (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul") (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate") + (VCVTBQ_F16_F32 "vcvtb") (VCVTTQ_F16_F32 "vcvtt") + (VCVTBQ_F32_F16 "vcvtb") (VCVTTQ_F32_F16 "vcvtt") + (VCVTBQ_M_F16_F32 "vcvtb") (VCVTTQ_M_F16_F32 "vcvtt") + (VCVTBQ_M_F32_F16 "vcvtb") (VCVTTQ_M_F32_F16 "vcvtt") (VCVTQ_FROM_F_S "vcvt") (VCVTQ_FROM_F_U "vcvt") (VCVTQ_M_FROM_F_S "vcvt") (VCVTQ_M_FROM_F_U "vcvt") (VCVTQ_M_N_FROM_F_S "vcvt") (VCVTQ_M_N_FROM_F_U "vcvt") @@ -2948,6 +2952,10 @@ (define_int_iterator SQRSHRLQ [SQRSHRL_64 SQRSHRL_48]) (define_int_iterator VSHLCQ_M [VSHLCQ_M_S VSHLCQ_M_U]) 
(define_int_iterator VQSHLUQ_M_N [VQSHLUQ_M_N_S]) (define_int_iterator VQSHLUQ_N [VQSHLUQ_N_S]) +(define_int_iterator VCVTxQ_F16_F32 [VCVTBQ_F16_F32 VCVTTQ_F16_F32]) +(define_int_iterator VCVTxQ_F32_F16 [VCVTBQ_F32_F16 VCVTTQ_F32_F16]) +(define_int_iterator VCVTxQ_M_F16_F32 [VCVTBQ_M_F16_F32 VCVTTQ_M_F16_F32]) +(define_int_iterator VCVTxQ_M_F32_F16 [VCVTBQ_M_F32_F16 VCVTTQ_M_F32_F16]) (define_int_iterator DLSTP [DLSTP8 DLSTP16 DLSTP32 DLSTP64]) (define_int_iterator LETP [LETP8 LETP16 LETP32 diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index b339d0ccdf6..7a05a216516 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -217,33 +217,20 @@ (define_insn "@mve_q_f" [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f")) (set_attr "type" "mve_move") ]) -;; -;; [vcvttq_f32_f16]) -;; -(define_insn "mve_vcvttq_f32_f16v4sf" - [ - (set (match_operand:V4SF 0 "s_register_operand" "=w") - (unspec:V4SF [(match_operand:V8HF 1 "s_register_operand" "w")] - VCVTTQ_F32_F16)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtt.f32.f16\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f32_f16v4sf")) - (set_attr "type" "mve_move") -]) ;; ;; [vcvtbq_f32_f16]) +;; [vcvttq_f32_f16]) ;; -(define_insn "mve_vcvtbq_f32_f16v4sf" +(define_insn "@mve_q_f32_f16v4sf" [ (set (match_operand:V4SF 0 "s_register_operand" "=w") (unspec:V4SF [(match_operand:V8HF 1 "s_register_operand" "w")] - VCVTBQ_F32_F16)) + VCVTxQ_F32_F16)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtb.f32.f16\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f32_f16v4sf")) + ".f32.f16\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f32_f16v4sf")) (set_attr "type" "mve_move") ]) @@ -1343,33 +1330,18 @@ (define_insn "mve_vctpq_m" ;; ;; [vcvtbq_f16_f32]) -;; -(define_insn "mve_vcvtbq_f16_f32v8hf" - [ - (set (match_operand:V8HF 0 "s_register_operand" "=w") - (unspec:V8HF [(match_operand:V8HF 
1 "s_register_operand" "0") - (match_operand:V4SF 2 "s_register_operand" "w")] - VCVTBQ_F16_F32)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtb.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f16_f32v8hf")) - (set_attr "type" "mve_move") -]) - -;; ;; [vcvttq_f16_f32]) ;; -(define_insn "mve_vcvttq_f16_f32v8hf" +(define_insn "@mve_q_f16_f32v8hf" [ (set (match_operand:V8HF 0 "s_register_operand" "=w") (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "0") (match_operand:V4SF 2 "s_register_operand" "w")] - VCVTTQ_F16_F32)) + VCVTxQ_F16_F32)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtt.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f16_f32v8hf")) + ".f16.f32\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f16_f32v8hf")) (set_attr "type" "mve_move") ]) @@ -2238,71 +2210,39 @@ (define_insn "@mve_vcmpq_m_n_f" ;; ;; [vcvtbq_m_f16_f32]) -;; -(define_insn "mve_vcvtbq_m_f16_f32v8hf" - [ - (set (match_operand:V8HF 0 "s_register_operand" "=w") - (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "0") - (match_operand:V4SF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTBQ_M_F16_F32)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtbt.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f16_f32v8hf")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcvtbq_m_f32_f16]) -;; -(define_insn "mve_vcvtbq_m_f32_f16v4sf" - [ - (set (match_operand:V4SF 0 "s_register_operand" "=w") - (unspec:V4SF [(match_operand:V4SF 1 "s_register_operand" "0") - (match_operand:V8HF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTBQ_M_F32_F16)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtbt.f32.f16\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtbq_f32_f16v4sf")) - 
(set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; ;; [vcvttq_m_f16_f32]) ;; -(define_insn "mve_vcvttq_m_f16_f32v8hf" +(define_insn "@mve_q_m_f16_f32v8hf" [ (set (match_operand:V8HF 0 "s_register_operand" "=w") (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "0") (match_operand:V4SF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTTQ_M_F16_F32)) + (match_operand:V4BI 3 "vpr_register_operand" "Up")] + VCVTxQ_M_F16_F32)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvttt.f16.f32\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f16_f32v8hf")) + "vpst\;t.f16.f32\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f16_f32v8hf")) (set_attr "type" "mve_move") (set_attr "length""8")]) ;; +;; [vcvtbq_m_f32_f16]) ;; [vcvttq_m_f32_f16]) ;; -(define_insn "mve_vcvttq_m_f32_f16v4sf" +(define_insn "@mve_q_m_f32_f16v4sf" [ (set (match_operand:V4SF 0 "s_register_operand" "=w") (unspec:V4SF [(match_operand:V4SF 1 "s_register_operand" "0") (match_operand:V8HF 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTTQ_M_F32_F16)) + (match_operand:V8BI 3 "vpr_register_operand" "Up")] + VCVTxQ_M_F32_F16)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvttt.f32.f16\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvttq_f32_f16v4sf")) + "vpst\;t.f32.f16\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_f32_f16v4sf")) (set_attr "type" "mve_move") - (set_attr "length""8")]) + (set_attr "length""8")]) ;; ;; [vdupq_m_n_f])

From patchwork Thu Jul 11 21:42:58 2024
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 08/15] arm: [MVE intrinsics] add vcvt_f16_f32 and vcvt_f32_f16 shapes
Date: Thu, 11 Jul 2024 21:42:58 +0000
Message-Id: <20240711214305.3193022-8-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

This patch adds the vcvt_f16_f32 and vcvt_f32_f16 shape descriptions.

2024-07-11  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-shapes.cc (vcvt_f16_f32)
	(vcvt_f32_f16): New.
	* config/arm/arm-mve-builtins-shapes.h (vcvt_f16_f32)
	(vcvt_f32_f16): New.
--- gcc/config/arm/arm-mve-builtins-shapes.cc | 35 +++++++++++++++++++++++ gcc/config/arm/arm-mve-builtins-shapes.h | 2 ++ 2 files changed, 37 insertions(+) diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index e1c5dd2c0f2..c311f255e1b 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -2081,6 +2081,41 @@ struct vcvt_def : public overloaded_base<0> }; SHAPE (vcvt) +/* float16x8_t foo_f16_f32(float16x8_t, float32x4_t) + + Example: vcvttq_f16_f32. + float16x8_t [__arm_]vcvttq_f16_f32(float16x8_t a, float32x4_t b) + float16x8_t [__arm_]vcvttq_m_f16_f32(float16x8_t a, float32x4_t b, mve_pred16_t p) +*/ +struct vcvt_f16_f32_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + build_all (b, "v0,v0,v1", group, MODE_none, preserve_user_namespace); + } +}; +SHAPE (vcvt_f16_f32) + +/* float32x4_t foo_f32_f16(float16x8_t) + + Example: vcvttq_f32_f16. + float32x4_t [__arm_]vcvttq_f32_f16(float16x8_t a) + float32x4_t [__arm_]vcvttq_m_f32_f16(float32x4_t inactive, float16x8_t a, mve_pred16_t p) + float32x4_t [__arm_]vcvttq_x_f32_f16(float16x8_t a, mve_pred16_t p) +*/ +struct vcvt_f32_f16_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + build_all (b, "v0,v1", group, MODE_none, preserve_user_namespace); + } +}; +SHAPE (vcvt_f32_f16) + /* _t vfoo[_t0](_t, _t, mve_pred16_t) i.e. 
a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 9a112ceeb29..50157b57571 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -78,6 +78,8 @@ namespace arm_mve extern const function_shape *const unary_widen; extern const function_shape *const unary_widen_acc; extern const function_shape *const vcvt; + extern const function_shape *const vcvt_f16_f32; + extern const function_shape *const vcvt_f32_f16; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */

From patchwork Thu Jul 11 21:42:59 2024
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 09/15] arm: [MVE intrinsics] rework vcvtbq_f16_f32 vcvttq_f16_f32 vcvtbq_f32_f16 vcvttq_f32_f16
Date: Thu, 11 Jul 2024 21:42:59 +0000
Message-Id: <20240711214305.3193022-9-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

Implement vcvtbq_f16_f32, vcvttq_f16_f32, vcvtbq_f32_f16 and
vcvttq_f32_f16 using the new MVE builtins framework.

2024-07-11  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-base.cc (class vcvtxq_impl): New.
	(vcvtbq, vcvttq): New.
	* config/arm/arm-mve-builtins-base.def (vcvtbq, vcvttq): New.
	* config/arm/arm-mve-builtins-base.h (vcvtbq, vcvttq): New.
	* config/arm/arm-mve-builtins.cc (cvt_f16_f32, cvt_f32_f16): New
	types.
	(function_instance::has_inactive_argument): Support vcvtbq and
	vcvttq.
	* config/arm/arm_mve.h (vcvttq_f32): Delete.
	(vcvtbq_f32): Delete.
	(vcvtbq_m): Delete.
	(vcvttq_m): Delete.
	(vcvttq_f32_f16): Delete.
	(vcvtbq_f32_f16): Delete.
	(vcvttq_f16_f32): Delete.
	(vcvtbq_f16_f32): Delete.
	(vcvtbq_m_f16_f32): Delete.
	(vcvtbq_m_f32_f16): Delete.
	(vcvttq_m_f16_f32): Delete.
	(vcvttq_m_f32_f16): Delete.
	(vcvtbq_x_f32_f16): Delete.
	(vcvttq_x_f32_f16): Delete.
	(__arm_vcvttq_f32_f16): Delete.
	(__arm_vcvtbq_f32_f16): Delete.
	(__arm_vcvttq_f16_f32): Delete.
	(__arm_vcvtbq_f16_f32): Delete.
	(__arm_vcvtbq_m_f16_f32): Delete.
	(__arm_vcvtbq_m_f32_f16): Delete.
	(__arm_vcvttq_m_f16_f32): Delete.
	(__arm_vcvttq_m_f32_f16): Delete.
	(__arm_vcvtbq_x_f32_f16): Delete.
	(__arm_vcvttq_x_f32_f16): Delete.
	(__arm_vcvttq_f32): Delete.
	(__arm_vcvtbq_f32): Delete.
	(__arm_vcvtbq_m): Delete.
	(__arm_vcvttq_m): Delete.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |  56 +++++++++
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |   2 +
 gcc/config/arm/arm-mve-builtins.cc       |  12 ++
 gcc/config/arm/arm_mve.h                 | 146 -----------------------
 5 files changed, 74 insertions(+), 146 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index a780d686eb1..760378c91b1 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -251,6 +251,60 @@ public:
   }
 };
 
+/* Implements vcvt[bt]q_f32_f16 and vcvt[bt]q_f16_f32
+   intrinsics.  */
+class vcvtxq_impl : public function_base
+{
+public:
+  CONSTEXPR vcvtxq_impl (int unspec_f16_f32, int unspec_for_m_f16_f32,
+			 int unspec_f32_f16, int unspec_for_m_f32_f16)
+    : m_unspec_f16_f32 (unspec_f16_f32),
+      m_unspec_for_m_f16_f32 (unspec_for_m_f16_f32),
+      m_unspec_f32_f16 (unspec_f32_f16),
+      m_unspec_for_m_f32_f16 (unspec_for_m_f32_f16)
+  {}
+
+  /* The unspec code associated with vcvt[bt]q.  */
+  int m_unspec_f16_f32;
+  int m_unspec_for_m_f16_f32;
+  int m_unspec_f32_f16;
+  int m_unspec_for_m_f32_f16;
+
+  rtx
+  expand (function_expander &e) const override
+  {
+    insn_code code;
+    switch (e.pred)
+      {
+      case PRED_none:
+	/* No predicate.  */
+	if (e.type_suffix (0).element_bits == 16)
+	  code = code_for_mve_q_f16_f32v8hf (m_unspec_f16_f32);
+	else
+	  code = code_for_mve_q_f32_f16v4sf (m_unspec_f32_f16);
+	return e.use_exact_insn (code);
+
+      case PRED_m:
+      case PRED_x:
+	/* "m" or "x" predicate.  */
+	if (e.type_suffix (0).element_bits == 16)
+	  code = code_for_mve_q_m_f16_f32v8hf (m_unspec_for_m_f16_f32);
+	else
+	  code = code_for_mve_q_m_f32_f16v4sf (m_unspec_for_m_f32_f16);
+
+	if (e.pred == PRED_m)
+	  return e.use_cond_insn (code, 0);
+	else
+	  return e.use_pred_x_insn (code);
+
+      default:
+	gcc_unreachable ();
+      }
+
+    gcc_unreachable ();
+  }
+};
+
 } /* end anonymous namespace */
 
 namespace arm_mve {
@@ -452,6 +506,8 @@ FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNK
 FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN))
 FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ)
 FUNCTION (vcvtq, vcvtq_impl,)
+FUNCTION (vcvtbq, vcvtxq_impl, (VCVTBQ_F16_F32, VCVTBQ_M_F16_F32, VCVTBQ_F32_F16, VCVTBQ_M_F32_F16))
+FUNCTION (vcvttq, vcvtxq_impl, (VCVTTQ_F16_F32, VCVTTQ_M_F16_F32, VCVTTQ_F32_F16, VCVTTQ_M_F32_F16))
 FUNCTION_ONLY_N (vdupq, VDUPQ)
 FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ)
 FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1, VFMAQ_M_F, -1, -1, VFMAQ_M_N_F))
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index 4aaf02eab7b..5a6f1a59c7c 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -180,6 +180,10 @@ DEF_MVE_FUNCTION (vcmpltq, cmp, all_float, m_or_none)
 DEF_MVE_FUNCTION (vcmpneq, cmp, all_float, m_or_none)
 DEF_MVE_FUNCTION (vcreateq, create, all_float, none)
 DEF_MVE_FUNCTION (vcvtq, vcvt, cvt, mx_or_none)
+DEF_MVE_FUNCTION (vcvtbq, vcvt_f16_f32, cvt_f16_f32, mx_or_none)
+DEF_MVE_FUNCTION (vcvtbq, vcvt_f32_f16, cvt_f32_f16, mx_or_none)
+DEF_MVE_FUNCTION (vcvttq, vcvt_f16_f32, cvt_f16_f32, mx_or_none)
+DEF_MVE_FUNCTION (vcvttq, vcvt_f32_f16, cvt_f32_f16, mx_or_none)
 DEF_MVE_FUNCTION (vdupq, unary_n, all_float, mx_or_none)
 DEF_MVE_FUNCTION (veorq, binary, all_float, mx_or_none)
 DEF_MVE_FUNCTION (vfmaq, ternary_opt_n, all_float, m_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index dee73d9c457..25467769d05 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -55,6 +55,8 @@ extern const function_base *const vcmulq_rot270;
 extern const function_base *const vcmulq_rot90;
 extern const function_base *const vcreateq;
 extern const function_base *const vcvtq;
+extern const function_base *const vcvtbq;
+extern const function_base *const vcvttq;
 extern const function_base *const vdupq;
 extern const function_base *const veorq;
 extern const function_base *const vfmaq;
diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc
index 3c5b54dade1..4c554a47d85 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -219,6 +219,14 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = {
   D (u16, f16), \
   D (u32, f32)
 
+/* vcvt[bt]q_f16_f32.  */
+#define TYPES_cvt_f16_f32(S, D) \
+  D (f16, f32)
+
+/* vcvt[bt]q_f32_f16.
*/ +#define TYPES_cvt_f32_f16(S, D) \ + D (f32, f16) + #define TYPES_reinterpret_signed1(D, A) \ D (A, s8), D (A, s16), D (A, s32), D (A, s64) @@ -299,6 +307,8 @@ DEF_MVE_TYPES_ARRAY (poly_8_16); DEF_MVE_TYPES_ARRAY (signed_16_32); DEF_MVE_TYPES_ARRAY (signed_32); DEF_MVE_TYPES_ARRAY (cvt); +DEF_MVE_TYPES_ARRAY (cvt_f16_f32); +DEF_MVE_TYPES_ARRAY (cvt_f32_f16); DEF_MVE_TYPES_ARRAY (reinterpret_integer); DEF_MVE_TYPES_ARRAY (reinterpret_float); @@ -730,6 +740,8 @@ function_instance::has_inactive_argument () const || base == functions::vcmpltq || base == functions::vcmpcsq || base == functions::vcmphiq + || (base == functions::vcvtbq && type_suffix (0).element_bits == 16) + || (base == functions::vcvttq && type_suffix (0).element_bits == 16) || base == functions::vfmaq || base == functions::vfmasq || base == functions::vfmsq diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 07897f510f5..5c35e08d754 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -137,11 +137,7 @@ #define vsetq_lane(__a, __b, __idx) __arm_vsetq_lane(__a, __b, __idx) #define vgetq_lane(__a, __idx) __arm_vgetq_lane(__a, __idx) #define vshlcq_m(__a, __b, __imm, __p) __arm_vshlcq_m(__a, __b, __imm, __p) -#define vcvttq_f32(__a) __arm_vcvttq_f32(__a) -#define vcvtbq_f32(__a) __arm_vcvtbq_f32(__a) #define vcvtaq_m(__inactive, __a, __p) __arm_vcvtaq_m(__inactive, __a, __p) -#define vcvtbq_m(__a, __b, __p) __arm_vcvtbq_m(__a, __b, __p) -#define vcvttq_m(__a, __b, __p) __arm_vcvttq_m(__a, __b, __p) #define vcvtmq_m(__inactive, __a, __p) __arm_vcvtmq_m(__inactive, __a, __p) #define vcvtnq_m(__inactive, __a, __p) __arm_vcvtnq_m(__inactive, __a, __p) #define vcvtpq_m(__inactive, __a, __p) __arm_vcvtpq_m(__inactive, __a, __p) @@ -155,8 +151,6 @@ #define vst4q_u32( __addr, __value) __arm_vst4q_u32( __addr, __value) #define vst4q_f16( __addr, __value) __arm_vst4q_f16( __addr, __value) #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value) -#define 
vcvttq_f32_f16(__a) __arm_vcvttq_f32_f16(__a) -#define vcvtbq_f32_f16(__a) __arm_vcvtbq_f32_f16(__a) #define vcvtaq_s16_f16(__a) __arm_vcvtaq_s16_f16(__a) #define vcvtaq_s32_f32(__a) __arm_vcvtaq_s32_f32(__a) #define vcvtnq_s16_f16(__a) __arm_vcvtnq_s16_f16(__a) @@ -202,8 +196,6 @@ #define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p) #define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p) #define vctp16q_m(__a, __p) __arm_vctp16q_m(__a, __p) -#define vcvttq_f16_f32(__a, __b) __arm_vcvttq_f16_f32(__a, __b) -#define vcvtbq_f16_f32(__a, __b) __arm_vcvtbq_f16_f32(__a, __b) #define vbicq_m_n_s16(__a, __imm, __p) __arm_vbicq_m_n_s16(__a, __imm, __p) #define vbicq_m_n_s32(__a, __imm, __p) __arm_vbicq_m_n_s32(__a, __imm, __p) #define vbicq_m_n_u16(__a, __imm, __p) __arm_vbicq_m_n_u16(__a, __imm, __p) @@ -218,10 +210,6 @@ #define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vcvtbq_m_f16_f32(__a, __b, __p) __arm_vcvtbq_m_f16_f32(__a, __b, __p) -#define vcvtbq_m_f32_f16(__inactive, __a, __p) __arm_vcvtbq_m_f32_f16(__inactive, __a, __p) -#define vcvttq_m_f16_f32(__a, __b, __p) __arm_vcvttq_m_f16_f32(__a, __b, __p) -#define vcvttq_m_f32_f16(__inactive, __a, __p) __arm_vcvttq_m_f32_f16(__inactive, __a, __p) #define vcvtmq_m_s16_f16(__inactive, __a, __p) __arm_vcvtmq_m_s16_f16(__inactive, __a, __p) #define vcvtnq_m_s16_f16(__inactive, __a, __p) __arm_vcvtnq_m_s16_f16(__inactive, __a, __p) #define vcvtpq_m_s16_f16(__inactive, __a, __p) __arm_vcvtpq_m_s16_f16(__inactive, __a, __p) @@ -560,8 +548,6 @@ #define vcvtmq_x_s32_f32(__a, __p) __arm_vcvtmq_x_s32_f32(__a, __p) #define vcvtmq_x_u16_f16(__a, __p) __arm_vcvtmq_x_u16_f16(__a, __p) #define vcvtmq_x_u32_f32(__a, __p) __arm_vcvtmq_x_u32_f32(__a, __p) -#define vcvtbq_x_f32_f16(__a, __p) __arm_vcvtbq_x_f32_f16(__a, __p) -#define vcvttq_x_f32_f16(__a, __p) 
__arm_vcvttq_x_f32_f16(__a, __p) #define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p) #define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p) #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) @@ -3704,20 +3690,6 @@ __arm_vst4q_f32 (float32_t * __addr, float32x4x4_t __value) __builtin_mve_vst4qv4sf (__addr, __rv.__o); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_f32_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvttq_f32_f16v4sf (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_f32_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtbq_f32_f16v4sf (__a); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtpq_u16_f16 (float16x8_t __a) @@ -3858,20 +3830,6 @@ __arm_vbicq_f32 (float32x4_t __a, float32x4_t __b) return __builtin_mve_vbicq_fv4sf (__a, __b); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_f16_f32 (float16x8_t __a, float32x4_t __b) -{ - return __builtin_mve_vcvttq_f16_f32v8hf (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_f16_f32 (float16x8_t __a, float32x4_t __b) -{ - return __builtin_mve_vcvtbq_f16_f32v8hf (__a, __b); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtaq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -3901,34 +3859,6 @@ __arm_vcvtaq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m_f16_f32 (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) 
-{ - return __builtin_mve_vcvtbq_m_f16_f32v8hf (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m_f32_f16 (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtbq_m_f32_f16v4sf (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m_f16_f32 (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcvttq_m_f16_f32v8hf (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m_f32_f16 (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvttq_m_f32_f16v4sf (__inactive, __a, __p); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -4383,20 +4313,6 @@ __arm_vcvtmq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) return __builtin_mve_vcvtmq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_x_f32_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtbq_m_f32_f16v4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_x_f32_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvttq_m_f32_f16v4sf (__arm_vuninitializedq_f32 (), __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -6827,20 +6743,6 @@ __arm_vst4q (float32_t * __addr, 
float32x4x4_t __value) __arm_vst4q_f32 (__addr, __value); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_f32 (float16x8_t __a) -{ - return __arm_vcvttq_f32_f16 (__a); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_f32 (float16x8_t __a) -{ - return __arm_vcvtbq_f32_f16 (__a); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq (float16x8_t __a, float16x8_t __b) @@ -6897,34 +6799,6 @@ __arm_vcvtaq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) return __arm_vcvtaq_m_u32_f32 (__inactive, __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcvtbq_m_f16_f32 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtbq_m (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtbq_m_f32_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m (float16x8_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcvttq_m_f16_f32 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvttq_m (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvttq_m_f32_f16 (__inactive, __a, __p); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -7654,14 +7528,6 @@ extern void *__ARM_undef; int 
(*)[__ARM_mve_type_float16_t_ptr][__ARM_mve_type_float16x8x4_t]: __arm_vst4q_f16 (__ARM_mve_coerce_f16_ptr(__p0, float16_t *), __ARM_mve_coerce(__p1, float16x8x4_t)), \ int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4x4_t]: __arm_vst4q_f32 (__ARM_mve_coerce_f32_ptr(__p0, float32_t *), __ARM_mve_coerce(__p1, float32x4x4_t)));}) -#define __arm_vcvtbq_f32(p0) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_float16x8_t]: __arm_vcvtbq_f32_f16 (__ARM_mve_coerce(__p0, float16x8_t)));}) - -#define __arm_vcvttq_f32(p0) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_float16x8_t]: __arm_vcvttq_f32_f16 (__ARM_mve_coerce(__p0, float16x8_t)));}) - #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -7714,18 +7580,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) -#define __arm_vcvtbq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float16x8_t]: __arm_vcvtbq_m_f32_f16 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float32x4_t]: __arm_vcvtbq_m_f16_f32 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvttq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, 
\
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float16x8_t]: __arm_vcvttq_m_f32_f16 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float32x4_t]: __arm_vcvttq_m_f16_f32 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float32x4_t), p2));})
-
 #define __arm_vcvtmq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
   __typeof(p1) __p1 = (p1); \
   _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \

From patchwork Thu Jul 11 21:43:00 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959555
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 10/15] arm: [MVE intrinsics] factorize vcvtaq vcvtmq vcvtnq vcvtpq
Date: Thu, 11 Jul 2024 21:43:00 +0000
Message-Id: <20240711214305.3193022-10-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

Factorize vcvtaq vcvtmq vcvtnq vcvtpq builtins so that they use
parameterized names.

2024-07-11  Christophe Lyon

gcc/
	* config/arm/iterators.md (mve_insn): Add VCVTAQ_M_S, VCVTAQ_M_U,
	VCVTAQ_S, VCVTAQ_U, VCVTMQ_M_S, VCVTMQ_M_U, VCVTMQ_S, VCVTMQ_U,
	VCVTNQ_M_S, VCVTNQ_M_U, VCVTNQ_S, VCVTNQ_U, VCVTPQ_M_S,
	VCVTPQ_M_U, VCVTPQ_S, VCVTPQ_U.
	(VCVTAQ, VCVTPQ, VCVTNQ, VCVTMQ, VCVTAQ_M, VCVTMQ_M, VCVTNQ_M)
	(VCVTPQ_M): Delete.
	(VCVTxQ, VCVTxQ_M): New.
	* config/arm/mve.md (mve_vcvtpq_)
	(mve_vcvtnq_, mve_vcvtmq_)
	(mve_vcvtaq_): Merge into ...
	(@mve_q_): ... this.
	(mve_vcvtaq_m_, mve_vcvtmq_m_)
	(mve_vcvtpq_m_, mve_vcvtnq_m_): Merge into ...
	(@mve_q_m_): ... this.
--- gcc/config/arm/iterators.md | 18 +++--- gcc/config/arm/mve.md | 117 +++++------------------------------- 2 files changed, 24 insertions(+), 111 deletions(-) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index b9c39a98ca2..162c0d56bfb 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -964,10 +964,18 @@ (define_int_attr mve_insn [ (VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla") (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul") (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate") + (VCVTAQ_M_S "vcvta") (VCVTAQ_M_U "vcvta") + (VCVTAQ_S "vcvta") (VCVTAQ_U "vcvta") (VCVTBQ_F16_F32 "vcvtb") (VCVTTQ_F16_F32 "vcvtt") (VCVTBQ_F32_F16 "vcvtb") (VCVTTQ_F32_F16 "vcvtt") (VCVTBQ_M_F16_F32 "vcvtb") (VCVTTQ_M_F16_F32 "vcvtt") (VCVTBQ_M_F32_F16 "vcvtb") (VCVTTQ_M_F32_F16 "vcvtt") + (VCVTMQ_M_S "vcvtm") (VCVTMQ_M_U "vcvtm") + (VCVTMQ_S "vcvtm") (VCVTMQ_U "vcvtm") + (VCVTNQ_M_S "vcvtn") (VCVTNQ_M_U "vcvtn") + (VCVTNQ_S "vcvtn") (VCVTNQ_U "vcvtn") + (VCVTPQ_M_S "vcvtp") (VCVTPQ_M_U "vcvtp") + (VCVTPQ_S "vcvtp") (VCVTPQ_U "vcvtp") (VCVTQ_FROM_F_S "vcvt") (VCVTQ_FROM_F_U "vcvt") (VCVTQ_M_FROM_F_S "vcvt") (VCVTQ_M_FROM_F_U "vcvt") (VCVTQ_M_N_FROM_F_S "vcvt") (VCVTQ_M_N_FROM_F_U "vcvt") @@ -2732,14 +2740,10 @@ (define_int_iterator VMVNQ_N [VMVNQ_N_U VMVNQ_N_S]) (define_int_iterator VREV64Q [VREV64Q_S VREV64Q_U]) (define_int_iterator VCVTQ_FROM_F [VCVTQ_FROM_F_S VCVTQ_FROM_F_U]) (define_int_iterator VREV16Q [VREV16Q_U VREV16Q_S]) -(define_int_iterator VCVTAQ [VCVTAQ_U VCVTAQ_S]) (define_int_iterator VDUPQ_N [VDUPQ_N_U VDUPQ_N_S]) (define_int_iterator VADDVQ [VADDVQ_U VADDVQ_S]) (define_int_iterator VREV32Q [VREV32Q_U VREV32Q_S]) (define_int_iterator VMOVLxQ [VMOVLBQ_S VMOVLBQ_U VMOVLTQ_U VMOVLTQ_S]) -(define_int_iterator VCVTPQ [VCVTPQ_S VCVTPQ_U]) -(define_int_iterator VCVTNQ [VCVTNQ_S VCVTNQ_U]) 
-(define_int_iterator VCVTMQ [VCVTMQ_S VCVTMQ_U]) (define_int_iterator VADDLVQ [VADDLVQ_U VADDLVQ_S]) (define_int_iterator VCVTQ_N_TO_F [VCVTQ_N_TO_F_S VCVTQ_N_TO_F_U]) (define_int_iterator VCREATEQ [VCREATEQ_U VCREATEQ_S]) @@ -2795,7 +2799,6 @@ (define_int_iterator VQMOVNTQ [VQMOVNTQ_U VQMOVNTQ_S]) (define_int_iterator VSHLLxQ_N [VSHLLBQ_N_S VSHLLBQ_N_U VSHLLTQ_N_S VSHLLTQ_N_U]) (define_int_iterator VRMLALDAVHQ [VRMLALDAVHQ_U VRMLALDAVHQ_S]) (define_int_iterator VBICQ_M_N [VBICQ_M_N_S VBICQ_M_N_U]) -(define_int_iterator VCVTAQ_M [VCVTAQ_M_S VCVTAQ_M_U]) (define_int_iterator VCVTQ_M_TO_F [VCVTQ_M_TO_F_S VCVTQ_M_TO_F_U]) (define_int_iterator VQRSHRNBQ_N [VQRSHRNBQ_N_U VQRSHRNBQ_N_S]) (define_int_iterator VABAVQ [VABAVQ_S VABAVQ_U]) @@ -2845,9 +2848,6 @@ (define_int_iterator VQMOVNTQ_M [VQMOVNTQ_M_U VQMOVNTQ_M_S]) (define_int_iterator VMVNQ_M_N [VMVNQ_M_N_U VMVNQ_M_N_S]) (define_int_iterator VQSHRNTQ_N [VQSHRNTQ_N_U VQSHRNTQ_N_S]) (define_int_iterator VSHRNTQ_N [VSHRNTQ_N_S VSHRNTQ_N_U]) -(define_int_iterator VCVTMQ_M [VCVTMQ_M_S VCVTMQ_M_U]) -(define_int_iterator VCVTNQ_M [VCVTNQ_M_S VCVTNQ_M_U]) -(define_int_iterator VCVTPQ_M [VCVTPQ_M_S VCVTPQ_M_U]) (define_int_iterator VCVTQ_M_N_FROM_F [VCVTQ_M_N_FROM_F_S VCVTQ_M_N_FROM_F_U]) (define_int_iterator VCVTQ_M_FROM_F [VCVTQ_M_FROM_F_U VCVTQ_M_FROM_F_S]) (define_int_iterator VRMLALDAVHQ_P [VRMLALDAVHQ_P_S VRMLALDAVHQ_P_U]) @@ -2956,6 +2956,8 @@ (define_int_iterator VCVTxQ_F16_F32 [VCVTBQ_F16_F32 VCVTTQ_F16_F32]) (define_int_iterator VCVTxQ_F32_F16 [VCVTBQ_F32_F16 VCVTTQ_F32_F16]) (define_int_iterator VCVTxQ_M_F16_F32 [VCVTBQ_M_F16_F32 VCVTTQ_M_F16_F32]) (define_int_iterator VCVTxQ_M_F32_F16 [VCVTBQ_M_F32_F16 VCVTTQ_M_F32_F16]) +(define_int_iterator VCVTxQ [VCVTAQ_S VCVTAQ_U VCVTMQ_S VCVTMQ_U VCVTNQ_S VCVTNQ_U VCVTPQ_S VCVTPQ_U]) +(define_int_iterator VCVTxQ_M [VCVTAQ_M_S VCVTAQ_M_U VCVTMQ_M_S VCVTMQ_M_U VCVTNQ_M_S VCVTNQ_M_U VCVTPQ_M_S VCVTPQ_M_U]) (define_int_iterator DLSTP [DLSTP8 DLSTP16 DLSTP32 DLSTP64]) 
(define_int_iterator LETP [LETP8 LETP16 LETP32 diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 7a05a216516..cded22be1ee 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -415,63 +415,21 @@ (define_insn "@mve_q_" (set_attr "type" "mve_move") ]) -;; -;; [vcvtpq_s, vcvtpq_u]) -;; -(define_insn "mve_vcvtpq_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTPQ)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtp.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtpq_")) - (set_attr "type" "mve_move") -]) - -;; -;; [vcvtnq_s, vcvtnq_u]) -;; -(define_insn "mve_vcvtnq_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTNQ)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtn.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtnq_")) - (set_attr "type" "mve_move") -]) - -;; -;; [vcvtmq_s, vcvtmq_u]) -;; -(define_insn "mve_vcvtmq_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTMQ)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvtm.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtmq_")) - (set_attr "type" "mve_move") -]) - ;; ;; [vcvtaq_u, vcvtaq_s]) +;; [vcvtmq_s, vcvtmq_u]) +;; [vcvtnq_s, vcvtnq_u]) +;; [vcvtpq_s, vcvtpq_u]) ;; -(define_insn "mve_vcvtaq_" +(define_insn "@mve_q_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand: 1 "s_register_operand" "w")] - VCVTAQ)) + VCVTxQ)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcvta.%#.f%#\t%q0, %q1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtaq_")) + ".%#.f%#\t%q0, %q1" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_")) (set_attr "type" 
"mve_move") ]) @@ -1628,18 +1586,21 @@ (define_insn "@mve_vcmpq_m_f" ;; ;; [vcvtaq_m_u, vcvtaq_m_s]) +;; [vcvtmq_m_s, vcvtmq_m_u]) +;; [vcvtnq_m_s, vcvtnq_m_u]) +;; [vcvtpq_m_u, vcvtpq_m_s]) ;; -(define_insn "mve_vcvtaq_m_" +(define_insn "@mve_q_m_" [ (set (match_operand:MVE_5 0 "s_register_operand" "=w") (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") (match_operand: 2 "s_register_operand" "w") (match_operand: 3 "vpr_register_operand" "Up")] - VCVTAQ_M)) + VCVTxQ_M)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtat.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtaq_")) + "vpst\;t.%#.f%#\t%q0, %q2" + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2539,56 +2500,6 @@ (define_insn "@mve_q_p_v4si" (set_attr "type" "mve_move") (set_attr "length""8")]) -;; -;; [vcvtmq_m_s, vcvtmq_m_u]) -;; -(define_insn "mve_vcvtmq_m_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") - (match_operand: 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTMQ_M)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtmt.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtmq_")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcvtpq_m_u, vcvtpq_m_s]) -;; -(define_insn "mve_vcvtpq_m_" - [ - (set (match_operand:MVE_5 0 "s_register_operand" "=w") - (unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0") - (match_operand: 2 "s_register_operand" "w") - (match_operand: 3 "vpr_register_operand" "Up")] - VCVTPQ_M)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcvtpt.%#.f%#\t%q0, %q2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtpq_")) - (set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcvtnq_m_s, vcvtnq_m_u]) -;; 
-(define_insn "mve_vcvtnq_m_"
-  [
-   (set (match_operand:MVE_5 0 "s_register_operand" "=w")
-	(unspec:MVE_5 [(match_operand:MVE_5 1 "s_register_operand" "0")
-		       (match_operand: 2 "s_register_operand" "w")
-		       (match_operand: 3 "vpr_register_operand" "Up")]
-	 VCVTNQ_M))
-  ]
-  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vcvtnt.%#.f%#\t%q0, %q2"
-  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vcvtnq_"))
-  (set_attr "type" "mve_move")
-  (set_attr "length""8")])
 
 ;;
 ;; [vcvtq_m_n_from_f_s, vcvtq_m_n_from_f_u])

From patchwork Thu Jul 11 21:43:01 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959551
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 11/15] arm: [MVE intrinsics] add vcvtx shape
Date: Thu, 11 Jul 2024 21:43:01 +0000
Message-Id: <20240711214305.3193022-11-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

This patch adds the vcvtx shape description for vcvtaq, vcvtmq,
vcvtnq, vcvtpq.

2024-07-11  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-shapes.cc (vcvtx): New.
	* config/arm/arm-mve-builtins-shapes.h (vcvtx): New.

---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 59 +++++++++++++++++++++++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 60 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index c311f255e1b..7937d401696 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -2116,6 +2116,65 @@ struct vcvt_f32_f16_def : public nonoverloaded_base
 };
 SHAPE (vcvt_f32_f16)
 
+/* _t foo_t0[_t1](_t)
+
+   Example: vcvtaq.
+ int16x8_t [__arm_]vcvtaq_s16_f16(float16x8_t a) + int16x8_t [__arm_]vcvtaq_m[_s16_f16](int16x8_t inactive, float16x8_t a, mve_pred16_t p) + int16x8_t [__arm_]vcvtaq_x_s16_f16(float16x8_t a, mve_pred16_t p) +*/ +struct vcvtx_def : public overloaded_base<0> +{ + bool + explicit_type_suffix_p (unsigned int i, enum predication_index pred, + enum mode_suffix_index, + type_suffix_info type_info) const override + { + return pred != PRED_m; + } + + bool + skip_overload_p (enum predication_index pred, enum mode_suffix_index mode) + const override + { + return pred != PRED_m; + } + + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + build_all (b, "v0,v1", group, MODE_none, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index from_type; + tree res; + + if (!r.check_gp_argument (1, i, nargs) + || (from_type + = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + type_suffix_index to_type; + + gcc_assert (r.pred == PRED_m); + + /* Get the return type from the 'inactive' argument. */ + to_type = r.infer_vector_type (0); + + if ((res = r.lookup_form (r.mode_suffix_id, to_type, from_type))) + return res; + + return r.report_no_such_form (from_type); + } +}; +SHAPE (vcvtx) + /* _t vfoo[_t0](_t, _t, mve_pred16_t) i.e. 
a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 50157b57571..ef497b6c97a 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -80,6 +80,7 @@ namespace arm_mve extern const function_shape *const vcvt; extern const function_shape *const vcvt_f16_f32; extern const function_shape *const vcvt_f32_f16; + extern const function_shape *const vcvtx; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */ From patchwork Thu Jul 11 21:43:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1959559 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=B3xLv15o; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WKpJ93Rklz1xqx for ; Fri, 12 Jul 2024 07:48:25 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A93C13839166 for ; Thu, 11 Jul 2024 21:48:23 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from 
mail-oo1-xc2e.google.com (mail-oo1-xc2e.google.com [IPv6:2607:f8b0:4864:20::c2e]) by sourceware.org (Postfix) with ESMTPS id 19EB33875DED for ; Thu, 11 Jul 2024 21:43:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 19EB33875DED Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 19EB33875DED Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::c2e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1720734216; cv=none; b=wvruTXxRawRUbYsYey3Y4jc2iME+M0rWpLftbSLmo6FChAR6KkJnfgvDh4hYmKN1dri8KQx/0l+JzVwZqxlPMIY7RPsOTbeAeOlzLdbh/EhVZxIkG4ImQJWtK8xiOkTG2ccU0eNWl0CD2fEOP3hEe0o2elDFqxHxDsip6KXSa88= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1720734216; c=relaxed/simple; bh=4n6LfUPar/xJMWntkGFK/LqhLykYvr61blPYU5cadeM=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=hyJUYrN9R5MBjFghf/nxV4TkTDrkcM/XLd6WeMGDL30xIgKuDdwVH4WdevGBxelNud/g7pDrUntEreGl7rYsCBYKQa1n4AehNx5ffzgl2x2rj1qXU/vKdFc33t9RUmZvZAlPokt/Y04xCsOvuhRMOP6553UOLEsd8PEsBXUCWNg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oo1-xc2e.google.com with SMTP id 006d021491bc7-5c47c41eb84so754082eaf.0 for ; Thu, 11 Jul 2024 14:43:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1720734211; x=1721339011; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=o9T1ikEsdhcGpBgEXgtVw30HU0VvJZo9Gq/vlNHTZOQ=; b=B3xLv15otzOpS4tbTweSP1zNEWp3bNA7OezqZicK5S9Q3YTvwJBG1NGum372+UlW3v R08Qp6j1kGa/Rv6gSHo57+UisEba1ZErLxoOdEXUZ6s2Wt4vPf/2krU5bySjiDPq+SpQ qVxtP3sWmJbJLEXNUI0ss5SSsxnOM7vcaw30ky3g0S8wF/2eJFQuxl0EHAGB+GeXu46f zCKJrnqM41rSI7dSgM11AFwZgCLBsz7aCMfeQN1g4NEKjBlLFjSqLoM5qZA4cySzagGE 
SKm1IrkNqXMfisEFUL8RxNQJ0pMBiFyOZL5cb4YLFgme6UpUIYgjv2K2Dqwrc9qCmK0X SEyw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720734211; x=1721339011; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=o9T1ikEsdhcGpBgEXgtVw30HU0VvJZo9Gq/vlNHTZOQ=; b=j/QT3C6Q7Vj70AtcK/F/JDIC9wRgEGIA7g8MmZw378K7YNi/YW2s+nAqDc5K0Vix39 wVPuoE/xXF4DjSwcd/81kH9A8xPVRgMqBf9AxX5RzKl7MBQHvNLn1pb4fPG2//qAMjC6 1nbGWoAGHI0UgUddaW1/nUoITOeXfAlasr91sfV2haiMYzGT9F//55mbTMvMLuxYBnn1 VZDp6/Wpy4UBn2cc4bTUZrsTG9Fhm3z+OYyqvdmo+clcym/LlvaueMXknNhiebgO+NoY wR7x/E9SFR5WJMGYLmgd1M98jYe4PszOM/jjo57rSX9TJNHMMU3cvXf+dYH2FjtU0W/T cuJg== X-Gm-Message-State: AOJu0YxB5U8k0Zwef6txeJ7a5lhSA380PPE4h6xBjj/Xm1SgOUQOb0v3 jc181JG7R3U7X5NCPqGVrORA/EdhbmnC189mYDMEYf7S4fH4A8uDmJvsd5NNQvFQm0jTG5L1vPl gMu2AEA== X-Google-Smtp-Source: AGHT+IFyYiR+oCrvZTDw7u6KLEdhIVppENvxl1kmAiEGw0G9B3hKiIfJA2MXXzkCwBqnp3QcFNvR5w== X-Received: by 2002:a05:6820:1e81:b0:5c4:177a:23b1 with SMTP id 006d021491bc7-5cce52ac13emr157929eaf.8.1720734210462; Thu, 11 Jul 2024 14:43:30 -0700 (PDT) Received: from localhost.localdomain ([139.178.84.207]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-5c7b606db30sm540950eaf.8.2024.07.11.14.43.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Jul 2024 14:43:29 -0700 (PDT) From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH 12/15] arm: [MVE intrinsics] rework vcvtaq vcvtmq vcvtnq vcvtpq Date: Thu, 11 Jul 2024 21:43:02 +0000 Message-Id: <20240711214305.3193022-12-christophe.lyon@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, 
DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Implement vcvtaq vcvtmq vcvtnq vcvtpq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.def (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.h: (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins.cc (cvtx): New type. * config/arm/arm_mve.h (vcvtaq_m): Delete. (vcvtmq_m): Delete. (vcvtnq_m): Delete. (vcvtpq_m): Delete. (vcvtaq_s16_f16): Delete. (vcvtaq_s32_f32): Delete. (vcvtnq_s16_f16): Delete. (vcvtnq_s32_f32): Delete. (vcvtpq_s16_f16): Delete. (vcvtpq_s32_f32): Delete. (vcvtmq_s16_f16): Delete. (vcvtmq_s32_f32): Delete. (vcvtpq_u16_f16): Delete. (vcvtpq_u32_f32): Delete. (vcvtnq_u16_f16): Delete. (vcvtnq_u32_f32): Delete. (vcvtmq_u16_f16): Delete. (vcvtmq_u32_f32): Delete. (vcvtaq_u16_f16): Delete. (vcvtaq_u32_f32): Delete. (vcvtaq_m_s16_f16): Delete. (vcvtaq_m_u16_f16): Delete. (vcvtaq_m_s32_f32): Delete. (vcvtaq_m_u32_f32): Delete. (vcvtmq_m_s16_f16): Delete. (vcvtnq_m_s16_f16): Delete. (vcvtpq_m_s16_f16): Delete. (vcvtmq_m_u16_f16): Delete. (vcvtnq_m_u16_f16): Delete. (vcvtpq_m_u16_f16): Delete. (vcvtmq_m_s32_f32): Delete. (vcvtnq_m_s32_f32): Delete. (vcvtpq_m_s32_f32): Delete. (vcvtmq_m_u32_f32): Delete. (vcvtnq_m_u32_f32): Delete. (vcvtpq_m_u32_f32): Delete. (vcvtaq_x_s16_f16): Delete. (vcvtaq_x_s32_f32): Delete. (vcvtaq_x_u16_f16): Delete. 
(vcvtaq_x_u32_f32): Delete. (vcvtnq_x_s16_f16): Delete. (vcvtnq_x_s32_f32): Delete. (vcvtnq_x_u16_f16): Delete. (vcvtnq_x_u32_f32): Delete. (vcvtpq_x_s16_f16): Delete. (vcvtpq_x_s32_f32): Delete. (vcvtpq_x_u16_f16): Delete. (vcvtpq_x_u32_f32): Delete. (vcvtmq_x_s16_f16): Delete. (vcvtmq_x_s32_f32): Delete. (vcvtmq_x_u16_f16): Delete. (vcvtmq_x_u32_f32): Delete. (__arm_vcvtpq_u16_f16): Delete. (__arm_vcvtpq_u32_f32): Delete. (__arm_vcvtnq_u16_f16): Delete. (__arm_vcvtnq_u32_f32): Delete. (__arm_vcvtmq_u16_f16): Delete. (__arm_vcvtmq_u32_f32): Delete. (__arm_vcvtaq_u16_f16): Delete. (__arm_vcvtaq_u32_f32): Delete. (__arm_vcvtaq_s16_f16): Delete. (__arm_vcvtaq_s32_f32): Delete. (__arm_vcvtnq_s16_f16): Delete. (__arm_vcvtnq_s32_f32): Delete. (__arm_vcvtpq_s16_f16): Delete. (__arm_vcvtpq_s32_f32): Delete. (__arm_vcvtmq_s16_f16): Delete. (__arm_vcvtmq_s32_f32): Delete. (__arm_vcvtaq_m_s16_f16): Delete. (__arm_vcvtaq_m_u16_f16): Delete. (__arm_vcvtaq_m_s32_f32): Delete. (__arm_vcvtaq_m_u32_f32): Delete. (__arm_vcvtmq_m_s16_f16): Delete. (__arm_vcvtnq_m_s16_f16): Delete. (__arm_vcvtpq_m_s16_f16): Delete. (__arm_vcvtmq_m_u16_f16): Delete. (__arm_vcvtnq_m_u16_f16): Delete. (__arm_vcvtpq_m_u16_f16): Delete. (__arm_vcvtmq_m_s32_f32): Delete. (__arm_vcvtnq_m_s32_f32): Delete. (__arm_vcvtpq_m_s32_f32): Delete. (__arm_vcvtmq_m_u32_f32): Delete. (__arm_vcvtnq_m_u32_f32): Delete. (__arm_vcvtpq_m_u32_f32): Delete. (__arm_vcvtaq_x_s16_f16): Delete. (__arm_vcvtaq_x_s32_f32): Delete. (__arm_vcvtaq_x_u16_f16): Delete. (__arm_vcvtaq_x_u32_f32): Delete. (__arm_vcvtnq_x_s16_f16): Delete. (__arm_vcvtnq_x_s32_f32): Delete. (__arm_vcvtnq_x_u16_f16): Delete. (__arm_vcvtnq_x_u32_f32): Delete. (__arm_vcvtpq_x_s16_f16): Delete. (__arm_vcvtpq_x_s32_f32): Delete. (__arm_vcvtpq_x_u16_f16): Delete. (__arm_vcvtpq_x_u32_f32): Delete. (__arm_vcvtmq_x_s16_f16): Delete. (__arm_vcvtmq_x_s32_f32): Delete. (__arm_vcvtmq_x_u16_f16): Delete. (__arm_vcvtmq_x_u32_f32): Delete. (__arm_vcvtaq_m): Delete. 
(__arm_vcvtmq_m): Delete. (__arm_vcvtnq_m): Delete. (__arm_vcvtpq_m): Delete. --- gcc/config/arm/arm-mve-builtins-base.cc | 4 + gcc/config/arm/arm-mve-builtins-base.def | 4 + gcc/config/arm/arm-mve-builtins-base.h | 4 + gcc/config/arm/arm-mve-builtins.cc | 9 + gcc/config/arm/arm_mve.h | 533 ----------------------- 5 files changed, 21 insertions(+), 533 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 760378c91b1..281f3749bce 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -506,6 +506,10 @@ FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNK FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN)) FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ) FUNCTION (vcvtq, vcvtq_impl,) +FUNCTION_WITHOUT_N_NO_F (vcvtaq, VCVTAQ) +FUNCTION_WITHOUT_N_NO_F (vcvtmq, VCVTMQ) +FUNCTION_WITHOUT_N_NO_F (vcvtnq, VCVTNQ) +FUNCTION_WITHOUT_N_NO_F (vcvtpq, VCVTPQ) FUNCTION (vcvtbq, vcvtxq_impl, (VCVTBQ_F16_F32, VCVTBQ_M_F16_F32, VCVTBQ_F32_F16, VCVTBQ_M_F32_F16)) FUNCTION (vcvttq, vcvtxq_impl, (VCVTTQ_F16_F32, VCVTTQ_M_F16_F32, VCVTTQ_F32_F16, VCVTTQ_M_F32_F16)) FUNCTION_ONLY_N (vdupq, VDUPQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 5a6f1a59c7c..7821007fe2c 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -180,6 +180,10 @@ DEF_MVE_FUNCTION (vcmpltq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpneq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcreateq, create, all_float, none) DEF_MVE_FUNCTION (vcvtq, vcvt, cvt, mx_or_none) +DEF_MVE_FUNCTION (vcvtaq, vcvtx, cvtx, mx_or_none) +DEF_MVE_FUNCTION (vcvtmq, vcvtx, cvtx, mx_or_none) +DEF_MVE_FUNCTION (vcvtnq, vcvtx, cvtx, mx_or_none) +DEF_MVE_FUNCTION (vcvtpq, vcvtx, cvtx, mx_or_none) 
DEF_MVE_FUNCTION (vcvtbq, vcvt_f16_f32, cvt_f16_f32, mx_or_none) DEF_MVE_FUNCTION (vcvtbq, vcvt_f32_f16, cvt_f32_f16, mx_or_none) DEF_MVE_FUNCTION (vcvttq, vcvt_f16_f32, cvt_f16_f32, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 25467769d05..89e0f631c32 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -55,6 +55,10 @@ extern const function_base *const vcmulq_rot270; extern const function_base *const vcmulq_rot90; extern const function_base *const vcreateq; extern const function_base *const vcvtq; +extern const function_base *const vcvtaq; +extern const function_base *const vcvtmq; +extern const function_base *const vcvtnq; +extern const function_base *const vcvtpq; extern const function_base *const vcvtbq; extern const function_base *const vcvttq; extern const function_base *const vdupq; diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 4c554a47d85..07e63df35e4 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -227,6 +227,14 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { #define TYPES_cvt_f32_f16(S, D) \ D (f32, f16) +/* All the type combinations allowed by vcvtXq. 
*/ +#define TYPES_cvtx(S, D) \ + D (s16, f16), \ + D (s32, f32), \ + \ + D (u16, f16), \ + D (u32, f32) + #define TYPES_reinterpret_signed1(D, A) \ D (A, s8), D (A, s16), D (A, s32), D (A, s64) @@ -309,6 +317,7 @@ DEF_MVE_TYPES_ARRAY (signed_32); DEF_MVE_TYPES_ARRAY (cvt); DEF_MVE_TYPES_ARRAY (cvt_f16_f32); DEF_MVE_TYPES_ARRAY (cvt_f32_f16); +DEF_MVE_TYPES_ARRAY (cvtx); DEF_MVE_TYPES_ARRAY (reinterpret_integer); DEF_MVE_TYPES_ARRAY (reinterpret_float); diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 5c35e08d754..448407627e9 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -137,10 +137,6 @@ #define vsetq_lane(__a, __b, __idx) __arm_vsetq_lane(__a, __b, __idx) #define vgetq_lane(__a, __idx) __arm_vgetq_lane(__a, __idx) #define vshlcq_m(__a, __b, __imm, __p) __arm_vshlcq_m(__a, __b, __imm, __p) -#define vcvtaq_m(__inactive, __a, __p) __arm_vcvtaq_m(__inactive, __a, __p) -#define vcvtmq_m(__inactive, __a, __p) __arm_vcvtmq_m(__inactive, __a, __p) -#define vcvtnq_m(__inactive, __a, __p) __arm_vcvtnq_m(__inactive, __a, __p) -#define vcvtpq_m(__inactive, __a, __p) __arm_vcvtpq_m(__inactive, __a, __p) #define vst4q_s8( __addr, __value) __arm_vst4q_s8( __addr, __value) @@ -151,22 +147,6 @@ #define vst4q_u32( __addr, __value) __arm_vst4q_u32( __addr, __value) #define vst4q_f16( __addr, __value) __arm_vst4q_f16( __addr, __value) #define vst4q_f32( __addr, __value) __arm_vst4q_f32( __addr, __value) -#define vcvtaq_s16_f16(__a) __arm_vcvtaq_s16_f16(__a) -#define vcvtaq_s32_f32(__a) __arm_vcvtaq_s32_f32(__a) -#define vcvtnq_s16_f16(__a) __arm_vcvtnq_s16_f16(__a) -#define vcvtnq_s32_f32(__a) __arm_vcvtnq_s32_f32(__a) -#define vcvtpq_s16_f16(__a) __arm_vcvtpq_s16_f16(__a) -#define vcvtpq_s32_f32(__a) __arm_vcvtpq_s32_f32(__a) -#define vcvtmq_s16_f16(__a) __arm_vcvtmq_s16_f16(__a) -#define vcvtmq_s32_f32(__a) __arm_vcvtmq_s32_f32(__a) -#define vcvtpq_u16_f16(__a) __arm_vcvtpq_u16_f16(__a) -#define vcvtpq_u32_f32(__a) 
__arm_vcvtpq_u32_f32(__a) -#define vcvtnq_u16_f16(__a) __arm_vcvtnq_u16_f16(__a) -#define vcvtnq_u32_f32(__a) __arm_vcvtnq_u32_f32(__a) -#define vcvtmq_u16_f16(__a) __arm_vcvtmq_u16_f16(__a) -#define vcvtmq_u32_f32(__a) __arm_vcvtmq_u32_f32(__a) -#define vcvtaq_u16_f16(__a) __arm_vcvtaq_u16_f16(__a) -#define vcvtaq_u32_f32(__a) __arm_vcvtaq_u32_f32(__a) #define vctp16q(__a) __arm_vctp16q(__a) #define vctp32q(__a) __arm_vctp32q(__a) #define vctp64q(__a) __arm_vctp64q(__a) @@ -200,28 +180,12 @@ #define vbicq_m_n_s32(__a, __imm, __p) __arm_vbicq_m_n_s32(__a, __imm, __p) #define vbicq_m_n_u16(__a, __imm, __p) __arm_vbicq_m_n_u16(__a, __imm, __p) #define vbicq_m_n_u32(__a, __imm, __p) __arm_vbicq_m_n_u32(__a, __imm, __p) -#define vcvtaq_m_s16_f16(__inactive, __a, __p) __arm_vcvtaq_m_s16_f16(__inactive, __a, __p) -#define vcvtaq_m_u16_f16(__inactive, __a, __p) __arm_vcvtaq_m_u16_f16(__inactive, __a, __p) -#define vcvtaq_m_s32_f32(__inactive, __a, __p) __arm_vcvtaq_m_s32_f32(__inactive, __a, __p) -#define vcvtaq_m_u32_f32(__inactive, __a, __p) __arm_vcvtaq_m_u32_f32(__inactive, __a, __p) #define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm) #define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm) #define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm) #define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vcvtmq_m_s16_f16(__inactive, __a, __p) __arm_vcvtmq_m_s16_f16(__inactive, __a, __p) -#define vcvtnq_m_s16_f16(__inactive, __a, __p) __arm_vcvtnq_m_s16_f16(__inactive, __a, __p) -#define vcvtpq_m_s16_f16(__inactive, __a, __p) __arm_vcvtpq_m_s16_f16(__inactive, __a, __p) -#define vcvtmq_m_u16_f16(__inactive, __a, __p) __arm_vcvtmq_m_u16_f16(__inactive, __a, __p) -#define vcvtnq_m_u16_f16(__inactive, __a, __p) __arm_vcvtnq_m_u16_f16(__inactive, __a, __p) -#define 
vcvtpq_m_u16_f16(__inactive, __a, __p) __arm_vcvtpq_m_u16_f16(__inactive, __a, __p) -#define vcvtmq_m_s32_f32(__inactive, __a, __p) __arm_vcvtmq_m_s32_f32(__inactive, __a, __p) -#define vcvtnq_m_s32_f32(__inactive, __a, __p) __arm_vcvtnq_m_s32_f32(__inactive, __a, __p) -#define vcvtpq_m_s32_f32(__inactive, __a, __p) __arm_vcvtpq_m_s32_f32(__inactive, __a, __p) -#define vcvtmq_m_u32_f32(__inactive, __a, __p) __arm_vcvtmq_m_u32_f32(__inactive, __a, __p) -#define vcvtnq_m_u32_f32(__inactive, __a, __p) __arm_vcvtnq_m_u32_f32(__inactive, __a, __p) -#define vcvtpq_m_u32_f32(__inactive, __a, __p) __arm_vcvtpq_m_u32_f32(__inactive, __a, __p) #define vbicq_m_s8(__inactive, __a, __b, __p) __arm_vbicq_m_s8(__inactive, __a, __b, __p) #define vbicq_m_s32(__inactive, __a, __b, __p) __arm_vbicq_m_s32(__inactive, __a, __b, __p) #define vbicq_m_s16(__inactive, __a, __b, __p) __arm_vbicq_m_s16(__inactive, __a, __b, __p) @@ -532,22 +496,6 @@ #define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) #define vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) #define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vcvtaq_x_s16_f16(__a, __p) __arm_vcvtaq_x_s16_f16(__a, __p) -#define vcvtaq_x_s32_f32(__a, __p) __arm_vcvtaq_x_s32_f32(__a, __p) -#define vcvtaq_x_u16_f16(__a, __p) __arm_vcvtaq_x_u16_f16(__a, __p) -#define vcvtaq_x_u32_f32(__a, __p) __arm_vcvtaq_x_u32_f32(__a, __p) -#define vcvtnq_x_s16_f16(__a, __p) __arm_vcvtnq_x_s16_f16(__a, __p) -#define vcvtnq_x_s32_f32(__a, __p) __arm_vcvtnq_x_s32_f32(__a, __p) -#define vcvtnq_x_u16_f16(__a, __p) __arm_vcvtnq_x_u16_f16(__a, __p) -#define vcvtnq_x_u32_f32(__a, __p) __arm_vcvtnq_x_u32_f32(__a, __p) -#define vcvtpq_x_s16_f16(__a, __p) __arm_vcvtpq_x_s16_f16(__a, __p) -#define vcvtpq_x_s32_f32(__a, __p) __arm_vcvtpq_x_s32_f32(__a, __p) -#define vcvtpq_x_u16_f16(__a, __p) __arm_vcvtpq_x_u16_f16(__a, __p) -#define vcvtpq_x_u32_f32(__a, __p) __arm_vcvtpq_x_u32_f32(__a, __p) -#define vcvtmq_x_s16_f16(__a, 
__p) __arm_vcvtmq_x_s16_f16(__a, __p) -#define vcvtmq_x_s32_f32(__a, __p) __arm_vcvtmq_x_s32_f32(__a, __p) -#define vcvtmq_x_u16_f16(__a, __p) __arm_vcvtmq_x_u16_f16(__a, __p) -#define vcvtmq_x_u32_f32(__a, __p) __arm_vcvtmq_x_u32_f32(__a, __p) #define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p) #define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p) #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) @@ -3690,118 +3638,6 @@ __arm_vst4q_f32 (float32_t * __addr, float32x4x4_t __value) __builtin_mve_vst4qv4sf (__addr, __rv.__o); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtpq_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtpq_uv4si (__a); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtnq_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtnq_uv4si (__a); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtmq_uv8hi (__a); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtmq_uv4si (__a); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_u16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtaq_uv8hi (__a); -} - 
-__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_u32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtaq_uv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtaq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtaq_sv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtnq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtnq_sv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtpq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtpq_sv4si (__a); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_s16_f16 (float16x8_t __a) -{ - return __builtin_mve_vcvtmq_sv8hi (__a); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_s32_f32 (float32x4_t __a) -{ - return __builtin_mve_vcvtmq_sv4si (__a); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_f16 (float16x8_t __a, float16x8_t __b) @@ -3830,119 +3666,6 @@ 
__arm_vbicq_f32 (float32x4_t __a, float32x4_t __b) return __builtin_mve_vbicq_fv4sf (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv4si (__inactive, __a, __p); -} - - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ 
((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv8hi (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return 
__builtin_mve_vcvtnq_m_uv4si (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m_u32_f32 (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv4si (__inactive, __a, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -4201,118 +3924,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtaq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv8hi (__arm_vuninitializedq_s16 (), 
__a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtnq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtpq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) -{ 
- return __builtin_mve_vcvtmq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_s32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_u16_f16 (float16x8_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_x_u32_f32 (float32x4_t __a, mve_pred16_t __p) -{ - return __builtin_mve_vcvtmq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -6771,118 +6382,6 @@ __arm_vbicq (float32x4_t __a, float32x4_t __b) return __arm_vbicq_f32 (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtaq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, 
__artificial__)) -__arm_vcvtaq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtaq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_s16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_u16_f16 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, 
__artificial__)) -__arm_vcvtnq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_s32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtmq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtmq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtnq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtnq_m_u32_f32 (__inactive, __a, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcvtpq_m (uint32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) -{ - return __arm_vcvtpq_m_u32_f32 (__inactive, __a, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -7572,38 +7071,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) -#define __arm_vcvtaq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int 
(*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtaq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtaq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtmq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtmq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtmq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtmq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtmq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtnq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtnq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtnq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtnq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), 
__ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtnq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - -#define __arm_vcvtpq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtpq_m_s16_f16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtpq_m_s32_f32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcvtpq_m_u16_f16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcvtpq_m_u32_f32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, float32x4_t), p2));}) - #define __arm_vbicq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ From patchwork Thu Jul 11 21:43:03 2024 X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 1959549
From: Christophe Lyon To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com Cc: Christophe Lyon Subject: [PATCH 13/15] arm: [MVE intrinsics] rework vbicq Date: Thu, 11 Jul 2024 21:43:03 +0000 Message-Id: <20240711214305.3193022-13-christophe.lyon@linaro.org> In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org> References: <20240711214305.3193022-1-christophe.lyon@linaro.org> Implement vbicq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vbicq): New. * config/arm/arm-mve-builtins-base.def (vbicq): New. * config/arm/arm-mve-builtins-base.h (vbicq): New. * config/arm/arm-mve-builtins-functions.h (class unspec_based_mve_function_exact_insn_vbic): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Add support for vbicq. * config/arm/arm_mve.h (vbicq): Delete. (vbicq_m_n): Delete. (vbicq_m): Delete. (vbicq_x): Delete. (vbicq_u8): Delete. (vbicq_s8): Delete. (vbicq_u16): Delete. (vbicq_s16): Delete. (vbicq_u32): Delete. (vbicq_s32): Delete. (vbicq_n_u16): Delete. (vbicq_f16): Delete. (vbicq_n_s16): Delete. (vbicq_n_u32): Delete. (vbicq_f32): Delete. (vbicq_n_s32): Delete. (vbicq_m_n_s16): Delete. (vbicq_m_n_s32): Delete. (vbicq_m_n_u16): Delete. (vbicq_m_n_u32): Delete.
(vbicq_m_s8): Delete. (vbicq_m_s32): Delete. (vbicq_m_s16): Delete. (vbicq_m_u8): Delete. (vbicq_m_u32): Delete. (vbicq_m_u16): Delete. (vbicq_m_f32): Delete. (vbicq_m_f16): Delete. (vbicq_x_s8): Delete. (vbicq_x_s16): Delete. (vbicq_x_s32): Delete. (vbicq_x_u8): Delete. (vbicq_x_u16): Delete. (vbicq_x_u32): Delete. (vbicq_x_f16): Delete. (vbicq_x_f32): Delete. (__arm_vbicq_u8): Delete. (__arm_vbicq_s8): Delete. (__arm_vbicq_u16): Delete. (__arm_vbicq_s16): Delete. (__arm_vbicq_u32): Delete. (__arm_vbicq_s32): Delete. (__arm_vbicq_n_u16): Delete. (__arm_vbicq_n_s16): Delete. (__arm_vbicq_n_u32): Delete. (__arm_vbicq_n_s32): Delete. (__arm_vbicq_m_n_s16): Delete. (__arm_vbicq_m_n_s32): Delete. (__arm_vbicq_m_n_u16): Delete. (__arm_vbicq_m_n_u32): Delete. (__arm_vbicq_m_s8): Delete. (__arm_vbicq_m_s32): Delete. (__arm_vbicq_m_s16): Delete. (__arm_vbicq_m_u8): Delete. (__arm_vbicq_m_u32): Delete. (__arm_vbicq_m_u16): Delete. (__arm_vbicq_x_s8): Delete. (__arm_vbicq_x_s16): Delete. (__arm_vbicq_x_s32): Delete. (__arm_vbicq_x_u8): Delete. (__arm_vbicq_x_u16): Delete. (__arm_vbicq_x_u32): Delete. (__arm_vbicq_f16): Delete. (__arm_vbicq_f32): Delete. (__arm_vbicq_m_f32): Delete. (__arm_vbicq_m_f16): Delete. (__arm_vbicq_x_f16): Delete. (__arm_vbicq_x_f32): Delete. (__arm_vbicq): Delete. (__arm_vbicq_m_n): Delete. (__arm_vbicq_m): Delete. (__arm_vbicq_x): Delete. * config/arm/mve.md (mve_vbicq_u): Rename into ... (@mve_vbicq_u): ... this. (mve_vbicq_s): Rename into ... (@mve_vbicq_s): ... this. (mve_vbicq_f): Rename into ... (@mve_vbicq_f): ... this. 
--- gcc/config/arm/arm-mve-builtins-base.cc | 1 + gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm-mve-builtins-functions.h | 54 ++ gcc/config/arm/arm-mve-builtins.cc | 1 + gcc/config/arm/arm_mve.h | 574 -------------------- gcc/config/arm/mve.md | 6 +- 7 files changed, 62 insertions(+), 577 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index 281f3749bce..e33603ec1f3 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -481,6 +481,7 @@ FUNCTION_PRED_P_S_U (vaddlvq, VADDLVQ) FUNCTION_PRED_P_S_U (vaddvq, VADDVQ) FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ) FUNCTION_WITH_RTX_M (vandq, AND, VANDQ) +FUNCTION (vbicq, unspec_based_mve_function_exact_insn_vbic, (VBICQ_N_S, VBICQ_N_U, VBICQ_M_S, VBICQ_M_U, VBICQ_M_F, VBICQ_M_N_S, VBICQ_M_N_U)) FUNCTION_ONLY_N (vbrsrq, VBRSRQ) FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M, VCADDQ_ROT90_M, VCADDQ_ROT90_M_F)) FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M, VCADDQ_ROT270_M, VCADDQ_ROT270_M_F)) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 7821007fe2c..a1b3eea32d3 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -27,6 +27,7 @@ DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vaddvaq, unary_int32_acc, all_integer, p_or_none) DEF_MVE_FUNCTION (vaddvq, unary_int32, all_integer, p_or_none) DEF_MVE_FUNCTION (vandq, binary, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vbicq, binary_orrq, all_integer, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_integer, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, 
binary, all_integer, mx_or_none) @@ -161,6 +162,7 @@ DEF_MVE_FUNCTION (vabdq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vabsq, unary, all_float, mx_or_none) DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_float, mx_or_none) DEF_MVE_FUNCTION (vandq, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vbicq, binary_orrq, all_float, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_float, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 89e0f631c32..2abe640b840 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -32,6 +32,7 @@ extern const function_base *const vaddq; extern const function_base *const vaddvaq; extern const function_base *const vaddvq; extern const function_base *const vandq; +extern const function_base *const vbicq; extern const function_base *const vbrsrq; extern const function_base *const vcaddq_rot270; extern const function_base *const vcaddq_rot90; diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h index 43c4aaffeb1..a06c91b3a45 100644 --- a/gcc/config/arm/arm-mve-builtins-functions.h +++ b/gcc/config/arm/arm-mve-builtins-functions.h @@ -455,6 +455,60 @@ public: } }; +/* Map the function directly to CODE (UNSPEC, M) for vbic-like + builtins. The difference with unspec_based_mve_function_exact_insn + is that this function has vbic hardcoded for the PRED_none, + MODE_none version, rather than using an RTX. 
*/ +class unspec_based_mve_function_exact_insn_vbic : public unspec_based_mve_function_base +{ +public: + CONSTEXPR unspec_based_mve_function_exact_insn_vbic (int unspec_for_n_sint, + int unspec_for_n_uint, + int unspec_for_m_sint, + int unspec_for_m_uint, + int unspec_for_m_fp, + int unspec_for_m_n_sint, + int unspec_for_m_n_uint) + : unspec_based_mve_function_base (UNKNOWN, + UNKNOWN, + UNKNOWN, + -1, -1, -1, /* No non-predicated, no mode intrinsics. */ + unspec_for_n_sint, + unspec_for_n_uint, + -1, + unspec_for_m_sint, + unspec_for_m_uint, + unspec_for_m_fp, + unspec_for_m_n_sint, + unspec_for_m_n_uint, + -1) + {} + + rtx + expand (function_expander &e) const override + { + machine_mode mode = e.vector_mode (0); + insn_code code; + + /* No suffix, no predicate, use the right RTX code. */ + if (e.pred == PRED_none + && e.mode_suffix_id == MODE_none) + { + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_vbicq_u (mode); + else + code = code_for_mve_vbicq_s (mode); + else + code = code_for_mve_vbicq_f (mode); + + return e.use_exact_insn (code); + } + + return expand_unspec (e); + } +}; + /* Map the comparison functions. 
*/ class unspec_based_mve_function_exact_insn_vcmp : public unspec_based_mve_function_base { diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 07e63df35e4..13c666b8f6a 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -737,6 +737,7 @@ function_instance::has_inactive_argument () const return false; if (mode_suffix_id == MODE_r + || (base == functions::vbicq && mode_suffix_id == MODE_n) || base == functions::vcmlaq || base == functions::vcmlaq_rot90 || base == functions::vcmlaq_rot180 diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 448407627e9..3fd6980a58d 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -43,10 +43,7 @@ #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE #define vst4q(__addr, __value) __arm_vst4q(__addr, __value) #define vornq(__a, __b) __arm_vornq(__a, __b) -#define vbicq(__a, __b) __arm_vbicq(__a, __b) -#define vbicq_m_n(__a, __imm, __p) __arm_vbicq_m_n(__a, __imm, __p) #define vshlcq(__a, __b, __imm) __arm_vshlcq(__a, __b, __imm) -#define vbicq_m(__inactive, __a, __b, __p) __arm_vbicq_m(__inactive, __a, __b, __p) #define vornq_m(__inactive, __a, __b, __p) __arm_vornq_m(__inactive, __a, __b, __p) #define vstrbq_scatter_offset(__base, __offset, __value) __arm_vstrbq_scatter_offset(__base, __offset, __value) #define vstrbq(__addr, __value) __arm_vstrbq(__addr, __value) @@ -119,7 +116,6 @@ #define viwdupq_x_u8(__a, __b, __imm, __p) __arm_viwdupq_x_u8(__a, __b, __imm, __p) #define viwdupq_x_u16(__a, __b, __imm, __p) __arm_viwdupq_x_u16(__a, __b, __imm, __p) #define viwdupq_x_u32(__a, __b, __imm, __p) __arm_viwdupq_x_u32(__a, __b, __imm, __p) -#define vbicq_x(__a, __b, __p) __arm_vbicq_x(__a, __b, __p) #define vornq_x(__a, __b, __p) __arm_vornq_x(__a, __b, __p) #define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out) #define vadciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m(__inactive, __a, __b, 
__carry_out, __p) @@ -153,53 +149,29 @@ #define vctp8q(__a) __arm_vctp8q(__a) #define vpnot(__a) __arm_vpnot(__a) #define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b) -#define vbicq_u8(__a, __b) __arm_vbicq_u8(__a, __b) #define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b) -#define vbicq_s8(__a, __b) __arm_vbicq_s8(__a, __b) #define vornq_u16(__a, __b) __arm_vornq_u16(__a, __b) -#define vbicq_u16(__a, __b) __arm_vbicq_u16(__a, __b) #define vornq_s16(__a, __b) __arm_vornq_s16(__a, __b) -#define vbicq_s16(__a, __b) __arm_vbicq_s16(__a, __b) #define vornq_u32(__a, __b) __arm_vornq_u32(__a, __b) -#define vbicq_u32(__a, __b) __arm_vbicq_u32(__a, __b) #define vornq_s32(__a, __b) __arm_vornq_s32(__a, __b) -#define vbicq_s32(__a, __b) __arm_vbicq_s32(__a, __b) -#define vbicq_n_u16(__a, __imm) __arm_vbicq_n_u16(__a, __imm) #define vornq_f16(__a, __b) __arm_vornq_f16(__a, __b) -#define vbicq_f16(__a, __b) __arm_vbicq_f16(__a, __b) -#define vbicq_n_s16(__a, __imm) __arm_vbicq_n_s16(__a, __imm) -#define vbicq_n_u32(__a, __imm) __arm_vbicq_n_u32(__a, __imm) #define vornq_f32(__a, __b) __arm_vornq_f32(__a, __b) -#define vbicq_f32(__a, __b) __arm_vbicq_f32(__a, __b) -#define vbicq_n_s32(__a, __imm) __arm_vbicq_n_s32(__a, __imm) #define vctp8q_m(__a, __p) __arm_vctp8q_m(__a, __p) #define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p) #define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p) #define vctp16q_m(__a, __p) __arm_vctp16q_m(__a, __p) -#define vbicq_m_n_s16(__a, __imm, __p) __arm_vbicq_m_n_s16(__a, __imm, __p) -#define vbicq_m_n_s32(__a, __imm, __p) __arm_vbicq_m_n_s32(__a, __imm, __p) -#define vbicq_m_n_u16(__a, __imm, __p) __arm_vbicq_m_n_u16(__a, __imm, __p) -#define vbicq_m_n_u32(__a, __imm, __p) __arm_vbicq_m_n_u32(__a, __imm, __p) #define vshlcq_s8(__a, __b, __imm) __arm_vshlcq_s8(__a, __b, __imm) #define vshlcq_u8(__a, __b, __imm) __arm_vshlcq_u8(__a, __b, __imm) #define vshlcq_s16(__a, __b, __imm) __arm_vshlcq_s16(__a, __b, __imm) #define vshlcq_u16(__a, __b, __imm) 
__arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vbicq_m_s8(__inactive, __a, __b, __p) __arm_vbicq_m_s8(__inactive, __a, __b, __p) -#define vbicq_m_s32(__inactive, __a, __b, __p) __arm_vbicq_m_s32(__inactive, __a, __b, __p) -#define vbicq_m_s16(__inactive, __a, __b, __p) __arm_vbicq_m_s16(__inactive, __a, __b, __p) -#define vbicq_m_u8(__inactive, __a, __b, __p) __arm_vbicq_m_u8(__inactive, __a, __b, __p) -#define vbicq_m_u32(__inactive, __a, __b, __p) __arm_vbicq_m_u32(__inactive, __a, __b, __p) -#define vbicq_m_u16(__inactive, __a, __b, __p) __arm_vbicq_m_u16(__inactive, __a, __b, __p) #define vornq_m_s8(__inactive, __a, __b, __p) __arm_vornq_m_s8(__inactive, __a, __b, __p) #define vornq_m_s32(__inactive, __a, __b, __p) __arm_vornq_m_s32(__inactive, __a, __b, __p) #define vornq_m_s16(__inactive, __a, __b, __p) __arm_vornq_m_s16(__inactive, __a, __b, __p) #define vornq_m_u8(__inactive, __a, __b, __p) __arm_vornq_m_u8(__inactive, __a, __b, __p) #define vornq_m_u32(__inactive, __a, __b, __p) __arm_vornq_m_u32(__inactive, __a, __b, __p) #define vornq_m_u16(__inactive, __a, __b, __p) __arm_vornq_m_u16(__inactive, __a, __b, __p) -#define vbicq_m_f32(__inactive, __a, __b, __p) __arm_vbicq_m_f32(__inactive, __a, __b, __p) -#define vbicq_m_f16(__inactive, __a, __b, __p) __arm_vbicq_m_f16(__inactive, __a, __b, __p) #define vornq_m_f32(__inactive, __a, __b, __p) __arm_vornq_m_f32(__inactive, __a, __b, __p) #define vornq_m_f16(__inactive, __a, __b, __p) __arm_vornq_m_f16(__inactive, __a, __b, __p) #define vstrbq_s8( __addr, __value) __arm_vstrbq_s8( __addr, __value) @@ -484,20 +456,12 @@ #define viwdupq_x_wb_u8(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u8(__a, __b, __imm, __p) #define viwdupq_x_wb_u16(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u16(__a, __b, __imm, __p) #define viwdupq_x_wb_u32(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u32(__a, __b, 
__imm, __p) -#define vbicq_x_s8(__a, __b, __p) __arm_vbicq_x_s8(__a, __b, __p) -#define vbicq_x_s16(__a, __b, __p) __arm_vbicq_x_s16(__a, __b, __p) -#define vbicq_x_s32(__a, __b, __p) __arm_vbicq_x_s32(__a, __b, __p) -#define vbicq_x_u8(__a, __b, __p) __arm_vbicq_x_u8(__a, __b, __p) -#define vbicq_x_u16(__a, __b, __p) __arm_vbicq_x_u16(__a, __b, __p) -#define vbicq_x_u32(__a, __b, __p) __arm_vbicq_x_u32(__a, __b, __p) #define vornq_x_s8(__a, __b, __p) __arm_vornq_x_s8(__a, __b, __p) #define vornq_x_s16(__a, __b, __p) __arm_vornq_x_s16(__a, __b, __p) #define vornq_x_s32(__a, __b, __p) __arm_vornq_x_s32(__a, __b, __p) #define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) #define vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) #define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vbicq_x_f16(__a, __b, __p) __arm_vbicq_x_f16(__a, __b, __p) -#define vbicq_x_f32(__a, __b, __p) __arm_vbicq_x_f32(__a, __b, __p) #define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) #define vornq_x_f32(__a, __b, __p) __arm_vornq_x_f32(__a, __b, __p) #define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) @@ -708,13 +672,6 @@ __arm_vornq_u8 (uint8x16_t __a, uint8x16_t __b) return __builtin_mve_vornq_uv16qi (__a, __b); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_u8 (uint8x16_t __a, uint8x16_t __b) -{ - return __builtin_mve_vbicq_uv16qi (__a, __b); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_s8 (int8x16_t __a, int8x16_t __b) @@ -722,13 +679,6 @@ __arm_vornq_s8 (int8x16_t __a, int8x16_t __b) return __builtin_mve_vornq_sv16qi (__a, __b); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vbicq_sv16qi (__a, __b); -} - 
__extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_u16 (uint16x8_t __a, uint16x8_t __b) @@ -736,13 +686,6 @@ __arm_vornq_u16 (uint16x8_t __a, uint16x8_t __b) return __builtin_mve_vornq_uv8hi (__a, __b); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_u16 (uint16x8_t __a, uint16x8_t __b) -{ - return __builtin_mve_vbicq_uv8hi (__a, __b); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_s16 (int16x8_t __a, int16x8_t __b) @@ -750,13 +693,6 @@ __arm_vornq_s16 (int16x8_t __a, int16x8_t __b) return __builtin_mve_vornq_sv8hi (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_s16 (int16x8_t __a, int16x8_t __b) -{ - return __builtin_mve_vbicq_sv8hi (__a, __b); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_u32 (uint32x4_t __a, uint32x4_t __b) @@ -764,13 +700,6 @@ __arm_vornq_u32 (uint32x4_t __a, uint32x4_t __b) return __builtin_mve_vornq_uv4si (__a, __b); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_u32 (uint32x4_t __a, uint32x4_t __b) -{ - return __builtin_mve_vbicq_uv4si (__a, __b); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_s32 (int32x4_t __a, int32x4_t __b) @@ -778,41 +707,6 @@ __arm_vornq_s32 (int32x4_t __a, int32x4_t __b) return __builtin_mve_vornq_sv4si (__a, __b); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_s32 (int32x4_t __a, int32x4_t __b) -{ - return __builtin_mve_vbicq_sv4si (__a, __b); -} - -__extension__ extern __inline uint16x8_t 
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_n_u16 (uint16x8_t __a, const int __imm)
-{
-  return __builtin_mve_vbicq_n_uv8hi (__a, __imm);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_n_s16 (int16x8_t __a, const int __imm)
-{
-  return __builtin_mve_vbicq_n_sv8hi (__a, __imm);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_n_u32 (uint32x4_t __a, const int __imm)
-{
-  return __builtin_mve_vbicq_n_uv4si (__a, __imm);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_n_s32 (int32x4_t __a, const int __imm)
-{
-  return __builtin_mve_vbicq_n_sv4si (__a, __imm);
-}
-
 __extension__ extern __inline mve_pred16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vctp8q_m (uint32_t __a, mve_pred16_t __p)
@@ -841,34 +735,6 @@ __arm_vctp16q_m (uint32_t __a, mve_pred16_t __p)
   return __builtin_mve_vctp16q_mv8bi (__a, __p);
 }
 
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n_s16 (int16x8_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_n_sv8hi (__a, __imm, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n_s32 (int32x4_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_n_sv4si (__a, __imm, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n_u16 (uint16x8_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_n_uv8hi (__a, __imm, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n_u32 (uint32x4_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_n_uv4si (__a, __imm, __p);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vshlcq_s8 (int8x16_t __a, uint32_t * __b, const int __imm)
@@ -923,48 +789,6 @@ __arm_vshlcq_u32 (uint32x4_t __a, uint32_t * __b, const int __imm)
   return __res;
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_sv16qi (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_sv4si (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_sv8hi (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv16qi (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv4si (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv8hi (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
@@ -2834,48 +2658,6 @@ __arm_viwdupq_x_wb_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16
   return __res;
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
@@ -3645,13 +3427,6 @@ __arm_vornq_f16 (float16x8_t __a, float16x8_t __b)
   return __builtin_mve_vornq_fv8hf (__a, __b);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_f16 (float16x8_t __a, float16x8_t __b)
-{
-  return __builtin_mve_vbicq_fv8hf (__a, __b);
-}
-
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq_f32 (float32x4_t __a, float32x4_t __b)
@@ -3659,27 +3434,6 @@ __arm_vornq_f32 (float32x4_t __a, float32x4_t __b)
   return __builtin_mve_vornq_fv4sf (__a, __b);
 }
 
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_f32 (float32x4_t __a, float32x4_t __b)
-{
-  return __builtin_mve_vbicq_fv4sf (__a, __b);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_fv4sf (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_fv8hf (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
@@ -3924,20 +3678,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo
   *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p);
-}
-
 __extension__ extern __inline float16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
@@ -4119,13 +3859,6 @@ __arm_vornq (uint8x16_t __a, uint8x16_t __b)
   return __arm_vornq_u8 (__a, __b);
 }
 
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (uint8x16_t __a, uint8x16_t __b)
-{
-  return __arm_vbicq_u8 (__a, __b);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq (int8x16_t __a, int8x16_t __b)
@@ -4133,13 +3866,6 @@ __arm_vornq (int8x16_t __a, int8x16_t __b)
   return __arm_vornq_s8 (__a, __b);
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (int8x16_t __a, int8x16_t __b)
-{
-  return __arm_vbicq_s8 (__a, __b);
-}
-
 __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq (uint16x8_t __a, uint16x8_t __b)
@@ -4147,13 +3873,6 @@ __arm_vornq (uint16x8_t __a, uint16x8_t __b)
   return __arm_vornq_u16 (__a, __b);
 }
 
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (uint16x8_t __a, uint16x8_t __b)
-{
-  return __arm_vbicq_u16 (__a, __b);
-}
-
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq (int16x8_t __a, int16x8_t __b)
@@ -4161,13 +3880,6 @@ __arm_vornq (int16x8_t __a, int16x8_t __b)
   return __arm_vornq_s16 (__a, __b);
 }
 
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (int16x8_t __a, int16x8_t __b)
-{
-  return __arm_vbicq_s16 (__a, __b);
-}
-
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq (uint32x4_t __a, uint32x4_t __b)
@@ -4175,13 +3887,6 @@ __arm_vornq (uint32x4_t __a, uint32x4_t __b)
   return __arm_vornq_u32 (__a, __b);
 }
 
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (uint32x4_t __a, uint32x4_t __b)
-{
-  return __arm_vbicq_u32 (__a, __b);
-}
-
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq (int32x4_t __a, int32x4_t __b)
@@ -4189,69 +3894,6 @@ __arm_vornq (int32x4_t __a, int32x4_t __b)
   return __arm_vornq_s32 (__a, __b);
 }
 
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (int32x4_t __a, int32x4_t __b)
-{
-  return __arm_vbicq_s32 (__a, __b);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (uint16x8_t __a, const int __imm)
-{
-  return __arm_vbicq_n_u16 (__a, __imm);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (int16x8_t __a, const int __imm)
-{
-  return __arm_vbicq_n_s16 (__a, __imm);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (uint32x4_t __a, const int __imm)
-{
-  return __arm_vbicq_n_u32 (__a, __imm);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (int32x4_t __a, const int __imm)
-{
-  return __arm_vbicq_n_s32 (__a, __imm);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n (int16x8_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_n_s16 (__a, __imm, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n (int32x4_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_n_s32 (__a, __imm, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n (uint16x8_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_n_u16 (__a, __imm, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n (uint32x4_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_n_u32 (__a, __imm, __p);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vshlcq (int8x16_t __a, uint32_t * __b, const int __imm)
@@ -4294,48 +3936,6 @@ __arm_vshlcq (uint32x4_t __a, uint32_t * __b, const int __imm)
   return __arm_vshlcq_u32 (__a, __b, __imm);
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_s8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_s32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_s16 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_u8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_u32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_u16 (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
@@ -5778,48 +5378,6 @@ __arm_viwdupq_x_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t
   return __arm_viwdupq_x_wb_u32 (__a, __b, __imm, __p);
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_x_s8 (__a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_x_s16 (__a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_x_s32 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_x_u8 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_x_u16 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_x_u32 (__a, __b, __p);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
@@ -6361,13 +5919,6 @@ __arm_vornq (float16x8_t __a, float16x8_t __b)
   return __arm_vornq_f16 (__a, __b);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (float16x8_t __a, float16x8_t __b)
-{
-  return __arm_vbicq_f16 (__a, __b);
-}
-
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vornq (float32x4_t __a, float32x4_t __b)
@@ -6375,27 +5926,6 @@ __arm_vornq (float32x4_t __a, float32x4_t __b)
   return __arm_vornq_f32 (__a, __b);
 }
 
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq (float32x4_t __a, float32x4_t __b)
-{
-  return __arm_vbicq_f32 (__a, __b);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_f32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_m_f16 (__inactive, __a, __b, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) @@ -6578,20 +6108,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32 __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vbicq_x_f32 (__a, __b, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vornq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -7027,22 +6543,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16_t_ptr][__ARM_mve_type_float16x8x4_t]: __arm_vst4q_f16 (__ARM_mve_coerce_f16_ptr(__p0, float16_t *), __ARM_mve_coerce(__p1, float16x8x4_t)), \ int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4x4_t]: __arm_vst4q_f32 (__ARM_mve_coerce_f32_ptr(__p0, float32_t *), __ARM_mve_coerce(__p1, float32x4x4_t)));}) -#define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int 
(*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - #define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -7055,13 +6555,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: 
__arm_vornq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) -#define __arm_vbicq_m_n(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vbicq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vbicq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) - #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ @@ -7071,19 +6564,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) -#define __arm_vbicq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int 
(*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ @@ -7387,18 +6867,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint32x4_t]: __arm_vstrwq_scatter_base_wb_p_u32 (p0, p1, __ARM_mve_coerce(__p2, uint32x4_t), p3), \ int (*)[__ARM_mve_type_float32x4_t]: __arm_vstrwq_scatter_base_wb_p_f32 (p0, p1, __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vbicq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int 
(*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ @@ -7469,20 +6937,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) -#define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int 
(*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce_i_scalar (__p1, int)), \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) - #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \ @@ -7492,24 +6946,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), 
p1, p2));}) -#define __arm_vbicq_m_n(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ - int (*)[__ARM_mve_type_int16x8_t]: __arm_vbicq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_int32x4_t]: __arm_vbicq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1, p2), \ - int (*)[__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \ - int (*)[__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));}) - -#define __arm_vbicq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ @@ -7750,16 +7186,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) -#define __arm_vbicq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vbicq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vld1q_z(p0,p1) ( _Generic( (int (*)[__ARM_mve_typeid(p0)])0, \ int (*)[__ARM_mve_type_int8_t_ptr]: __arm_vld1q_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), p1), \ int (*)[__ARM_mve_type_int16_t_ptr]: __arm_vld1q_z_s16 
(__ARM_mve_coerce_s16_ptr(p0, int16_t *), p1), \

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index cded22be1ee..2ed19ff66fc 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -803,7 +803,7 @@ (define_expand "mve_vandq_s"
 ;;
 ;; [vbicq_s, vbicq_u])
 ;;
-(define_insn "mve_vbicq_u"
+(define_insn "@mve_vbicq_u"
   [
    (set (match_operand:MVE_2 0 "s_register_operand" "=w")
	(and:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand" "w"))
@@ -815,7 +815,7 @@ (define_insn "mve_vbicq_u"
   (set_attr "type" "mve_move")
 ])

-(define_expand "mve_vbicq_s"
+(define_expand "@mve_vbicq_s"
   [
    (set (match_operand:MVE_2 0 "s_register_operand")
	(and:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand"))
@@ -1209,7 +1209,7 @@ (define_insn "mve_vandq_f"
 ;;
 ;; [vbicq_f])
 ;;
-(define_insn "mve_vbicq_f"
+(define_insn "@mve_vbicq_f"
   [
    (set (match_operand:MVE_0 0 "s_register_operand" "=w")
	(and:MVE_0 (not:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w"))

From patchwork Thu Jul 11 21:43:04 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959550
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 14/15] arm: [MVE intrinsics] factorize vorn
Date: Thu, 11 Jul 2024 21:43:04 +0000
Message-Id: <20240711214305.3193022-14-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

Factorize vorn so that it uses the parameterized names.

2024-07-11  Christophe Lyon

	gcc/
	* config/arm/iterators.md (MVE_INT_M_BINARY_LOGIC): Add VORNQ_M_S,
	VORNQ_M_U.
	(MVE_FP_M_BINARY_LOGIC): Add VORNQ_M_F.
	(mve_insn): Add VORNQ_M_S, VORNQ_M_U, VORNQ_M_F.
	* config/arm/mve.md (mve_vornq_s): Rename into ...
	(@mve_vornq_s): ... this.
	(mve_vornq_u): Rename into ...
	(@mve_vornq_u): ... this.
	(mve_vornq_f): Rename into ...
	(@mve_vornq_f): ... this.
	(mve_vornq_m_): Merge into vand/vbic pattern.
	(mve_vornq_m_f): Likewise.
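The arm_mve.h overload macros removed earlier in this series dispatch on argument type with C11 `_Generic`, using a pointer-to-array type whose bound is a per-type integer id. A minimal, self-contained sketch of that dispatch idiom follows; the `my_*` names and typeid values are hypothetical stand-ins, not the real `__ARM_mve_typeid`/`int8x16_t` machinery:

```c
#include <assert.h>

/* Hypothetical scalar stand-ins for the vector types the real header
   dispatches on (int8x16_t, uint8x16_t, ...).  */
typedef struct { signed char v; }   my_s8;
typedef struct { unsigned char v; } my_u8;

/* Like __ARM_mve_typeid: map a type to a small integer constant.  */
#define my_typeid(x) _Generic ((x), my_s8: 1, my_u8: 2)

static int orn_s8 (my_s8 a, my_s8 b) { return a.v | ~b.v; }
static int orn_u8 (my_u8 a, my_u8 b) { return a.v | ~b.v; }

/* Like the __arm_vornq macro: encode the typeid in an array bound and
   let _Generic pick the type-specific implementation.  */
#define my_orn(p0, p1) \
  _Generic ((int (*)[my_typeid (p0)]) 0, \
	    int (*)[1]: orn_s8, \
	    int (*)[2]: orn_u8) (p0, p1)
```

Encoding the id in an array bound lets one `_Generic` select on the combination of several argument types at once (the real macros use `int (*)[id0][id1]...`), which a plain `_Generic` on a single controlling expression cannot do.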
---
 gcc/config/arm/iterators.md |  3 +++
 gcc/config/arm/mve.md       | 48 ++++++------------------------------
 2 files changed, 10 insertions(+), 41 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 162c0d56bfb..3a1825ebab2 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -444,6 +444,7 @@ (define_int_iterator MVE_INT_M_BINARY_LOGIC [
		     VANDQ_M_S VANDQ_M_U
		     VBICQ_M_S VBICQ_M_U
		     VEORQ_M_S VEORQ_M_U
+		     VORNQ_M_S VORNQ_M_U
		     VORRQ_M_S VORRQ_M_U
		     ])
@@ -594,6 +595,7 @@ (define_int_iterator MVE_FP_M_BINARY_LOGIC [
		     VANDQ_M_F
		     VBICQ_M_F
		     VEORQ_M_F
+		     VORNQ_M_F
		     VORRQ_M_F
		     ])
@@ -1094,6 +1096,7 @@ (define_int_attr mve_insn [
		 (VMVNQ_N_S "vmvn") (VMVNQ_N_U "vmvn")
		 (VNEGQ_M_F "vneg")
		 (VNEGQ_M_S "vneg")
+		 (VORNQ_M_S "vorn") (VORNQ_M_U "vorn") (VORNQ_M_F "vorn")
		 (VORRQ_M_N_S "vorr") (VORRQ_M_N_U "vorr")
		 (VORRQ_M_S "vorr") (VORRQ_M_U "vorr") (VORRQ_M_F "vorr")
		 (VORRQ_N_S "vorr") (VORRQ_N_U "vorr")

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 2ed19ff66fc..982f92d92d8 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1021,9 +1021,9 @@ (define_insn "mve_q"
 ])

 ;;
-;; [vornq_u, vornq_s])
+;; [vornq_u, vornq_s]
 ;;
-(define_insn "mve_vornq_s"
+(define_insn "@mve_vornq_s"
   [
    (set (match_operand:MVE_2 0 "s_register_operand" "=w")
	(ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand" "w"))
@@ -1035,7 +1035,7 @@ (define_insn "mve_vornq_s"
   (set_attr "type" "mve_move")
 ])

-(define_expand "mve_vornq_u"
+(define_expand "@mve_vornq_u"
   [
    (set (match_operand:MVE_2 0 "s_register_operand")
	(ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand"))
@@ -1429,9 +1429,9 @@ (define_insn "mve_q_f"
 ])

 ;;
-;; [vornq_f])
+;; [vornq_f]
 ;;
-(define_insn "mve_vornq_f"
+(define_insn "@mve_vornq_f"
   [
    (set (match_operand:MVE_0 0 "s_register_operand" "=w")
	(ior:MVE_0 (not:MVE_0 (match_operand:MVE_0 2 "s_register_operand" "w"))
@@ -2710,6 +2710,7 @@ (define_insn "@mve_q_m_"
 ;; [vandq_m_u, vandq_m_s]
 ;;
[vbicq_m_u, vbicq_m_s]
 ;; [veorq_m_u, veorq_m_s]
+;; [vornq_m_u, vornq_m_s]
 ;; [vorrq_m_u, vorrq_m_s]
 ;;
 (define_insn "@mve_q_m_"
@@ -2836,24 +2837,6 @@ (define_insn "@mve_q_int_m_"
   (set_attr "type" "mve_move")
   (set_attr "length""8")])

-;;
-;; [vornq_m_u, vornq_m_s])
-;;
-(define_insn "mve_vornq_m_"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "=w")
-	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
-		       (match_operand:MVE_2 2 "s_register_operand" "w")
-		       (match_operand:MVE_2 3 "s_register_operand" "w")
-		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VORNQ_M))
-  ]
-  "TARGET_HAVE_MVE"
-  "vpst\;vornt\t%q0, %q2, %q3"
-  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vornq_"))
-   (set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
 ;;
 ;; [vqshlq_m_n_s, vqshlq_m_n_u]
 ;; [vshlq_m_n_s, vshlq_m_n_u]
@@ -3108,6 +3091,7 @@ (define_insn "@mve_q_m_n_f"
 ;; [vandq_m_f]
 ;; [vbicq_m_f]
 ;; [veorq_m_f]
+;; [vornq_m_f]
 ;; [vorrq_m_f]
 ;;
 (define_insn "@mve_q_m_f"
@@ -3187,24 +3171,6 @@ (define_insn "@mve_q_m_f"
   (set_attr "type" "mve_move")
   (set_attr "length""8")])

-;;
-;; [vornq_m_f])
-;;
-(define_insn "mve_vornq_m_f"
-  [
-   (set (match_operand:MVE_0 0 "s_register_operand" "=w")
-	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
-		       (match_operand:MVE_0 2 "s_register_operand" "w")
-		       (match_operand:MVE_0 3 "s_register_operand" "w")
-		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VORNQ_M_F))
-  ]
-  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vornt\t%q0, %q2, %q3"
-  [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vornq_f"))
-   (set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
 ;;
 ;; [vstrbq_s vstrbq_u]
 ;;

From patchwork Thu Jul 11 21:43:05 2024
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959553
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH 15/15] arm: [MVE intrinsics] rework vorn
Date: Thu, 11 Jul 2024 21:43:05 +0000
Message-Id: <20240711214305.3193022-15-christophe.lyon@linaro.org>
In-Reply-To: <20240711214305.3193022-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>

Implement vorn using the new MVE builtins framework.

2024-07-11  Christophe Lyon

	gcc/
	* config/arm/arm-mve-builtins-base.cc (vornq): New.
	* config/arm/arm-mve-builtins-base.def (vornq): New.
	* config/arm/arm-mve-builtins-base.h (vornq): New.
	* config/arm/arm-mve-builtins-functions.h
	(class unspec_based_mve_function_exact_insn_vorn): New.
	* config/arm/arm_mve.h (vornq): Delete.
	(vornq_m): Delete.
	(vornq_x): Delete.
	(vornq_u8): Delete.
	(vornq_s8): Delete.
	(vornq_u16): Delete.
	(vornq_s16): Delete.
	(vornq_u32): Delete.
	(vornq_s32): Delete.
	(vornq_f16): Delete.
	(vornq_f32): Delete.
	(vornq_m_s8): Delete.
	(vornq_m_s32): Delete.
	(vornq_m_s16): Delete.
	(vornq_m_u8): Delete.
	(vornq_m_u32): Delete.
	(vornq_m_u16): Delete.
	(vornq_m_f32): Delete.
	(vornq_m_f16): Delete.
	(vornq_x_s8): Delete.
	(vornq_x_s16): Delete.
	(vornq_x_s32): Delete.
	(vornq_x_u8): Delete.
	(vornq_x_u16): Delete.
	(vornq_x_u32): Delete.
	(vornq_x_f16): Delete.
	(vornq_x_f32): Delete.
	(__arm_vornq_u8): Delete.
	(__arm_vornq_s8): Delete.
	(__arm_vornq_u16): Delete.
	(__arm_vornq_s16): Delete.
	(__arm_vornq_u32): Delete.
	(__arm_vornq_s32): Delete.
	(__arm_vornq_m_s8): Delete.
	(__arm_vornq_m_s32): Delete.
	(__arm_vornq_m_s16): Delete.
	(__arm_vornq_m_u8): Delete.
	(__arm_vornq_m_u32): Delete.
	(__arm_vornq_m_u16): Delete.
	(__arm_vornq_x_s8): Delete.
	(__arm_vornq_x_s16): Delete.
	(__arm_vornq_x_s32): Delete.
	(__arm_vornq_x_u8): Delete.
	(__arm_vornq_x_u16): Delete.
	(__arm_vornq_x_u32): Delete.
	(__arm_vornq_f16): Delete.
	(__arm_vornq_f32): Delete.
	(__arm_vornq_m_f32): Delete.
	(__arm_vornq_m_f16): Delete.
	(__arm_vornq_x_f16): Delete.
	(__arm_vornq_x_f32): Delete.
	(__arm_vornq): Delete.
	(__arm_vornq_m): Delete.
	(__arm_vornq_x): Delete.
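For reference, the mve.md pattern for vornq is `(ior (not op2) op1)`, i.e. a lanewise OR of the first operand with the complement of the second, and the predicated `vornq_m` forms replace inactive lanes with the `__inactive` argument. A plain-C sketch of those lanewise semantics (hypothetical helper names, and one predicate bit per 32-bit lane as a simplification of MVE's per-byte `mve_pred16_t` predication):

```c
#include <assert.h>
#include <stdint.h>

/* Lanewise a | ~b: the semantics of vornq on a u32x4 vector.  */
static void
orn_u32x4 (uint32_t out[4], const uint32_t a[4], const uint32_t b[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = a[i] | ~b[i];
}

/* Predicated form, like vornq_m: lanes whose predicate bit is clear
   keep the value from INACTIVE instead of computing a | ~b.  */
static void
orn_m_u32x4 (uint32_t out[4], const uint32_t inactive[4],
	     const uint32_t a[4], const uint32_t b[4], unsigned pred)
{
  for (int i = 0; i < 4; i++)
    out[i] = ((pred >> i) & 1) ? (a[i] | ~b[i]) : inactive[i];
}
```

The `_x` (don't-care) variants deleted above are the same predicated operation with an uninitialized vector as `__inactive`, which is exactly how the old `__arm_vornq_x_*` inlines were written.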
--- gcc/config/arm/arm-mve-builtins-base.cc | 1 + gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 1 + gcc/config/arm/arm-mve-builtins-functions.h | 53 +++ gcc/config/arm/arm_mve.h | 431 -------------------- 5 files changed, 57 insertions(+), 431 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index e33603ec1f3..f8260f5f483 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -568,6 +568,7 @@ FUNCTION_WITH_RTX_M_N (vmulq, MULT, VMULQ) FUNCTION_WITH_RTX_M_N_NO_F (vmvnq, NOT, VMVNQ) FUNCTION (vnegq, unspec_based_mve_function_exact_insn, (NEG, NEG, NEG, -1, -1, -1, VNEGQ_M_S, -1, VNEGQ_M_F, -1, -1, -1)) FUNCTION_WITHOUT_M_N (vpselq, VPSELQ) +FUNCTION (vornq, unspec_based_mve_function_exact_insn_vorn, (-1, -1, VORNQ_M_S, VORNQ_M_U, VORNQ_M_F, -1, -1)) FUNCTION_WITH_RTX_M_N_NO_N_F (vorrq, IOR, VORRQ) FUNCTION_WITHOUT_N_NO_U_F (vqabsq, VQABSQ) FUNCTION_WITH_M_N_NO_F (vqaddq, VQADDQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index a1b3eea32d3..3595dab36df 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -87,6 +87,7 @@ DEF_MVE_FUNCTION (vmulltq_poly, binary_widen_poly, poly_8_16, mx_or_none) DEF_MVE_FUNCTION (vmulq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vmvnq, mvn, all_integer, mx_or_none) DEF_MVE_FUNCTION (vnegq, unary, all_signed, mx_or_none) +DEF_MVE_FUNCTION (vornq, binary_orrq, all_integer, mx_or_none) DEF_MVE_FUNCTION (vorrq, binary_orrq, all_integer, mx_or_none) DEF_MVE_FUNCTION (vpselq, vpsel, all_integer_with_64, none) DEF_MVE_FUNCTION (vqabsq, unary, all_signed, m_or_none) @@ -206,6 +207,7 @@ DEF_MVE_FUNCTION (vminnmq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vminnmvq, binary_maxvminv, all_float, p_or_none) DEF_MVE_FUNCTION (vmulq, binary_opt_n, all_float, mx_or_none) 
DEF_MVE_FUNCTION (vnegq, unary, all_float, mx_or_none)
+DEF_MVE_FUNCTION (vornq, binary_orrq, all_float, mx_or_none)
 DEF_MVE_FUNCTION (vorrq, binary_orrq, all_float, mx_or_none)
 DEF_MVE_FUNCTION (vpselq, vpsel, all_float, none)
 DEF_MVE_FUNCTION (vreinterpretq, unary_convert, reinterpret_float, none)

diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index 2abe640b840..39e39c307bd 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -118,6 +118,7 @@ extern const function_base *const vmulltq_poly;
 extern const function_base *const vmulq;
 extern const function_base *const vmvnq;
 extern const function_base *const vnegq;
+extern const function_base *const vornq;
 extern const function_base *const vorrq;
 extern const function_base *const vpselq;
 extern const function_base *const vqabsq;

diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h
index a06c91b3a45..a2e989f65ea 100644
--- a/gcc/config/arm/arm-mve-builtins-functions.h
+++ b/gcc/config/arm/arm-mve-builtins-functions.h
@@ -509,6 +509,59 @@ public:
   }
 };

+/* Map the function directly to CODE (UNSPEC, M) for vorn-like
+   builtins.  The difference with unspec_based_mve_function_exact_insn
+   is that this function has vorn hardcoded for the PRED_none,
+   MODE_none version, rather than using an RTX.  */
+class unspec_based_mve_function_exact_insn_vorn : public unspec_based_mve_function_base
+{
+public:
+  CONSTEXPR unspec_based_mve_function_exact_insn_vorn (int unspec_for_n_sint,
+						       int unspec_for_n_uint,
+						       int unspec_for_m_sint,
+						       int unspec_for_m_uint,
+						       int unspec_for_m_fp,
+						       int unspec_for_m_n_sint,
+						       int unspec_for_m_n_uint)
+    : unspec_based_mve_function_base (UNKNOWN,
+				      UNKNOWN,
+				      UNKNOWN,
+				      -1, -1, -1, /* No non-predicated, no mode unspec intrinsics.
*/ + unspec_for_n_sint, + unspec_for_n_uint, + -1, + unspec_for_m_sint, + unspec_for_m_uint, + unspec_for_m_fp, + unspec_for_m_n_sint, + unspec_for_m_n_uint, + -1) + {} + + rtx + expand (function_expander &e) const override + { + machine_mode mode = e.vector_mode (0); + insn_code code; + + /* No suffix, no predicate, use the right RTX code. */ + if (e.pred == PRED_none + && e.mode_suffix_id == MODE_none) + { + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_vornq_u (mode); + else + code = code_for_mve_vornq_s (mode); + else + code = code_for_mve_vornq_f (mode); + return e.use_exact_insn (code); + } + + return expand_unspec (e); + } +}; + /* Map the comparison functions. */ class unspec_based_mve_function_exact_insn_vcmp : public unspec_based_mve_function_base { diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 3fd6980a58d..7aa61103a7d 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -42,9 +42,7 @@ #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE #define vst4q(__addr, __value) __arm_vst4q(__addr, __value) -#define vornq(__a, __b) __arm_vornq(__a, __b) #define vshlcq(__a, __b, __imm) __arm_vshlcq(__a, __b, __imm) -#define vornq_m(__inactive, __a, __b, __p) __arm_vornq_m(__inactive, __a, __b, __p) #define vstrbq_scatter_offset(__base, __offset, __value) __arm_vstrbq_scatter_offset(__base, __offset, __value) #define vstrbq(__addr, __value) __arm_vstrbq(__addr, __value) #define vstrwq_scatter_base(__addr, __offset, __value) __arm_vstrwq_scatter_base(__addr, __offset, __value) @@ -116,7 +114,6 @@ #define viwdupq_x_u8(__a, __b, __imm, __p) __arm_viwdupq_x_u8(__a, __b, __imm, __p) #define viwdupq_x_u16(__a, __b, __imm, __p) __arm_viwdupq_x_u16(__a, __b, __imm, __p) #define viwdupq_x_u32(__a, __b, __imm, __p) __arm_viwdupq_x_u32(__a, __b, __imm, __p) -#define vornq_x(__a, __b, __p) __arm_vornq_x(__a, __b, __p) #define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out) #define 
vadciq_m(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m(__inactive, __a, __b, __carry_out, __p) #define vadcq(__a, __b, __carry) __arm_vadcq(__a, __b, __carry) @@ -148,14 +145,6 @@ #define vctp64q(__a) __arm_vctp64q(__a) #define vctp8q(__a) __arm_vctp8q(__a) #define vpnot(__a) __arm_vpnot(__a) -#define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b) -#define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b) -#define vornq_u16(__a, __b) __arm_vornq_u16(__a, __b) -#define vornq_s16(__a, __b) __arm_vornq_s16(__a, __b) -#define vornq_u32(__a, __b) __arm_vornq_u32(__a, __b) -#define vornq_s32(__a, __b) __arm_vornq_s32(__a, __b) -#define vornq_f16(__a, __b) __arm_vornq_f16(__a, __b) -#define vornq_f32(__a, __b) __arm_vornq_f32(__a, __b) #define vctp8q_m(__a, __p) __arm_vctp8q_m(__a, __p) #define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p) #define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p) @@ -166,14 +155,6 @@ #define vshlcq_u16(__a, __b, __imm) __arm_vshlcq_u16(__a, __b, __imm) #define vshlcq_s32(__a, __b, __imm) __arm_vshlcq_s32(__a, __b, __imm) #define vshlcq_u32(__a, __b, __imm) __arm_vshlcq_u32(__a, __b, __imm) -#define vornq_m_s8(__inactive, __a, __b, __p) __arm_vornq_m_s8(__inactive, __a, __b, __p) -#define vornq_m_s32(__inactive, __a, __b, __p) __arm_vornq_m_s32(__inactive, __a, __b, __p) -#define vornq_m_s16(__inactive, __a, __b, __p) __arm_vornq_m_s16(__inactive, __a, __b, __p) -#define vornq_m_u8(__inactive, __a, __b, __p) __arm_vornq_m_u8(__inactive, __a, __b, __p) -#define vornq_m_u32(__inactive, __a, __b, __p) __arm_vornq_m_u32(__inactive, __a, __b, __p) -#define vornq_m_u16(__inactive, __a, __b, __p) __arm_vornq_m_u16(__inactive, __a, __b, __p) -#define vornq_m_f32(__inactive, __a, __b, __p) __arm_vornq_m_f32(__inactive, __a, __b, __p) -#define vornq_m_f16(__inactive, __a, __b, __p) __arm_vornq_m_f16(__inactive, __a, __b, __p) #define vstrbq_s8( __addr, __value) __arm_vstrbq_s8( __addr, __value) #define vstrbq_u8( __addr, __value) __arm_vstrbq_u8( 
__addr, __value) #define vstrbq_u16( __addr, __value) __arm_vstrbq_u16( __addr, __value) @@ -456,14 +437,6 @@ #define viwdupq_x_wb_u8(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u8(__a, __b, __imm, __p) #define viwdupq_x_wb_u16(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u16(__a, __b, __imm, __p) #define viwdupq_x_wb_u32(__a, __b, __imm, __p) __arm_viwdupq_x_wb_u32(__a, __b, __imm, __p) -#define vornq_x_s8(__a, __b, __p) __arm_vornq_x_s8(__a, __b, __p) -#define vornq_x_s16(__a, __b, __p) __arm_vornq_x_s16(__a, __b, __p) -#define vornq_x_s32(__a, __b, __p) __arm_vornq_x_s32(__a, __b, __p) -#define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) -#define vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) -#define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vornq_x_f16(__a, __b, __p) __arm_vornq_x_f16(__a, __b, __p) -#define vornq_x_f32(__a, __b, __p) __arm_vornq_x_f32(__a, __b, __p) #define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) #define vadciq_u32(__a, __b, __carry_out) __arm_vadciq_u32(__a, __b, __carry_out) #define vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) @@ -665,48 +638,6 @@ __arm_vpnot (mve_pred16_t __a) return __builtin_mve_vpnotv16bi (__a); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_u8 (uint8x16_t __a, uint8x16_t __b) -{ - return __builtin_mve_vornq_uv16qi (__a, __b); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vornq_sv16qi (__a, __b); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vornq_u16 (uint16x8_t __a, uint16x8_t __b) -{ - return __builtin_mve_vornq_uv8hi (__a, __b); -} - -__extension__ extern __inline int16x8_t 
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_s16 (int16x8_t __a, int16x8_t __b)
-{
-  return __builtin_mve_vornq_sv8hi (__a, __b);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_u32 (uint32x4_t __a, uint32x4_t __b)
-{
-  return __builtin_mve_vornq_uv4si (__a, __b);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_s32 (int32x4_t __a, int32x4_t __b)
-{
-  return __builtin_mve_vornq_sv4si (__a, __b);
-}
-
 __extension__ extern __inline mve_pred16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vctp8q_m (uint32_t __a, mve_pred16_t __p)
@@ -789,48 +720,6 @@ __arm_vshlcq_u32 (uint32x4_t __a, uint32_t * __b, const int __imm)
   return __res;
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv16qi (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv4si (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv8hi (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv16qi (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv4si (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv8hi (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrbq_scatter_offset_s8 (int8_t * __base, uint8x16_t __offset, int8x16_t __value)
@@ -2658,48 +2547,6 @@ __arm_viwdupq_x_wb_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16
   return __res;
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
-}
-
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out)
@@ -3420,34 +3267,6 @@ __arm_vst4q_f32 (float32_t * __addr, float32x4x4_t __value)
   __builtin_mve_vst4qv4sf (__addr, __rv.__o);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_f16 (float16x8_t __a, float16x8_t __b)
-{
-  return __builtin_mve_vornq_fv8hf (__a, __b);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_f32 (float32x4_t __a, float32x4_t __b)
-{
-  return __builtin_mve_vornq_fv4sf (__a, __b);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_fv4sf (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_fv8hf (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_f32 (float32_t const * __base)
@@ -3678,20 +3497,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo
   *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p);
-}
-
 __extension__ extern __inline float16x8x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vld4q_f16 (float16_t const * __addr)
@@ -3852,48 +3657,6 @@ __arm_vst4q (uint32_t * __addr, uint32x4x4_t __value)
   __arm_vst4q_u32 (__addr, __value);
 }
 
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (uint8x16_t __a, uint8x16_t __b)
-{
-  return __arm_vornq_u8 (__a, __b);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (int8x16_t __a, int8x16_t __b)
-{
-  return __arm_vornq_s8 (__a, __b);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (uint16x8_t __a, uint16x8_t __b)
-{
-  return __arm_vornq_u16 (__a, __b);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (int16x8_t __a, int16x8_t __b)
-{
-  return __arm_vornq_s16 (__a, __b);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (uint32x4_t __a, uint32x4_t __b)
-{
-  return __arm_vornq_u32 (__a, __b);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (int32x4_t __a, int32x4_t __b)
-{
-  return __arm_vornq_s32 (__a, __b);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vshlcq (int8x16_t __a, uint32_t * __b, const int __imm)
@@ -3936,48 +3699,6 @@ __arm_vshlcq (uint32x4_t __a, uint32_t * __b, const int __imm)
   return __arm_vshlcq_u32 (__a, __b, __imm);
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_s8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_s32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_s16 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_u8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_u32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_u16 (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vstrbq_scatter_offset (int8_t * __base, uint8x16_t __offset, int8x16_t __value)
@@ -5378,48 +5099,6 @@ __arm_viwdupq_x_u32 (uint32_t *__a, uint32_t __b, const int __imm, mve_pred16_t
   return __arm_viwdupq_x_wb_u32 (__a, __b, __imm, __p);
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_s8 (__a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_s16 (__a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_s32 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_u8 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_u16 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_u32 (__a, __b, __p);
-}
-
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vadciq (int32x4_t __a, int32x4_t __b, unsigned * __carry_out)
@@ -5912,34 +5591,6 @@ __arm_vst4q (float32_t * __addr, float32x4x4_t __value)
   __arm_vst4q_f32 (__addr, __value);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (float16x8_t __a, float16x8_t __b)
-{
-  return __arm_vornq_f16 (__a, __b);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq (float32x4_t __a, float32x4_t __b)
-{
-  return __arm_vornq_f32 (__a, __b);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_f32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_m_f16 (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline float16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrhq_gather_offset (float16_t const * __base, uint16x8_t __offset)
@@ -6108,20 +5759,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32
   __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_f16 (__a, __b, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vornq_x_f32 (__a, __b, __p);
-}
-
 __extension__ extern __inline float16x8x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vld4q (float16_t const * __addr)
@@ -6543,18 +6180,6 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_float16_t_ptr][__ARM_mve_type_float16x8x4_t]: __arm_vst4q_f16 (__ARM_mve_coerce_f16_ptr(__p0, float16_t *), __ARM_mve_coerce(__p1, float16x8x4_t)), \
   int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4x4_t]: __arm_vst4q_f32 (__ARM_mve_coerce_f32_ptr(__p0, float32_t *), __ARM_mve_coerce(__p1, float32x4x4_t)));})
 
-#define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
-  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \
-  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));})
-
 #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
   _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
   int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \
@@ -6564,19 +6189,6 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \
   int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));})
 
-#define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
-
 #define __arm_vld1q_z(p0,p1) ( \
   _Generic( (int (*)[__ARM_mve_typeid(p0)])0, \
   int (*)[__ARM_mve_type_int8_t_ptr]: __arm_vld1q_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), p1), \
@@ -6867,18 +6479,6 @@ extern void *__ARM_undef;
  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vstrwq_scatter_base_wb_p_u32 (p0, p1, __ARM_mve_coerce(__p2, uint32x4_t), p3), \
  int (*)[__ARM_mve_type_float32x4_t]: __arm_vstrwq_scatter_base_wb_p_f32 (p0, p1, __ARM_mve_coerce(__p2, float32x4_t), p3));})
 
-#define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
-
 #define __arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \
   _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
   int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1), \
@@ -6927,16 +6527,6 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint16_t_ptr][__ARM_mve_type_uint16x8x4_t]: __arm_vst4q_u16 (__ARM_mve_coerce_u16_ptr(p0, uint16_t *), __ARM_mve_coerce(__p1, uint16x8x4_t)), \
   int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4x4_t]: __arm_vst4q_u32 (__ARM_mve_coerce_u32_ptr(p0, uint32_t *), __ARM_mve_coerce(__p1, uint32x4x4_t)));})
 
-#define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
-  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \
-  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));})
-
 #define __arm_vshlcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
   _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
   int (*)[__ARM_mve_type_int8x16_t]: __arm_vshlcq_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1, p2), \
@@ -6946,17 +6536,6 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshlcq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1, p2), \
   int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshlcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1, p2));})
 
-#define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));})
-
 #define __arm_vstrwq_scatter_base(p0,p1,p2) ({ __typeof(p2) __p2 = (p2); \
   _Generic( (int (*)[__ARM_mve_typeid(__p2)])0, \
   int (*)[__ARM_mve_type_int32x4_t]: __arm_vstrwq_scatter_base_s32(p0, p1, __ARM_mve_coerce(__p2, int32x4_t)), \
@@ -7176,16 +6755,6 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint32x4_t]: __arm_vuninitializedq_u32 (), \
   int (*)[__ARM_mve_type_uint64x2_t]: __arm_vuninitializedq_u64 ());})
 
-#define __arm_vornq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vornq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vornq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vornq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vornq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vornq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vornq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));})
-
 #define __arm_vld1q_z(p0,p1) ( _Generic( (int (*)[__ARM_mve_typeid(p0)])0, \
   int (*)[__ARM_mve_type_int8_t_ptr]: __arm_vld1q_z_s8 (__ARM_mve_coerce_s8_ptr(p0, int8_t *), p1), \
   int (*)[__ARM_mve_type_int16_t_ptr]: __arm_vld1q_z_s16 (__ARM_mve_coerce_s16_ptr(p0, int16_t *), p1), \