From patchwork Wed Sep 4 13:26:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1980818
From: Christophe Lyon
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com, ramanara@nvidia.com
Cc: Christophe Lyon
Subject: [PATCH v2 24/36] arm: [MVE intrinsics] add vidwdup shape
Date: Wed, 4 Sep 2024 13:26:38 +0000
Message-Id: <20240904132650.2720446-25-christophe.lyon@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240904132650.2720446-1-christophe.lyon@linaro.org>
References: <20240711214305.3193022-1-christophe.lyon@linaro.org>
 <20240904132650.2720446-1-christophe.lyon@linaro.org>
List-Id: Gcc-patches mailing list
Errors-To:
 gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

This patch adds the vidwdup shape description for vdwdup and viwdup.
It is very similar to viddup, but accounts for the additional 'wrap'
scalar parameter.

2024-08-21  Christophe Lyon

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (vidwdup): New.
	* config/arm/arm-mve-builtins-shapes.h (vidwdup): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 88 +++++++++++++++++++++++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 89 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index a1d2e243128..510f15ae73a 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -2291,6 +2291,94 @@ struct viddup_def : public overloaded_base<0>
 };
 SHAPE (viddup)
 
+/* <T0>_t vfoo[_n]_t0(uint32_t, uint32_t, const int)
+   <T0>_t vfoo[_wb]_t0(uint32_t *, uint32_t, const int)
+
+   Shape for vector increment or decrement with wrap and duplicate operations
+   that take an integer or pointer to integer first argument, an integer second
+   argument and an immediate, and produce a vector.
+
+   Check that 'imm' is one of 1, 2, 4 or 8.
+
+   Example: vdwdupq.
+   uint8x16_t [__arm_]vdwdupq[_n]_u8(uint32_t a, uint32_t b, const int imm)
+   uint8x16_t [__arm_]vdwdupq[_wb]_u8(uint32_t *a, uint32_t b, const int imm)
+   uint8x16_t [__arm_]vdwdupq_m[_n_u8](uint8x16_t inactive, uint32_t a, uint32_t b, const int imm, mve_pred16_t p)
+   uint8x16_t [__arm_]vdwdupq_m[_wb_u8](uint8x16_t inactive, uint32_t *a, uint32_t b, const int imm, mve_pred16_t p)
+   uint8x16_t [__arm_]vdwdupq_x[_n]_u8(uint32_t a, uint32_t b, const int imm, mve_pred16_t p)
+   uint8x16_t [__arm_]vdwdupq_x[_wb]_u8(uint32_t *a, uint32_t b, const int imm, mve_pred16_t p)  */
+struct vidwdup_def : public overloaded_base<0>
+{
+  bool
+  explicit_type_suffix_p (unsigned int i, enum predication_index pred,
+                          enum mode_suffix_index,
+                          type_suffix_info) const override
+  {
+    return ((i == 0) && (pred != PRED_m));
+  }
+
+  bool
+  skip_overload_p (enum predication_index, enum mode_suffix_index mode) const override
+  {
+    /* For MODE_wb, share the overloaded instance with MODE_n.  */
+    if (mode == MODE_wb)
+      return true;
+
+    return false;
+  }
+
+  void
+  build (function_builder &b, const function_group_info &group,
+         bool preserve_user_namespace) const override
+  {
+    b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+    build_all (b, "v0,su32,su32,su64", group, MODE_n, preserve_user_namespace);
+    build_all (b, "v0,as,su32,su64", group, MODE_wb, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    unsigned int i, nargs;
+    type_suffix_index type_suffix = NUM_TYPE_SUFFIXES;
+    if (!r.check_gp_argument (3, i, nargs))
+      return error_mark_node;
+
+    type_suffix = r.type_suffix_ids[0];
+    /* With PRED_m, there is no type suffix, so infer it from the first (inactive)
+       argument.  */
+    if (type_suffix == NUM_TYPE_SUFFIXES)
+      type_suffix = r.infer_vector_type (0);
+
+    unsigned int last_arg = i - 2;
+    /* Check that last_arg is either scalar or pointer.  */
+    if (!r.scalar_argument_p (last_arg))
+      return error_mark_node;
+
+    if (!r.scalar_argument_p (last_arg + 1))
+      return error_mark_node;
+
+    if (!r.require_integer_immediate (last_arg + 2))
+      return error_mark_node;
+
+    /* With MODE_n we expect a scalar, with MODE_wb we expect a pointer.  */
+    mode_suffix_index mode_suffix;
+    if (POINTER_TYPE_P (r.get_argument_type (last_arg)))
+      mode_suffix = MODE_wb;
+    else
+      mode_suffix = MODE_n;
+
+    return r.resolve_to (mode_suffix, type_suffix);
+  }
+
+  bool
+  check (function_checker &c) const override
+  {
+    return c.require_immediate_one_of (2, 1, 2, 4, 8);
+  }
+};
+SHAPE (vidwdup)
+
 /* <T0>_t vfoo[_t0](<T0>_t, <T0>_t, mve_pred16_t)
 
    i.e. a version of the standard ternary shape in which
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 186287c1620..b3d08ab3866 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -83,6 +83,7 @@ namespace arm_mve
     extern const function_shape *const vcvt_f32_f16;
     extern const function_shape *const vcvtx;
     extern const function_shape *const viddup;
+    extern const function_shape *const vidwdup;
     extern const function_shape *const vpsel;
   } /* end namespace arm_mve::shapes */
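
For readers unfamiliar with the shape machinery, the mode selection done by
resolve () and the immediate check done by check () can be illustrated with a
small stand-alone model.  This is only a sketch and not GCC code: Mode,
valid_step, pick_mode and resolve_name are hypothetical names invented for
the illustration.  It also folds the immediate check into name resolution
for brevity, whereas in the patch a bad 'imm' is diagnosed separately by
check ().

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the two mode suffixes the shape distinguishes.
enum class Mode { n, wb };

// Mirrors the shape's rule that 'imm' must be one of 1, 2, 4 or 8.
constexpr bool valid_step (int imm)
{
  return imm == 1 || imm == 2 || imm == 4 || imm == 8;
}

// A pointer first argument selects the _wb form, a scalar selects _n.
template <typename First>
constexpr Mode pick_mode ()
{
  return std::is_pointer<First>::value ? Mode::wb : Mode::n;
}

// Return the full intrinsic name an overloaded call would map to, or an
// empty string when 'imm' is rejected.  The 'wrap' argument is always a
// scalar and plays no part in choosing the mode.
template <typename First>
std::string resolve_name (const std::string &base,
                          const std::string &type_suffix,
                          First /* a */, std::uint32_t /* wrap */, int imm)
{
  if (!valid_step (imm))
    return "";
  const char *mode = pick_mode<First> () == Mode::wb ? "_wb" : "_n";
  return base + mode + "_" + type_suffix;
}
```

In this model, resolve_name ("vdwdupq", "u8", offset, 16u, 2) with a
uint32_t offset yields "vdwdupq_n_u8", while passing &offset instead yields
"vdwdupq_wb_u8", matching the [_n] and [_wb] variants listed in the shape
comment.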