From patchwork Sat Jun  8 03:05:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pengxuan Zheng <quic_pzheng@quicinc.com>
X-Patchwork-Id: 1945387
Return-Path: <gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
	dkim=pass (2048-bit key;
 unprotected) header.d=quicinc.com header.i=@quicinc.com header.a=rsa-sha256
 header.s=qcppdkim1 header.b=WOJOTJiv;
	dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org
 [IPv6:2620:52:3:1:0:246e:9693:128c])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4Vx2yw2pVTz20Py
	for <incoming@patchwork.ozlabs.org>; Sat,  8 Jun 2024 13:06:30 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 115093858D39
	for <incoming@patchwork.ozlabs.org>; Sat,  8 Jun 2024 03:06:27 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com
 [205.220.180.131])
 by sourceware.org (Postfix) with ESMTPS id 649EC3858D26
 for <gcc-patches@gcc.gnu.org>; Sat,  8 Jun 2024 03:06:07 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 649EC3858D26
Authentication-Results: sourceware.org;
 dmarc=pass (p=none dis=none) header.from=quicinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=quicinc.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 649EC3858D26
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=205.220.180.131
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1717815969; cv=none;
 b=vNTMNhZ7yLaNbYDD0SLhiGamKykrEfK5QE64oQmNafcKKgWyG4nYj2S27aVVmDUXAEY25bd+YwR87T2RfGjotraT1uxKv/NT20aveg6he4CXgYj4BDE5bLPYXHdIJnbxkIIpgEZcJTXWB72oNgnoSTb6UKdnDV9wAGrweJ2oAGI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1717815969; c=relaxed/simple;
 bh=aFvUdOGcwU+EBhDZRoiq6YJPmArJGolod8K28HNJK2E=;
 h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version;
 b=iST3kliRCRJYpyLekc/KhY9/URyS1wrJgkUY4h18vgVvQYQ39U2P1ceBaJ4RX3gjN1oIP+hV2XW0NsEB0qf27GoMPuUnexLMr6tig9xltIlm0kZXpvMHXIkznf2wZhUzjZtHFSREz0LRX/5nDYGthmlb6utkeJ+X4zdJ9HNITjA=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from pps.filterd (m0279868.ppops.net [127.0.0.1])
 by mx0a-0031df01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id
 4582bYa2010485
 for <gcc-patches@gcc.gnu.org>; Sat, 8 Jun 2024 03:06:06 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=
 cc:content-type:date:from:message-id:mime-version:subject:to; s=
 qcppdkim1; bh=T8rsKhc9f3SLW3AKnzetAHVeUAKdB4bOkFaxzCASycU=; b=WO
 JOTJivNxR09khwSTMku5XAlzUG42nVrCZVw0aawVjSbx/KhAxrlCKYJikDfRMwLZ
 mD3eqyWJrd2gBbmOgVX+4s5CHPjGtnbSGG+QDfr/UNZivyJpeeRAxU9qUL3FTcTZ
 nqVTVETrzVoi/ZijLOgiqIIrXaFOC/qzOD3OFfa1w1FnTAxu8zVo1JKjPPmvogTO
 Tm+E99lhhQVOkbEDsUai3OUZiS0UmQ4NnDuENQNxfUsYdG/QMoUjk2PzyQ1K4BOc
 wSDoIT2QjznhYdQrToWe8Tr/le8oG2muLF5UQjq3suCeEQjrQAmGgJh9bbOIuCi0
 rSgZ7G6Dwg+FoVv63J1A==
Received: from nalasppmta05.qualcomm.com (Global_NAT1.qualcomm.com
 [129.46.96.20])
 by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3ymemgg112-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT)
 for <gcc-patches@gcc.gnu.org>; Sat, 08 Jun 2024 03:06:06 +0000 (GMT)
Received: from nalasex01c.na.qualcomm.com (nalasex01c.na.qualcomm.com
 [10.47.97.35])
 by NALASPPMTA05.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTPS id
 458365U5028507
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT)
 for <gcc-patches@gcc.gnu.org>; Sat, 8 Jun 2024 03:06:05 GMT
Received: from hu-pzheng-lv.qualcomm.com (10.49.16.6) by
 nalasex01c.na.qualcomm.com (10.47.97.35) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.2.1544.9; Fri, 7 Jun 2024 20:06:04 -0700
From: Pengxuan Zheng <quic_pzheng@quicinc.com>
To: <gcc-patches@gcc.gnu.org>
CC: Pengxuan Zheng <quic_pzheng@quicinc.com>
Subject: [PATCH] aarch64: Add vector floating point trunc pattern
Date: Fri, 7 Jun 2024 20:05:15 -0700
Message-ID: <20240608030515.3693-1-quic_pzheng@quicinc.com>
X-Mailer: git-send-email 2.17.1
MIME-Version: 1.0
X-Originating-IP: [10.49.16.6]
X-ClientProxiedBy: nalasex01c.na.qualcomm.com (10.47.97.35) To
 nalasex01c.na.qualcomm.com (10.47.97.35)
X-QCInternal: smtphost
X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800
 signatures=585085
X-Proofpoint-GUID: zjqPS1YYHdXyvAbGGWfg28uLdDK3ZOzH
X-Proofpoint-ORIG-GUID: zjqPS1YYHdXyvAbGGWfg28uLdDK3ZOzH
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16
 definitions=2024-06-07_16,2024-06-06_02,2024-05-17_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 bulkscore=0 mlxlogscore=999
 suspectscore=0 adultscore=0 impostorscore=0 spamscore=0 priorityscore=1501
 lowpriorityscore=0 phishscore=0 malwarescore=0 mlxscore=0 clxscore=1015
 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2405170001
 definitions=main-2406080022
X-Spam-Status: No, score=-13.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT,
 SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

This patch is a follow-up of r15-1079-g230d62a2cdd16c to add vector floating
point trunc pattern for V2DF->V2SF and V4SF->V4HF conversions by renaming the
existing aarch64_float_truncate_lo_<mode><vczle><vczbe> pattern to the standard
optab one, i.e., trunc<Vwide><mode>2<vczle><vczbe>. This allows the vectorizer
to vectorize certain floating point narrowing operations for the aarch64 target.

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc (VAR1): Remap float_truncate_lo_
	builtin codes to standard optab ones.
	* config/aarch64/aarch64-simd.md (aarch64_float_truncate_lo_<mode><vczle><vczbe>):
	Rename to...
	(trunc<Vwide><mode>2<vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/trunc-vec.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
---
 gcc/config/aarch64/aarch64-builtins.cc       |  7 +++++++
 gcc/config/aarch64/aarch64-simd.md           |  6 +++---
 gcc/testsuite/gcc.target/aarch64/trunc-vec.c | 21 ++++++++++++++++++++
 3 files changed, 31 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/trunc-vec.c

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index 25189888d17..d589e59defc 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -543,6 +543,13 @@ BUILTIN_VDQ_BHSI (uhadd, uavg, _floor, 0)
 VAR1 (float_extend_lo_, extend, v2sf, v2df)
 VAR1 (float_extend_lo_, extend, v4hf, v4sf)
 
+/* __builtin_aarch64_float_truncate_lo_<mode> should be expanded through the
+   standard optabs CODE_FOR_trunc<Vwide><mode>2. */
+constexpr insn_code CODE_FOR_aarch64_float_truncate_lo_v4hf
+    = CODE_FOR_truncv4sfv4hf2;
+constexpr insn_code CODE_FOR_aarch64_float_truncate_lo_v2sf
+    = CODE_FOR_truncv2dfv2sf2;
+
 #undef VAR1
 #define VAR1(T, N, MAP, FLAG, A) \
   {#N #A, UP (A), CF##MAP (N, A), 0, TYPES_##T, FLAG_##FLAG},
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index c5e2c9f00d0..f644bd1731e 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3197,7 +3197,7 @@ (define_expand "aarch64_float_trunc_rodd_hi_v4sf"
 }
 )
 
-(define_insn "aarch64_float_truncate_lo_<mode><vczle><vczbe>"
+(define_insn "trunc<Vwide><mode>2<vczle><vczbe>"
   [(set (match_operand:VDF 0 "register_operand" "=w")
       (float_truncate:VDF
 	(match_operand:<VWIDE> 1 "register_operand" "w")))]
@@ -3256,7 +3256,7 @@ (define_expand "vec_pack_trunc_v2df"
     int lo = BYTES_BIG_ENDIAN ? 2 : 1;
     int hi = BYTES_BIG_ENDIAN ? 1 : 2;
 
-    emit_insn (gen_aarch64_float_truncate_lo_v2sf (tmp, operands[lo]));
+    emit_insn (gen_truncv2dfv2sf2 (tmp, operands[lo]));
     emit_insn (gen_aarch64_float_truncate_hi_v4sf (operands[0],
 						   tmp, operands[hi]));
     DONE;
@@ -3272,7 +3272,7 @@ (define_expand "vec_pack_trunc_df"
   {
     rtx tmp = gen_reg_rtx (V2SFmode);
     emit_insn (gen_aarch64_vec_concatdf (tmp, operands[1], operands[2]));
-    emit_insn (gen_aarch64_float_truncate_lo_v2sf (operands[0], tmp));
+    emit_insn (gen_truncv2dfv2sf2 (operands[0], tmp));
     DONE;
   }
 )
diff --git a/gcc/testsuite/gcc.target/aarch64/trunc-vec.c b/gcc/testsuite/gcc.target/aarch64/trunc-vec.c
new file mode 100644
index 00000000000..05e8af7912d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/trunc-vec.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* { dg-final { scan-assembler-times {fcvtn\tv[0-9]+.2s, v[0-9]+.2d} 1 } } */
+void
+f (double *__restrict a, float *__restrict b)
+{
+  b[0] = a[0];
+  b[1] = a[1];
+}
+
+/* { dg-final { scan-assembler-times {fcvtn\tv[0-9]+.4h, v[0-9]+.4s} 1 } } */
+void
+f1 (float *__restrict a, _Float16 *__restrict b)
+{
+
+  b[0] = a[0];
+  b[1] = a[1];
+  b[2] = a[2];
+  b[3] = a[3];
+}