From patchwork Fri Sep 20 10:57:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Li, Pan2" <pan2.li@intel.com>
X-Patchwork-Id: 1987878
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai,
 kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com,
 Pan Li <pan2.li@intel.com>
Subject: [PATCH v1 1/2] Match: Support form 2 for vector signed integer
 .SAT_ADD
Date: Fri, 20 Sep 2024 18:57:29 +0800
Message-ID: <20240920105729.1058948-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.43.0
MIME-Version: 1.0

From: Pan Li <pan2.li@intel.com>

This patch adds support for form 2 of the vector signed integer
.SAT_ADD, as in the example below:

Form 2:
  #define DEF_VEC_SAT_S_ADD_FMT_2(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_add_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T sum = (UT)x + (UT)y;                                         \
        if ((x ^ y) < 0 || (sum ^ x) >= 0)                             \
          out[i] = sum;                                                \
        else                                                           \
          out[i] = x < 0 ? MIN : MAX;                                  \
      }                                                                \
  }

DEF_VEC_SAT_S_ADD_FMT_2(int8_t, uint8_t, INT8_MIN, INT8_MAX)
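
For reference, the loop body of form 2 can be read as the scalar helper
below.  This is only a sketch for int8_t and not part of the patch: the
names sat_s_add_int8_fmt_2, sat_s_add_int8_ref and count_fmt_2_mismatches
are made up for illustration, and the narrowing conversion back to int8_t
is assumed to wrap modulo 2^8, as GCC defines it.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar form 2 for int8_t, mirroring the loop body of the macro.
   Assumes the conversion back to int8_t wraps modulo 2^8.  */
static int8_t
sat_s_add_int8_fmt_2 (int8_t x, int8_t y)
{
  int8_t sum = (int8_t) ((uint8_t) x + (uint8_t) y);

  /* Operands of different sign cannot overflow; otherwise the wrapped
     sum must keep the sign of x.  */
  if ((x ^ y) < 0 || (sum ^ x) >= 0)
    return sum;

  return x < 0 ? INT8_MIN : INT8_MAX;
}

/* Hypothetical reference: add in a wider type, then clamp.  */
static int8_t
sat_s_add_int8_ref (int8_t x, int8_t y)
{
  int wide = (int) x + (int) y;

  if (wide > INT8_MAX)
    return INT8_MAX;
  if (wide < INT8_MIN)
    return INT8_MIN;
  return (int8_t) wide;
}

/* Count the int8_t operand pairs on which the two helpers disagree.  */
static unsigned
count_fmt_2_mismatches (void)
{
  unsigned mismatches = 0;

  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      if (sat_s_add_int8_fmt_2 ((int8_t) x, (int8_t) y)
          != sat_s_add_int8_ref ((int8_t) x, (int8_t) y))
        mismatches++;

  return mismatches;
}
```

Calling count_fmt_2_mismatches () from a small driver walks all 65536
operand pairs and is a cheap way to confirm that the form 2 condition
saturates exactly when the wide sum leaves the int8_t range.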

Before this patch:
  loop_len_79 = MIN_EXPR <ivtmp.51_53, POLY_INT_CST [16, 16]>;
  _50 = &MEM <vector([16,16]) signed char> [(int8_t *)vectp_op_1.9_77];
  vect_x_18.11_80 = .MASK_LEN_LOAD (_50, 8B, { -1, ... }, loop_len_79, 0);
  _70 = vect_x_18.11_80 >> 7;
  vect_x.12_81 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_x_18.11_80);
  _26 = (void *) ivtmp.47_20;
  _27 = &MEM <vector([16,16]) signed char> [(int8_t *)_26];
  vect_y_20.15_84 = .MASK_LEN_LOAD (_27, 8B, { -1, ... }, loop_len_79, 0);
  vect__7.21_90 = vect_x_18.11_80 ^ vect_y_20.15_84;
  mask__50.23_92 = vect__7.21_90 >= { 0, ... };
  vect_y.16_85 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_y_20.15_84);
  vect__6.17_86 = vect_x.12_81 + vect_y.16_85;
  vect_sum_21.18_87 = VIEW_CONVERT_EXPR<vector([16,16]) signed char>(vect__6.17_86);
  vect__8.19_88 = vect_x_18.11_80 ^ vect_sum_21.18_87;
  mask__45.20_89 = vect__8.19_88 < { 0, ... };
  mask__44.24_93 = mask__45.20_89 & mask__50.23_92;
  _40 = .COND_XOR (mask__44.24_93, _70, { 127, ... }, vect_sum_21.18_87);
  _60 = (void *) ivtmp.49_6;
  _61 = &MEM <vector([16,16]) signed char> [(int8_t *)_60];
  .MASK_LEN_STORE (_61, 8B, { -1, ... }, loop_len_79, 0, _40);
  vectp_op_1.9_78 = vectp_op_1.9_77 + POLY_INT_CST [16, 16];
  ivtmp.47_4 = ivtmp.47_20 + POLY_INT_CST [16, 16];
  ivtmp.49_21 = ivtmp.49_6 + POLY_INT_CST [16, 16];
  ivtmp.51_98 = ivtmp.51_53;
  ivtmp.51_8 = ivtmp.51_53 + POLY_INT_CST [18446744073709551600, 18446744073709551600];
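
Per element, the statements above amount to roughly the following scalar
code.  This is an illustrative sketch and not part of the patch: the
helper names are made up, the right shift of a negative value is assumed
to be arithmetic, and the conversion back to int8_t is assumed to wrap
(both hold for GCC).

```c
#include <assert.h>
#include <stdint.h>

/* One lane of the vectorized loop before the patch.  */
static int8_t
sat_s_add_int8_select (int8_t x, int8_t y)
{
  /* _70: 0 for x >= 0, -1 for x < 0 (arithmetic shift assumed).  */
  int8_t sign = (int8_t) (x >> 7);
  /* vect_sum_21: wrapping sum computed in the unsigned type.  */
  int8_t sum = (int8_t) ((uint8_t) x + (uint8_t) y);
  /* mask__44: the sum changed sign relative to x while x and y agree.  */
  int overflow = ((x ^ sum) < 0) & ((x ^ y) >= 0);

  /* .COND_XOR: sign ^ 127 gives 127 for x >= 0 and -128 for x < 0.  */
  return overflow ? (int8_t) (sign ^ INT8_MAX) : sum;
}

/* Hypothetical reference: add in a wider type, then clamp.  */
static int8_t
sat_s_add_int8_clamp (int8_t x, int8_t y)
{
  int wide = (int) x + (int) y;

  if (wide > INT8_MAX)
    return INT8_MAX;
  if (wide < INT8_MIN)
    return INT8_MIN;
  return (int8_t) wide;
}

/* Count the int8_t operand pairs on which the two helpers disagree.  */
static unsigned
count_select_mismatches (void)
{
  unsigned mismatches = 0;

  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      if (sat_s_add_int8_select ((int8_t) x, (int8_t) y)
          != sat_s_add_int8_clamp ((int8_t) x, (int8_t) y))
        mismatches++;

  return mismatches;
}
```

This mask-and-select shape is what the new match.pd pattern is meant to
recognize, so that the whole sequence can be replaced by one .SAT_ADD.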

After this patch:
  _103 = .SELECT_VL (ivtmp_101, POLY_INT_CST [16, 16]);
  vect_x_18.11_90 = .MASK_LEN_LOAD (vectp_op_1.9_88, 8B, { -1, ... }, _103, 0);
  vect_y_20.14_94 = .MASK_LEN_LOAD (vectp_op_2.12_92, 8B, { -1, ... }, _103, 0);
  vect_patt_49.15_95 = .SAT_ADD (vect_x_18.11_90, vect_y_20.14_94);
  .MASK_LEN_STORE (vectp_out.16_97, 8B, { -1, ... }, _103, 0, vect_patt_49.15_95);
  vectp_op_1.9_89 = vectp_op_1.9_88 + _103;
  vectp_op_2.12_93 = vectp_op_2.12_92 + _103;
  vectp_out.16_98 = vectp_out.16_97 + _103;
  ivtmp_102 = ivtmp_101 - _103;

The following test suites passed with this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

	* match.pd: Add case 5 for signed .SAT_ADD matching.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/match.pd | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index fdb59ff0d44..940292d0d49 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3251,6 +3251,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* Signed saturation add, case 5:
+   T sum = (T)((UT)X + (UT)Y);
+   SAT_S_ADD = (X ^ sum) < 0 & ~((X ^ Y) < 0) ? (-(T)(X < 0) ^ MAX) : sum;
+
+   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
+(match (signed_integer_sat_add @0 @1)
+ (cond^ (bit_and:c (lt (bit_xor @0 (nop_convert@2 (plus (nop_convert @0)
+							 (nop_convert @1))))
+		       integer_zerop)
+		   (bit_not (lt (bit_xor:c @0 @1) integer_zerop)))
+	(bit_xor:c (nop_convert (negate (nop_convert (convert
+						      (lt @0 integer_zerop)))))
+		   max_value)
+	@2)
+ (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type))))
+
 /* Unsigned saturation sub, case 1 (branch with gt):
    SAT_U_SUB = X > Y ? X - Y : 0  */
 (match (unsigned_integer_sat_sub @0 @1)