From patchwork Thu Jun 6 06:26:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1944443
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, tamar.christina@arm.com,
 richard.guenther@gmail.com, ubizjak@gmail.com, Pan Li
Subject: [PATCH v2] Vect: Support IFN SAT_SUB for unsigned vector int
Date: Thu, 6 Jun 2024 14:26:42 +0800
Message-Id: <20240606062642.2070702-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240529114548.52057-1-pan2.li@intel.com>
References: <20240529114548.52057-1-pan2.li@intel.com>
List-Id: Gcc-patches mailing list

From: Pan Li

This patch adds support for .SAT_SUB on unsigned integer vectors.
Given the example code below:

void vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = (x[i] - y[i]) & (-(uint64_t)(x[i] >= y[i]));
}

Before this patch:
void vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _77 = .SELECT_VL (ivtmp_75, POLY_INT_CST [2, 2]);
  ivtmp_56 = _77 * 8;
  vect__4.7_59 = .MASK_LEN_LOAD (vectp_x.5_57, 64B, { -1, ... }, _77, 0);
  vect__6.10_63 = .MASK_LEN_LOAD (vectp_y.8_61, 64B, { -1, ... }, _77, 0);
  mask__7.11_64 = vect__4.7_59 >= vect__6.10_63;
  _66 = .COND_SUB (mask__7.11_64, vect__4.7_59, vect__6.10_63, { 0, ... });
  .MASK_LEN_STORE (vectp_out.15_71, 64B, { -1, ...
    }, _77, 0, _66);
  vectp_x.5_58 = vectp_x.5_57 + ivtmp_56;
  vectp_y.8_62 = vectp_y.8_61 + ivtmp_56;
  vectp_out.15_72 = vectp_out.15_71 + ivtmp_56;
  ivtmp_76 = ivtmp_75 - _77;
  ...
}

After this patch:
void vec_sat_sub_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _76 = .SELECT_VL (ivtmp_74, POLY_INT_CST [2, 2]);
  ivtmp_60 = _76 * 8;
  vect__4.7_63 = .MASK_LEN_LOAD (vectp_x.5_61, 64B, { -1, ... }, _76, 0);
  vect__6.10_67 = .MASK_LEN_LOAD (vectp_y.8_65, 64B, { -1, ... }, _76, 0);
  vect_patt_37.11_68 = .SAT_SUB (vect__4.7_63, vect__6.10_67);
  .MASK_LEN_STORE (vectp_out.12_70, 64B, { -1, ... }, _76, 0, vect_patt_37.11_68);
  vectp_x.5_62 = vectp_x.5_61 + ivtmp_60;
  vectp_y.8_66 = vectp_y.8_65 + ivtmp_60;
  vectp_out.12_71 = vectp_out.12_70 + ivtmp_60;
  ivtmp_75 = ivtmp_74 - _76;
  ...
}

The following test suites pass with this patch:
* The x86 bootstrap test.
* The x86 full regression test.
* The riscv full regression tests.

gcc/ChangeLog:

        * match.pd: Add new form for vector mode recog.
        * tree-vect-patterns.cc (gimple_unsigned_integer_sat_sub): Add
        new match func decl;
        (vect_recog_build_binary_gimple_call): Extract helper func to
        build gcall with given internal_fn.
        (vect_recog_sat_sub_pattern): Add new func impl to recog .SAT_SUB.

Signed-off-by: Pan Li
---
 gcc/match.pd              | 14 +++++++
 gcc/tree-vect-patterns.cc | 85 ++++++++++++++++++++++++++++++++-------
 2 files changed, 84 insertions(+), 15 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 7c1ad428a3c..ebc60eba8dc 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3110,6 +3110,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* Unsigned saturation sub, case 3 (branchless with gt):
+   SAT_U_SUB = (X - Y) * (X > Y). */
+(match (unsigned_integer_sat_sub @0 @1)
+ (mult:c (minus @0 @1) (convert (gt @0 @1)))
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+      && types_match (type, @0, @1))))
+
+/* Unsigned saturation sub, case 4 (branchless with ge):
+   SAT_U_SUB = (X - Y) * (X >= Y). */
+(match (unsigned_integer_sat_sub @0 @1)
+ (mult:c (minus @0 @1) (convert (ge @0 @1)))
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+      && types_match (type, @0, @1))))
+
 /* x > y && x != XXX_MIN --> x > y
    x > y && x == XXX_MIN --> false . */
 (for eqne (eq ne)
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 81e8fdc9122..cef901808eb 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -4488,6 +4488,32 @@ vect_recog_mult_pattern (vec_info *vinfo,
 }
 
 extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
+extern bool gimple_unsigned_integer_sat_sub (tree, tree*, tree (*)(tree));
+
+static gcall *
+vect_recog_build_binary_gimple_call (vec_info *vinfo, gimple *stmt,
+                                     internal_fn fn, tree *type_out,
+                                     tree op_0, tree op_1)
+{
+  tree itype = TREE_TYPE (op_0);
+  tree vtype = get_vectype_for_scalar_type (vinfo, itype);
+
+  if (vtype != NULL_TREE
+    && direct_internal_fn_supported_p (fn, vtype, OPTIMIZE_FOR_BOTH))
+    {
+      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
+
+      gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL));
+      gimple_call_set_nothrow (call, /* nothrow_p */ false);
+      gimple_set_location (call, gimple_location (stmt));
+
+      *type_out = vtype;
+
+      return call;
+    }
+
+  return NULL;
+}
 
 /*
  * Try to detect saturation add pattern (SAT_ADD), aka below gimple:
@@ -4510,27 +4536,55 @@ vect_recog_sat_add_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo,
   if (!is_gimple_assign (last_stmt))
     return NULL;
 
-  tree res_ops[2];
+  tree ops[2];
   tree lhs = gimple_assign_lhs (last_stmt);
 
-  if (gimple_unsigned_integer_sat_add (lhs, res_ops, NULL))
+  if (gimple_unsigned_integer_sat_add (lhs, ops, NULL))
     {
-      tree itype = TREE_TYPE (res_ops[0]);
-      tree vtype = get_vectype_for_scalar_type (vinfo, itype);
-
-      if (vtype != NULL_TREE
-        && direct_internal_fn_supported_p (IFN_SAT_ADD, vtype,
-                                           OPTIMIZE_FOR_BOTH))
+      gcall *call = vect_recog_build_binary_gimple_call (vinfo, last_stmt,
+                                                         IFN_SAT_ADD, type_out,
+                                                         ops[0], ops[1]);
+      if (call)
         {
-          *type_out = vtype;
-          gcall *call = gimple_build_call_internal (IFN_SAT_ADD, 2, res_ops[0],
-                                                    res_ops[1]);
+          vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt);
+          return call;
+        }
+    }
+
+  return NULL;
+}
 
-          gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL));
-          gimple_call_set_nothrow (call, /* nothrow_p */ false);
-          gimple_set_location (call, gimple_location (last_stmt));
+/*
+ * Try to detect saturation sub pattern (SAT_SUB), aka below gimple:
+ *   _7 = _1 >= _2;
+ *   _8 = _1 - _2;
+ *   _10 = (long unsigned int) _7;
+ *   _9 = _8 * _10;
+ *
+ * And then simplified to
+ *   _9 = .SAT_SUB (_1, _2);
+ */
 
-          vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt);
+static gimple *
+vect_recog_sat_sub_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo,
+                            tree *type_out)
+{
+  gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo);
+
+  if (!is_gimple_assign (last_stmt))
+    return NULL;
+
+  tree ops[2];
+  tree lhs = gimple_assign_lhs (last_stmt);
+
+  if (gimple_unsigned_integer_sat_sub (lhs, ops, NULL))
+    {
+      gcall *call = vect_recog_build_binary_gimple_call (vinfo, last_stmt,
+                                                         IFN_SAT_SUB, type_out,
+                                                         ops[0], ops[1]);
+      if (call)
+        {
+          vect_pattern_detected ("vect_recog_sat_sub_pattern", last_stmt);
           return call;
         }
     }
@@ -6999,6 +7053,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_divmod_pattern, "divmod" },
   { vect_recog_mult_pattern, "mult" },
   { vect_recog_sat_add_pattern, "sat_add" },
+  { vect_recog_sat_sub_pattern, "sat_sub" },
   { vect_recog_mixed_size_cond_pattern, "mixed_size_cond" },
   { vect_recog_gcond_pattern, "gcond" },
   { vect_recog_bool_pattern, "bool" },
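
Note on the scalar forms: the commit message uses the mask form, while the two new match.pd cases use a multiplication by the comparison result. The sketch below is not part of the patch; it is a small standalone check, under the assumption of 64-bit unsigned operands, that these forms agree with a plain branchy saturating subtraction. The function names (sat_sub_ref, sat_sub_mask, sat_sub_mul_gt, sat_sub_mul_ge, check_sat_sub_forms) are hypothetical and chosen only for illustration. The gt and ge variants coincide because x - y is already 0 when x == y.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference: branchy unsigned saturating subtraction.  */
static uint64_t
sat_sub_ref (uint64_t x, uint64_t y)
{
  return x >= y ? x - y : 0;
}

/* Form from the commit message: AND with an all-ones mask when x >= y.  */
static uint64_t
sat_sub_mask (uint64_t x, uint64_t y)
{
  return (x - y) & (-(uint64_t)(x >= y));
}

/* Shape of match.pd case 3: (X - Y) * (X > Y).  */
static uint64_t
sat_sub_mul_gt (uint64_t x, uint64_t y)
{
  return (x - y) * (uint64_t)(x > y);
}

/* Shape of match.pd case 4: (X - Y) * (X >= Y).  */
static uint64_t
sat_sub_mul_ge (uint64_t x, uint64_t y)
{
  return (x - y) * (uint64_t)(x >= y);
}

/* Assert that every form matches the reference on a few boundary values.  */
static void
check_sat_sub_forms (void)
{
  static const uint64_t samples[] =
    { 0, 1, 2, 41, 42, UINT64_MAX - 1, UINT64_MAX };
  const size_t n = sizeof (samples) / sizeof (samples[0]);

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        uint64_t x = samples[i], y = samples[j];
        uint64_t expect = sat_sub_ref (x, y);

        assert (sat_sub_mask (x, y) == expect);
        assert (sat_sub_mul_gt (x, y) == expect);
        assert (sat_sub_mul_ge (x, y) == expect);
      }
}
```

Calling check_sat_sub_forms () from a small test driver exercises all sample pairs. Whether a given loop is actually rewritten to .SAT_SUB still depends on the target supporting the internal function for the chosen vector type, as checked by direct_internal_fn_supported_p in the patch.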