From patchwork Wed May 15 02:14:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1935239
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, tamar.christina@arm.com, richard.guenther@gmail.com, hongtao.liu@intel.com, Pan Li
Subject: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
Date: Wed, 15 May 2024 10:14:05 +0800
Message-Id: <20240515021407.1287623-1-pan2.li@intel.com>

From: Pan Li

This patch adds the middle-end representation for saturating addition,
that is, the result of the add is set to the maximum value of the type
when the addition overflows.  It matches patterns similar to the one
below.

SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Taking uint8_t as an example, we have:

* SAT_ADD (1, 254)   => 255.
* SAT_ADD (1, 255)   => 255.
* SAT_ADD (2, 255)   => 255.
* SAT_ADD (255, 255) => 255.
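
As a quick sanity check of the pattern above, here is a minimal C sketch
(not part of the patch) that specialises the branchless form to uint8_t
and pairs it with a widening reference. The helper names sat_add_u8 and
sat_add_u8_ref are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless form from the description, specialised to uint8_t.
   Hypothetical helper, for illustration only.  */
static uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  /* Truncate the promoted int sum back to 8 bits (wraps on overflow).  */
  uint8_t sum = (uint8_t) (x + y);

  /* (sum < x) is 1 exactly when the 8-bit add wrapped; negating it and
     truncating gives an all-ones mask (0xff) in that case, else 0.  */
  uint8_t mask = (uint8_t) -(uint8_t) (sum < x);

  return (uint8_t) (sum | mask);
}

/* Straightforward reference: add in a wider type and clamp.  */
static uint8_t
sat_add_u8_ref (uint8_t x, uint8_t y)
{
  unsigned wide = (unsigned) x + (unsigned) y;

  return wide > UINT8_MAX ? UINT8_MAX : (uint8_t) wide;
}
```

Calling sat_add_u8 with the four operand pairs listed above yields 255 in
each case, and both helpers agree on non-overflowing inputs such as (3, 4).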

Given the example below for the unsigned scalar integer type uint64_t:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
;;    succ:       EXIT

}

After this patch:

uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
  return _7;
;;    succ:       EXIT

}

The following tests pass with this patch:
1. The riscv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

	PR target/51492
	PR target/112600

gcc/ChangeLog:

	* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
	to the return true switch case(s).
	* internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
	* match.pd: Add unsigned SAT_ADD match(es).
	* optabs.def (OPTAB_NL): Remove fixed-point limitation for
	us/ssadd.
	* tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New
	extern func decl generated in match.pd match.
	(match_saturation_arith): New func impl to match the saturation arith.
	(math_opts_dom_walker::after_dom_children): Try match saturation
	arith when IOR expr.

Signed-off-by: Pan Li --- gcc/internal-fn.cc | 1 + gcc/internal-fn.def | 2 ++ gcc/match.pd | 51 +++++++++++++++++++++++++++++++++++++++ gcc/optabs.def | 4 +-- gcc/tree-ssa-math-opts.cc | 32 ++++++++++++++++++++++++ 5 files changed, 88 insertions(+), 2 deletions(-) diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 0a7053c2286..73045ca8c8c 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -4202,6 +4202,7 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_SAT_ADD: case IFN_VEC_WIDEN_PLUS: case IFN_VEC_WIDEN_PLUS_LO: case IFN_VEC_WIDEN_PLUS_HI: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 848bb9dbff3..25badbb86e5 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -275,6 +275,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | ECF_NOTHROW, first, DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first, smulhrs, umulhrs, binary) +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary) + DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary) DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary) DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary) diff --git a/gcc/match.pd b/gcc/match.pd index 07e743ae464..0f9c34fa897 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3043,6 +3043,57 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || POINTER_TYPE_P (itype)) && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype)))))) +/* Unsigned Saturation Add */ +(match (usadd_left_part_1 @0 @1) + (plus:c @0 @1) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_left_part_2 @0 @1) + (realpart (IFN_ADD_OVERFLOW:c @0 @1)) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_right_part_1 @0 @1) + (negate (convert (lt (plus:c 
@0 @1) @0))) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_right_part_1 @0 @1) + (negate (convert (gt @0 (plus:c @0 @1)))) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_right_part_2 @0 @1) + (negate (convert (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop))) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +/* We cannot merge or overload usadd_left_part_1 and usadd_left_part_2 + because the sub part of left_part_2 cannot work with right_part_1. + For example, left_part_2 pattern focus one .ADD_OVERFLOW but the + right_part_1 has nothing to do with .ADD_OVERFLOW. */ + +/* Unsigned saturation add, case 1 (branchless): + SAT_U_ADD = (X + Y) | - ((X + Y) < X) or + SAT_U_ADD = (X + Y) | - (X > (X + Y)). */ +(match (unsigned_integer_sat_add @0 @1) + (bit_ior:c (usadd_left_part_1 @0 @1) (usadd_right_part_1 @0 @1))) + +/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW). */ +(match (unsigned_integer_sat_add @0 @1) + (bit_ior:c (usadd_left_part_2 @0 @1) (usadd_right_part_2 @0 @1))) + /* x > y && x != XXX_MIN --> x > y x > y && x == XXX_MIN --> false . 
*/ (for eqne (eq ne) diff --git a/gcc/optabs.def b/gcc/optabs.def index ad14f9328b9..3f2cb46aff8 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -111,8 +111,8 @@ OPTAB_NX(add_optab, "add$F$a3") OPTAB_NX(add_optab, "add$Q$a3") OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc) OPTAB_VX(addv_optab, "add$F$a3") -OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc) -OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc) +OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc) +OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc) OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc) OPTAB_NX(sub_optab, "sub$F$a3") OPTAB_NX(sub_optab, "sub$Q$a3") diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc index e8c804f09b7..62da1c5ee08 100644 --- a/gcc/tree-ssa-math-opts.cc +++ b/gcc/tree-ssa-math-opts.cc @@ -4086,6 +4086,36 @@ arith_overflow_check_p (gimple *stmt, gimple *cast_stmt, gimple *&use_stmt, return 0; } +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree)); + +/* + * Try to match saturation arith pattern(s). + * 1. 
SAT_ADD (unsigned) + * _7 = _4 + _6; + * _8 = _4 > _7; + * _9 = (long unsigned int) _8; + * _10 = -_9; + * _12 = _7 | _10; + * => + * _12 = .SAT_ADD (_4, _6); */ +static void +match_saturation_arith (gimple_stmt_iterator *gsi, gassign *stmt) +{ + gcall *call = NULL; + + tree ops[2]; + tree lhs = gimple_assign_lhs (stmt); + + if (gimple_unsigned_integer_sat_add (lhs, ops, NULL) + && direct_internal_fn_supported_p (IFN_SAT_ADD, TREE_TYPE (lhs), + OPTIMIZE_FOR_BOTH)) + { + call = gimple_build_call_internal (IFN_SAT_ADD, 2, ops[0], ops[1]); + gimple_call_set_lhs (call, lhs); + gsi_replace (gsi, call, true); + } +} + /* Recognize for unsigned x x = y - z; if (x > y) @@ -6048,6 +6078,8 @@ math_opts_dom_walker::after_dom_children (basic_block bb) break; case BIT_IOR_EXPR: + match_saturation_arith (&gsi, as_a <gassign *> (stmt)); + /* fall-through */ case BIT_XOR_EXPR: match_uaddc_usubc (&gsi, stmt, code); break;

From patchwork Wed May 15 02:14:06 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1935241
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, tamar.christina@arm.com, richard.guenther@gmail.com, hongtao.liu@intel.com, Pan Li
Subject: [PATCH v5 2/3] Vect: Support new IFN SAT_ADD for unsigned vector int
Date: Wed, 15 May 2024 10:14:06 +0800
Message-Id: <20240515021407.1287623-2-pan2.li@intel.com>
In-Reply-To: <20240515021407.1287623-1-pan2.li@intel.com>
References: <20240515021407.1287623-1-pan2.li@intel.com>

From: Pan Li

For vectorization, we leverage the existing vect pattern recognition to
find the same pattern as in the scalar case, and let the vectorizer
handle the rest for the standard name usadd<mode>3 in vector mode.
The riscv vector backend has the insn "Vector Single-Width Saturating
Add and Subtract", which can be leveraged when expanding usadd<mode>3
in vector mode.  For example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
  ivtmp_58 = _80 * 8;
  vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
  vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
  vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
  mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
  vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66);
  .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
  vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
  vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
  vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
  ivtmp_79 = ivtmp_78 - _80;
  ...
}

After this patch:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
  ivtmp_46 = _62 * 8;
  vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
  vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
  vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
  .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
  ...
}

The following test suites pass with this patch:
* The riscv full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

	PR target/51492
	PR target/112600

gcc/ChangeLog:

	* tree-vect-patterns.cc (gimple_unsigned_integer_sat_add): New
	func decl generated by match.pd match.
	(vect_recog_sat_add_pattern): New func impl to recog the pattern
	for unsigned SAT_ADD.

Signed-off-by: Pan Li
---
 gcc/tree-vect-patterns.cc | 52 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index dfb7d800526..a313dc64643 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -4487,6 +4487,57 @@ vect_recog_mult_pattern (vec_info *vinfo, return pattern_stmt; } +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree)); + +/* + * Try to detect saturation add pattern (SAT_ADD), aka below gimple: + * _7 = _4 + _6; + * _8 = _4 > _7; + * _9 = (long unsigned int) _8; + * _10 = -_9; + * _12 = _7 | _10; + * + * And then simplied to + * _12 = .SAT_ADD (_4, _6); + */ + +static gimple * +vect_recog_sat_add_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo, + tree *type_out) +{ + gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo); + + if (!is_gimple_assign (last_stmt)) + return NULL; + + tree res_ops[2]; + tree lhs = gimple_assign_lhs (last_stmt); + + if (gimple_unsigned_integer_sat_add (lhs, res_ops, NULL)) + { + tree itype = TREE_TYPE (res_ops[0]); + tree vtype = get_vectype_for_scalar_type (vinfo, itype); + + if (vtype != NULL_TREE + && direct_internal_fn_supported_p (IFN_SAT_ADD, vtype, + OPTIMIZE_FOR_BOTH)) + { + *type_out = vtype; + gcall *call = gimple_build_call_internal (IFN_SAT_ADD, 2, res_ops[0], + res_ops[1]); + + gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL)); + gimple_call_set_nothrow (call, /* nothrow_p */ false); + gimple_set_location (call, gimple_location (last_stmt)); + + vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt); + return call; + } + } + + return NULL; +} + /* Detect a signed division by a constant that wouldn't
be otherwise vectorized: @@ -6987,6 +7038,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_vector_vector_shift_pattern, "vector_vector_shift" }, { vect_recog_divmod_pattern, "divmod" }, { vect_recog_mult_pattern, "mult" }, + { vect_recog_sat_add_pattern, "sat_add" }, { vect_recog_mixed_size_cond_pattern, "mixed_size_cond" }, { vect_recog_gcond_pattern, "gcond" }, { vect_recog_bool_pattern, "bool" },

From patchwork Wed May 15 02:14:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1935240
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, tamar.christina@arm.com, richard.guenther@gmail.com, hongtao.liu@intel.com, Pan Li
Subject: [PATCH v5 3/3] RISC-V: Implement IFN SAT_ADD for both the scalar and vector
Date: Wed, 15 May 2024 10:14:07 +0800
Message-Id: <20240515021407.1287623-3-pan2.li@intel.com>
In-Reply-To: <20240515021407.1287623-1-pan2.li@intel.com>
References: <20240515021407.1287623-1-pan2.li@intel.com>

From: Pan Li

This patch implements SAT_ADD in the riscv backend, as a sample, for
both the scalar and vector modes.  Taking the vector case below as an
example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v0,0(a1)
  vle64.v v1,0(a2)
  slli    a4,a5,3
  sub     a3,a3,a5
  add     a1,a1,a4
  add     a2,a2,a4
  vadd.vv v1,v0,v1
  vmsgtu.vv       v0,v0,v1
  vmerge.vim      v1,v1,-1,v0
  vse64.v v1,0(a0)
  ...

After this patch:

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v1,0(a1)
  vle64.v v2,0(a2)
  slli    a4,a5,3
  sub     a3,a3,a5
  add     a1,a1,a4
  add     a2,a2,a4
  vsaddu.vv       v1,v1,v2  <= Vector Single-Width Saturating Add
  vse64.v v1,0(a0)
  ...

The following test suites pass with this patch:
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

	PR target/51492
	PR target/112600

gcc/ChangeLog:

	* config/riscv/autovec.md (usadd<mode>3): New pattern expand for
	the unsigned SAT_ADD in vector mode.
	* config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl
	to expand usadd<mode>3 pattern.
	(expand_vec_usadd): Ditto but for vector.
	* config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit
	the vsadd insn.
	(expand_vec_usadd): New func impl to expand usadd<mode>3 for vector.
	* config/riscv/riscv.cc (riscv_expand_usadd): New func impl to
	expand usadd<mode>3 for scalar.
	* config/riscv/riscv.md (usadd<mode>3): New pattern expand for
	the unsigned SAT_ADD in scalar mode.
	* config/riscv/vector.md: Allow VLS mode for vsaddu.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c: New test.
	* gcc.target/riscv/sat_arith.h: New test.
	* gcc.target/riscv/sat_u_add-1.c: New test.
	* gcc.target/riscv/sat_u_add-2.c: New test.
	* gcc.target/riscv/sat_u_add-3.c: New test.
	* gcc.target/riscv/sat_u_add-4.c: New test.
	* gcc.target/riscv/sat_u_add-run-1.c: New test.
	* gcc.target/riscv/sat_u_add-run-2.c: New test.
	* gcc.target/riscv/sat_u_add-run-3.c: New test.
	* gcc.target/riscv/sat_u_add-run-4.c: New test.
	* gcc.target/riscv/scalar_sat_binary.h: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/autovec.md | 17 +++++
 gcc/config/riscv/riscv-protos.h | 2 +
 gcc/config/riscv/riscv-v.cc | 16 ++++
 gcc/config/riscv/riscv.cc | 47 ++++++++++++
 gcc/config/riscv/riscv.md | 11 +++
 gcc/config/riscv/vector.md | 12 +--
 .../riscv/rvv/autovec/binop/vec_sat_binary.h | 33 ++++++++
 .../riscv/rvv/autovec/binop/vec_sat_u_add-1.c | 19 +++++
 .../riscv/rvv/autovec/binop/vec_sat_u_add-2.c | 20 +++++
 .../riscv/rvv/autovec/binop/vec_sat_u_add-3.c | 20 +++++
 .../riscv/rvv/autovec/binop/vec_sat_u_add-4.c | 20 +++++
 .../rvv/autovec/binop/vec_sat_u_add-run-1.c | 75 +++++++++++++++++++
 .../rvv/autovec/binop/vec_sat_u_add-run-2.c | 75 +++++++++++++++++++
 .../rvv/autovec/binop/vec_sat_u_add-run-3.c | 75 +++++++++++++++++++
 .../rvv/autovec/binop/vec_sat_u_add-run-4.c | 75 +++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/sat_arith.h | 31 ++++++++
 gcc/testsuite/gcc.target/riscv/sat_u_add-1.c | 19 +++++
 gcc/testsuite/gcc.target/riscv/sat_u_add-2.c | 21 ++++++
 gcc/testsuite/gcc.target/riscv/sat_u_add-3.c | 18 +++++
 gcc/testsuite/gcc.target/riscv/sat_u_add-4.c | 17 +++++
 .../gcc.target/riscv/sat_u_add-run-1.c | 25 +++++++
 .../gcc.target/riscv/sat_u_add-run-2.c | 25 +++++++
 .../gcc.target/riscv/sat_u_add-run-3.c | 25 +++++++
 .../gcc.target/riscv/sat_u_add-run-4.c | 25 +++++++
.../gcc.target/riscv/scalar_sat_binary.h | 27 +++++++ 25 files changed, 744 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_arith.h create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index aa1ae0fe075..7ceeb8d64f6 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2612,3 +2612,20 @@ (define_expand "rawmemchr" DONE; } ) + +;; ------------------------------------------------------------------------- +;; ---- [INT] Saturation ALU. 
+;; ------------------------------------------------------------------------- +;; Includes: +;; - add +;; ------------------------------------------------------------------------- +(define_expand "usadd3" + [(match_operand:V_VLSI 0 "register_operand") + (match_operand:V_VLSI 1 "register_operand") + (match_operand:V_VLSI 2 "register_operand")] + "TARGET_VECTOR" + { + riscv_vector::expand_vec_usadd (operands[0], operands[1], operands[2], mode); + DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 5c8a52b78a2..1ada792a162 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -133,6 +133,7 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); extern bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); +extern void riscv_expand_usadd (rtx, rtx, rtx); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); @@ -623,6 +624,7 @@ void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lround (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode); void expand_vec_lfloor (rtx, rtx, machine_mode, machine_mode); +void expand_vec_usadd (rtx, rtx, rtx, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx), enum avl_type); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index c9e0feebca6..c34111f89b8 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4635,6 +4635,16 @@ emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask, } } +static void +emit_vec_saddu (rtx op_dest, rtx op_1, rtx op_2, insn_type type, + machine_mode vec_mode) +{ + rtx ops[] = {op_dest, op_1, op_2}; + insn_code icode = code_for_pred (US_PLUS, vec_mode); + + emit_vlmax_insn (icode, type, 
ops); +} + void expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode, machine_mode vec_int_mode) @@ -4862,6 +4872,12 @@ expand_vec_lfloor (rtx op_0, rtx op_1, machine_mode vec_fp_mode, vec_int_mode); } +void +expand_vec_usadd (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) +{ + emit_vec_saddu (op_0, op_1, op_2, BINARY_OP, vec_mode); +} + /* Vectorize popcount by the Wilkes-Wheeler-Gill algorithm that libgcc uses as well. */ void diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 4067505270e..cc0185f64cd 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -11281,6 +11281,53 @@ riscv_get_raw_result_mode (int regno) return default_get_reg_raw_mode (regno); } +/* Implements the unsigned saturation add standard name usadd for int mode. */ + +void +riscv_expand_usadd (rtx dest, rtx x, rtx y) +{ + machine_mode mode = GET_MODE (dest); + rtx xmode_sum = gen_reg_rtx (Xmode); + rtx xmode_lt = gen_reg_rtx (Xmode); + rtx xmode_x = gen_lowpart (Xmode, x); + rtx xmode_y = gen_lowpart (Xmode, y); + rtx xmode_dest = gen_reg_rtx (Xmode); + + /* Step-1: sum = x + y */ + if (mode == SImode && mode != Xmode) + { /* Take addw to avoid the sum truncate. */ + rtx simode_sum = gen_reg_rtx (SImode); + riscv_emit_binary (PLUS, simode_sum, x, y); + emit_move_insn (xmode_sum, gen_lowpart (Xmode, simode_sum)); + } + else + riscv_emit_binary (PLUS, xmode_sum, xmode_x, xmode_y); + + /* Step-1.1: truncate sum for HI and QI as we have no insn for add QI/HI. 
*/ + if (mode == HImode || mode == QImode) + { + int shift_bits = GET_MODE_BITSIZE (Xmode) + - GET_MODE_BITSIZE (mode).to_constant (); + + gcc_assert (shift_bits > 0); + + riscv_emit_binary (ASHIFT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + riscv_emit_binary (LSHIFTRT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + } + + /* Step-2: lt = sum < x */ + riscv_emit_binary (LTU, xmode_lt, xmode_sum, xmode_x); + + /* Step-3: lt = -lt */ + riscv_emit_unary (NEG, xmode_lt, xmode_lt); + + /* Step-4: xmode_dest = sum | lt */ + riscv_emit_binary (IOR, xmode_dest, xmode_lt, xmode_sum); + + /* Step-5: dest = xmode_dest */ + emit_move_insn (dest, gen_lowpart (mode, xmode_dest)); +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index ee15c63db10..85a34adea83 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -4145,6 +4145,17 @@ (define_insn_and_split "" "{ operands[6] = gen_lowpart (SImode, operands[5]); }" [(set_attr "type" "arith")]) +(define_expand "usadd3" + [(match_operand:ANYI 0 "register_operand") + (match_operand:ANYI 1 "register_operand") + (match_operand:ANYI 2 "register_operand")] + "" + { + riscv_expand_usadd (operands[0], operands[1], operands[2]); + DONE; + } +) + (include "bitmanip.md") (include "crypto.md") (include "sync.md") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 228d0f9a766..f8ed61f4a13 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -4062,8 +4062,8 @@ (define_insn "@pred_trunc" ;; Saturating Add and Subtract (define_insn "@pred_" - [(set (match_operand:VI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") - (if_then_else:VI + [(set (match_operand:V_VLSI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") + (if_then_else:V_VLSI (unspec: [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1, vm, vm,Wc1,Wc1") 
(match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") @@ -4072,10 +4072,10 @@ (define_insn "@pred_" (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (any_sat_int_binop:VI - (match_operand:VI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") - (match_operand:VI 4 "" "")) - (match_operand:VI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (any_sat_int_binop:V_VLSI + (match_operand:V_VLSI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") + (match_operand:V_VLSI 4 "" "")) + (match_operand:V_VLSI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "@ v.vv\t%0,%3,%4%p1 diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h new file mode 100644 index 00000000000..0976ae97830 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h @@ -0,0 +1,33 @@ +#ifndef HAVE_DEFINED_VEC_SAT_BINARY +#define HAVE_DEFINED_VEC_SAT_BINARY + +/* To leverage this header file for run tests, you need to: + 1. define T as the type, for example uint8_t, + 2. define N as the test array size, for example 16. + 3. define RUN_VEC_SAT_BINARY as the run function. + 4. prepare the test_data for the test cases.
+ */ + +int +main () +{ + unsigned i, k; + T out[N]; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + T *op_1 = test_data[i][0]; + T *op_2 = test_data[i][1]; + T *expect = test_data[i][2]; + + RUN_VEC_SAT_BINARY (T, out, op_1, op_2, N); + + for (k = 0; k < N; k++) + if (out[k] != expect[k]) + __builtin_abort (); + } + + return 0; +} + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c new file mode 100644 index 00000000000..dbbfa00afe2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint8_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint8_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c new file mode 100644 index 00000000000..1253fdb5f60 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint16_t_fmt_1: +** ... 
+** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint16_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c new file mode 100644 index 00000000000..74bba9cadd1 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint32_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint32_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c new file mode 100644 index 00000000000..f3692b4cc25 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint64_t_fmt_1: +** ... 
+** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint64_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c new file mode 100644 index 00000000000..1dcb333f687 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c new file mode 100644 index 00000000000..dbf01ac863d --- /dev/null +++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c new file mode 100644 index 00000000000..20ad2736403 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* 
arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c new file mode 100644 index 00000000000..2f31edc527e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 
0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; 
+ +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h new file mode 100644 index 00000000000..2ef9fd825f3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h @@ -0,0 +1,31 @@ +#ifndef HAVE_SAT_ARITH +#define HAVE_SAT_ARITH + +#include + +#define DEF_SAT_U_ADD_FMT_1(T) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_1 (T x, T y) \ +{ \ + return (x + y) | (-(T)((T)(x + y) < x)); \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_1(T) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (x + y) | (-(T)((T)(x + y) < x)); \ + } \ +} + +#define RUN_SAT_U_ADD_FMT_1(T, x, y) sat_u_add_##T##_fmt_1(x, y) + +#define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N) + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c new file mode 100644 index 00000000000..609e1ea343b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint8_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint8_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c new file mode 100644 index 00000000000..d30436c782a --- /dev/null +++ 
b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint16_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint16_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c new file mode 100644 index 00000000000..12347c607bd --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint32_t_fmt_1: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint32_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c new file mode 100644 index 00000000000..f2c6b74d917 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } 
*/ + +#include "sat_arith.h" + +/* +** sat_u_add_uint64_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint64_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c new file mode 100644 index 00000000000..f1972490006 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c new file mode 100644 index 00000000000..cb3879d0cde --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c new file mode 100644 index 
00000000000..c9a6080ca3b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c new file mode 100644 index 00000000000..c19b7e22387 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h new file mode 100644 index 00000000000..cbb2d750107 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h @@ -0,0 +1,27 @@ +#ifndef 
HAVE_DEFINED_SCALAR_SAT_BINARY +#define HAVE_DEFINED_SCALAR_SAT_BINARY + +/* To leverage this header file for run tests, you need to: + 1. define T as the type, for example uint8_t, + 2. define RUN_SAT_BINARY as the run function. + 3. prepare the test_data for the test cases. + */ + +int +main () +{ + unsigned i; + T *d; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + d = test_data[i]; + + if (RUN_SAT_BINARY (T, d[0], d[1]) != d[2]) + __builtin_abort (); + } + + return 0; +} + +#endif
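
Note for reviewers (not part of the patch): the branchless sequence that riscv_expand_usadd emits, Step-1 through Step-5 in the riscv.cc hunk, can be modelled in plain C roughly as below. This is only a sketch under stated assumptions: the names model_usadd and model_usadd_selfcheck are hypothetical and do not exist in GCC, the X register is taken to be 64 bits wide, and both operands are assumed to arrive zero-extended. The real SImode path uses addw and sign-extended values instead, which this model does not reproduce.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the scalar expansion, assuming a 64-bit X register
   and zero-extended operands.  BITS is the width of the source type
   (8, 16, 32 or 64).  */
uint64_t
model_usadd (uint64_t x, uint64_t y, unsigned bits)
{
  /* Step-1: sum = x + y, computed in the full register width.  */
  uint64_t sum = x + y;

  /* Step-1.1: truncate the sum to BITS bits with a shift pair, as the
     patch does for QImode and HImode.  */
  if (bits < 64)
    {
      unsigned shift = 64 - bits;
      sum = (sum << shift) >> shift;
    }

  /* Step-2: lt = sum < x, which is 1 exactly when the add wrapped.  */
  uint64_t lt = (uint64_t) (sum < x);

  /* Step-3: lt = -lt, turning 1 into an all-ones mask and leaving 0.  */
  lt = -lt;

  /* Step-4: dest = sum | lt, so an overflow saturates to all ones.  */
  uint64_t dest = sum | lt;

  /* Step-5: keep only the low BITS bits, like taking the lowpart.  */
  return bits < 64 ? dest & ((UINT64_C (1) << bits) - 1) : dest;
}

/* A few rows taken from the uint8_t and uint64_t test_data tables.  */
void
model_usadd_selfcheck (void)
{
  assert (model_usadd (1, 254, 8) == 255);
  assert (model_usadd (2, 254, 8) == 255);
  assert (model_usadd (255, 255, 8) == 255);
  assert (model_usadd (UINT64_MAX, 1, 64) == UINT64_MAX);
}
```

The unsigned compare in Step-2 works because a wrapped sum is always smaller than either operand, so no branch is needed to detect overflow.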