From patchwork Thu Jun 27 05:12:16 2024
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1952968
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, richard.guenther@gmail.com,
 jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li <pan2.li@intel.com>
Subject: [PATCH v2] Internal-fn: Support new IFN SAT_TRUNC for unsigned scalar int
Date: Thu, 27 Jun 2024 13:12:16 +0800
Message-Id: <20240627051216.1817555-1-pan2.li@intel.com>
In-Reply-To: <20240626014559.765149-1-pan2.li@intel.com>
References: <20240626014559.765149-1-pan2.li@intel.com>

From: Pan Li <pan2.li@intel.com>

This patch adds the middle-end representation for saturating
truncation, i.e. the truncated result is set to the maximum value of
the narrow type when the input does not fit in it.  It matches
patterns similar to the one below.

Form 1:
  #define DEF_SAT_U_TRUC_FMT_1(WT, NT)   \
  NT __attribute__((noinline))           \
  sat_u_truc_##T##_fmt_1 (WT x)          \
  {                                      \
    bool overflow = x > (WT)(NT)(-1);    \
    return ((NT)x) | (NT)-overflow;      \
  }

For example, when truncating uint16_t to uint8_t, we have

* SAT_TRUNC (254)   => 254
* SAT_TRUNC (255)   => 255
* SAT_TRUNC (256)   => 255
* SAT_TRUNC (65535) => 255

Take the SAT_TRUNC below, from uint64_t to uint32_t:

DEF_SAT_U_TRUC_FMT_1 (uint64_t, uint32_t)

Before this patch:
__attribute__((noinline))
uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
{
  _Bool overflow;
  unsigned int _1;
  unsigned int _2;
  unsigned int _3;
  uint32_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  overflow_5 = x_4(D) > 4294967295;
  _1 = (unsigned int) x_4(D);
  _2 = (unsigned int) overflow_5;
  _3 = -_2;
  _6 = _1 | _3;
  return _6;
;;    succ:       EXIT

}

After this patch:
__attribute__((noinline))
uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
{
  uint32_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _6 = .SAT_TRUNC (x_4(D)); [tail call]
  return _6;
;;    succ:       EXIT

}

The following tests passed with this patch:
*. The rv64gcv full regression tests.
*. The rv64gcv build with glibc.
*. The x86 bootstrap tests.
*. The x86 full regression tests.

gcc/ChangeLog:

	* internal-fn.def (SAT_TRUNC): Add new signed IFN sat_trunc as
	unary_convert.
	* match.pd: Add new matching pattern for unsigned int sat_trunc.
	* optabs.def (OPTAB_CL): Add unsigned and signed optab.
	* tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_trunc): Add
	new decl for the matching pattern generated func.
	(match_unsigned_saturation_trunc): Add new func impl to match
	the .SAT_TRUNC.
	(math_opts_dom_walker::after_dom_children): Add .SAT_TRUNC match
	function under BIT_IOR_EXPR case.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/internal-fn.def       |  2 ++
 gcc/match.pd              | 16 ++++++++++++++++
 gcc/optabs.def            |  3 +++
 gcc/tree-ssa-math-opts.cc | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index a8c83437ada..915d329c05a 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -278,6 +278,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
 DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_SUB, ECF_CONST, first, sssub, ussub, binary)
 
+DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_TRUNC, ECF_CONST, first, sstrunc, ustrunc, unary_convert)
+
 DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
 DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
 DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
diff --git a/gcc/match.pd b/gcc/match.pd
index 3d0689c9312..06120a1c62c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3210,6 +3210,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* Unsigned saturation truncate, case 1 (), sizeof (WT) > sizeof (NT).
+   SAT_U_TRUNC = (NT)x | (NT)(-(x > (WT)(NT)(-1))).  */
+(match (unsigned_integer_sat_trunc @0)
+ (bit_ior:c (negate (convert (gt @0 INTEGER_CST@1)))
+   (convert @0))
+ (with {
+   unsigned itype_precision = TYPE_PRECISION (TREE_TYPE (@0));
+   unsigned otype_precision = TYPE_PRECISION (type);
+   wide_int trunc_max = wi::mask (itype_precision / 2, false, itype_precision);
+   wide_int int_cst = wi::to_wide (@1, itype_precision);
+  }
+  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+       && TYPE_UNSIGNED (TREE_TYPE (@0))
+       && otype_precision < itype_precision
+       && wi::eq_p (trunc_max, int_cst)))))
+
 /* x >  y  &&  x != XXX_MIN  -->  x > y
    x >  y  &&  x == XXX_MIN  -->  false . */
 (for eqne (eq ne)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index bc2611abdc2..c16580ce956 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -63,6 +63,9 @@ OPTAB_CX(fractuns_optab, "fractuns$Q$b$I$a2")
 OPTAB_CL(satfract_optab, "satfract$b$Q$a2", SAT_FRACT, "satfract", gen_satfract_conv_libfunc)
 OPTAB_CL(satfractuns_optab, "satfractuns$I$b$Q$a2", UNSIGNED_SAT_FRACT, "satfractuns", gen_satfractuns_conv_libfunc)
 
+OPTAB_CL(ustrunc_optab, "ustrunc$b$a2", US_TRUNCATE, "ustrunc", NULL)
+OPTAB_CL(sstrunc_optab, "sstrunc$b$a2", SS_TRUNCATE, "sstrunc", NULL)
+
 OPTAB_CD(sfixtrunc_optab, "fix_trunc$F$b$I$a2")
 OPTAB_CD(ufixtrunc_optab, "fixuns_trunc$F$b$I$a2")
 
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 57085488722..3783a874699 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -4088,6 +4088,7 @@ arith_overflow_check_p (gimple *stmt, gimple *cast_stmt, gimple *&use_stmt,
 
 extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
 extern bool gimple_unsigned_integer_sat_sub (tree, tree*, tree (*)(tree));
+extern bool gimple_unsigned_integer_sat_trunc (tree, tree*, tree (*)(tree));
 
 static void
 build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, internal_fn fn,
@@ -4216,6 +4217,36 @@ match_unsigned_saturation_sub (gimple_stmt_iterator *gsi, gphi *phi)
                                     ops[0], ops[1]);
 }
 
+/*
+ * Try to match saturation unsigned trunc.
+ *   uint16_t x_4(D);
+ *   uint8_t _6;
+ *   overflow_5 = x_4(D) > 255;
+ *   _1 = (unsigned char) x_4(D);
+ *   _2 = (unsigned char) overflow_5;
+ *   _3 = -_2;
+ *   _6 = _1 | _3;
+ *   =>
+ *   _6 = .SAT_TRUNC (x_4(D));
+ * */
+static void
+match_unsigned_saturation_trunc (gimple_stmt_iterator *gsi, gassign *stmt)
+{
+  tree ops[1];
+  tree lhs = gimple_assign_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
+
+  if (gimple_unsigned_integer_sat_trunc (lhs, ops, NULL)
+    && direct_internal_fn_supported_p (IFN_SAT_TRUNC,
+                                       tree_pair (type, TREE_TYPE (ops[0])),
+                                       OPTIMIZE_FOR_BOTH))
+    {
+      gcall *call = gimple_build_call_internal (IFN_SAT_TRUNC, 1, ops[0]);
+      gimple_call_set_lhs (call, lhs);
+      gsi_replace (gsi, call, /* update_eh_info */ true);
+    }
+}
+
 /* Recognize for unsigned x
    x = y - z;
    if (x > y)
@@ -6188,6 +6219,7 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
 
             case BIT_IOR_EXPR:
               match_unsigned_saturation_add (&gsi, as_a<gassign *> (stmt));
+              match_unsigned_saturation_trunc (&gsi, as_a<gassign *> (stmt));
               /* fall-through  */
             case BIT_XOR_EXPR:
               match_uaddc_usubc (&gsi, stmt, code);
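
For anyone who wants to sanity-check the semantics outside of GCC, below
is a minimal standalone sketch in C of Form 1 from the description,
written out by hand for uint16_t to uint8_t and compared against a plain
clamp.  The names sat_trunc_u16_to_u8, sat_trunc_u16_to_u8_ref and
count_mismatches are made up for this illustration; they are not part of
the patch or of the GCC testsuite, and the sketch only models the
expected result values, not how the middle end recognizes the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form 1, expanded by hand with WT = uint16_t and NT = uint8_t.
   (hypothetical helper name, for illustration only)  */
uint8_t
sat_trunc_u16_to_u8 (uint16_t x)
{
  /* (WT)(NT)(-1) is the largest value the narrow type can hold, 255.  */
  bool overflow = x > (uint16_t)(uint8_t)(-1);

  /* -overflow is 0 or -1; narrowed to NT it is 0x00 or 0xff, so the OR
     either keeps the truncated bits or forces every bit to 1.  */
  return (uint8_t)(((uint8_t)x) | (uint8_t)-overflow);
}

/* Reference semantics: clamp to the narrow type's maximum.
   (hypothetical helper name, for illustration only)  */
uint8_t
sat_trunc_u16_to_u8_ref (uint16_t x)
{
  return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}

/* Compare both versions on every uint16_t input and return the number
   of inputs where they disagree.  */
unsigned
count_mismatches (void)
{
  unsigned bad = 0;

  for (uint32_t i = 0; i <= UINT16_MAX; i++)
    if (sat_trunc_u16_to_u8 ((uint16_t)i)
        != sat_trunc_u16_to_u8_ref ((uint16_t)i))
      bad++;

  return bad;
}
```

Calling count_mismatches () from a small main and checking that it
returns 0 covers every uint16_t input, including the boundary values
254, 255 and 256 listed in the description.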