From patchwork Tue May 28 08:29:37 2024
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1940327
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, tamar.christina@arm.com, richard.guenther@gmail.com, Pan Li
Subject: [PATCH v1] Internal-fn: Support new IFN SAT_SUB for unsigned scalar int
Date: Tue, 28 May 2024 16:29:37 +0800
Message-Id: <20240528082937.3300351-1-pan2.li@intel.com>

From: Pan Li

This patch adds the middle-end representation for saturating
subtraction, i.e. the result of the subtraction is set to the minimum
value (0 for unsigned types) on underflow.  It matches patterns similar
to the one below.

SAT_SUB (x, y) => (x - y) & (-(TYPE)(x >= y));

For example, for uint8_t we have:

* SAT_SUB (255, 0)   => 255
* SAT_SUB (1, 2)     => 0
* SAT_SUB (254, 255) => 0
* SAT_SUB (0, 255)   => 0

Given the SAT_SUB below for uint64_t:

uint64_t sat_sub_u64 (uint64_t x, uint64_t y)
{
  return (x - y) & (- (uint64_t)((x >= y)));
}

Before this patch:

uint64_t sat_sub_u_0_uint64_t (uint64_t x, uint64_t y)
{
  _Bool _1;
  long unsigned int _3;
  uint64_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _1 = x_4(D) >= y_5(D);
  _3 = x_4(D) - y_5(D);
  _6 = _1 ? _3 : 0;
  return _6;
;;    succ:       EXIT

}

After this patch:

uint64_t sat_sub_u_0_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _6 = .SAT_SUB (x_4(D), y_5(D)); [tail call]
  return _6;
;;    succ:       EXIT

}

The following tests were run for this patch:
* The riscv full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

	PR target/51492
	PR target/112600

gcc/ChangeLog:

	* internal-fn.def (SAT_SUB): Add new IFN define for SAT_SUB.
	* match.pd: Add new match for SAT_SUB.
	* optabs.def (OPTAB_NL): Remove fixed-point for ussub/sssub.
	* tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_sub): Add
	new decl for the function generated from match.pd.
	(build_saturation_binary_arith_call): Add new helper function
	to build the gimple call to binary SAT alu.
	(match_saturation_arith): Rename from.
	(match_unsigned_saturation_add): Rename to.
	(match_unsigned_saturation_sub): Add new func to match the
	unsigned sat sub.
	(math_opts_dom_walker::after_dom_children): Try SAT_SUB
	matching for COND_EXPR.

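
The semantics described above can be checked outside of GCC with a small standalone sketch.  It is illustrative only and not part of the patch: the function names (sat_sub_u8_mask, sat_sub_u8_gt, sat_sub_u8_ge, sat_sub_u8_forms_agree) are hypothetical, and uint8_t is chosen only so that every input pair can be enumerated.  The first function is the mask form from the formula above; the other two are the branch forms that correspond to the two match.pd cases.

```c
#include <assert.h>
#include <stdint.h>

/* Mask form from the commit message: -(TYPE)(x >= y) is all ones when
   x >= y and zero otherwise, so the difference survives only when the
   subtraction does not underflow.  */
static uint8_t
sat_sub_u8_mask (uint8_t x, uint8_t y)
{
  return (uint8_t) ((x - y) & (-(uint8_t) (x >= y)));
}

/* Branch form with gt (match.pd case 1).  */
static uint8_t
sat_sub_u8_gt (uint8_t x, uint8_t y)
{
  return x > y ? (uint8_t) (x - y) : 0;
}

/* Branch form with ge (match.pd case 2).  */
static uint8_t
sat_sub_u8_ge (uint8_t x, uint8_t y)
{
  return x >= y ? (uint8_t) (x - y) : 0;
}

/* Hypothetical helper: return 1 if all three forms agree on every
   pair of uint8_t inputs, 0 otherwise.  */
static int
sat_sub_u8_forms_agree (void)
{
  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      {
        uint8_t m = sat_sub_u8_mask ((uint8_t) x, (uint8_t) y);
        if (m != sat_sub_u8_gt ((uint8_t) x, (uint8_t) y)
            || m != sat_sub_u8_ge ((uint8_t) x, (uint8_t) y))
          return 0;
      }
  return 1;
}
```

Calling sat_sub_u8_forms_agree () from a test driver checks that the gt and ge forms differ only in how they reach the x == y case, where both yield 0.
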
Signed-off-by: Pan Li
---
 gcc/internal-fn.def       |  1 +
 gcc/match.pd              | 14 ++++++++
 gcc/optabs.def            |  4 +--
 gcc/tree-ssa-math-opts.cc | 67 +++++++++++++++++++++++++++------------
 4 files changed, 64 insertions(+), 22 deletions(-)

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 25badbb86e5..24539716e5b 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -276,6 +276,7 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
                               smulhrs, umulhrs, binary)
 
 DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_SUB, ECF_CONST, first, sssub, ussub, binary)
 
 DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
 DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
diff --git a/gcc/match.pd b/gcc/match.pd
index 024e3350465..3e334533ff8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3086,6 +3086,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (match (unsigned_integer_sat_add @0 @1)
  (bit_ior:c (usadd_left_part_2 @0 @1) (usadd_right_part_2 @0 @1)))
 
+/* Unsigned saturation sub, case 1 (branch with gt):
+   SAT_U_SUB = X > Y ? X - Y : 0  */
+(match (unsigned_integer_sat_sub @0 @1)
+ (cond (gt @0 @1) (minus @0 @1) integer_zerop)
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+      && types_match (type, @0, @1))))
+
+/* Unsigned saturation sub, case 2 (branch with ge):
+   SAT_U_SUB = X >= Y ? X - Y : 0.  */
+(match (unsigned_integer_sat_sub @0 @1)
+ (cond (ge @0 @1) (minus @0 @1) integer_zerop)
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+      && types_match (type, @0, @1))))
+
 /* x >  y  &&  x != XXX_MIN  -->  x > y
    x >  y  &&  x == XXX_MIN  -->  false . */
 (for eqne (eq ne)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 3f2cb46aff8..bc2611abdc2 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -118,8 +118,8 @@ OPTAB_NX(sub_optab, "sub$F$a3")
 OPTAB_NX(sub_optab, "sub$Q$a3")
 OPTAB_VL(subv_optab, "subv$I$a3", MINUS, "sub", '3', gen_intv_fp_libfunc)
 OPTAB_VX(subv_optab, "sub$F$a3")
-OPTAB_NL(sssub_optab, "sssub$Q$a3", SS_MINUS, "sssub", '3', gen_signed_fixed_libfunc)
-OPTAB_NL(ussub_optab, "ussub$Q$a3", US_MINUS, "ussub", '3', gen_unsigned_fixed_libfunc)
+OPTAB_NL(sssub_optab, "sssub$a3", SS_MINUS, "sssub", '3', gen_signed_fixed_libfunc)
+OPTAB_NL(ussub_optab, "ussub$a3", US_MINUS, "ussub", '3', gen_unsigned_fixed_libfunc)
 OPTAB_NL(smul_optab, "mul$Q$a3", MULT, "mul", '3', gen_int_fp_fixed_libfunc)
 OPTAB_NX(smul_optab, "mul$P$a3")
 OPTAB_NX(smul_optab, "mul$F$a3")
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 62da1c5ee08..4717302b728 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -4087,33 +4087,56 @@ arith_overflow_check_p (gimple *stmt, gimple *cast_stmt, gimple *&use_stmt,
 }
 
 extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
+extern bool gimple_unsigned_integer_sat_sub (tree, tree*, tree (*)(tree));
+
+static void
+build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, internal_fn fn,
+                                    tree lhs, tree op_0, tree op_1)
+{
+  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
+    {
+      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
+      gimple_call_set_lhs (call, lhs);
+      gsi_replace (gsi, call, /* update_eh_info */ true);
+    }
+}
 
 /*
- * Try to match saturation arith pattern(s).
- *   1. SAT_ADD (unsigned)
- *     _7 = _4 + _6;
- *     _8 = _4 > _7;
- *     _9 = (long unsigned int) _8;
- *     _10 = -_9;
- *     _12 = _7 | _10;
- *     =>
- *     _12 = .SAT_ADD (_4, _6);  */
+ * Try to match saturation unsigned add.
+ *   _7 = _4 + _6;
+ *   _8 = _4 > _7;
+ *   _9 = (long unsigned int) _8;
+ *   _10 = -_9;
+ *   _12 = _7 | _10;
+ *   =>
+ *   _12 = .SAT_ADD (_4, _6);  */
+
 static void
-match_saturation_arith (gimple_stmt_iterator *gsi, gassign *stmt)
+match_unsigned_saturation_add (gimple_stmt_iterator *gsi, gassign *stmt)
 {
-  gcall *call = NULL;
+  tree ops[2];
+  tree lhs = gimple_assign_lhs (stmt);
 
+  if (gimple_unsigned_integer_sat_add (lhs, ops, NULL))
+    build_saturation_binary_arith_call (gsi, IFN_SAT_ADD, lhs, ops[0], ops[1]);
+}
+
+/*
+ * Try to match saturation unsigned sub.
+ *   _1 = _4 >= _5;
+ *   _3 = _4 - _5;
+ *   _6 = _1 ? _3 : 0;
+ *   =>
+ *   _6 = .SAT_SUB (_4, _5);  */
+
+static void
+match_unsigned_saturation_sub (gimple_stmt_iterator *gsi, gassign *stmt)
+{
   tree ops[2];
   tree lhs = gimple_assign_lhs (stmt);
 
-  if (gimple_unsigned_integer_sat_add (lhs, ops, NULL)
-      && direct_internal_fn_supported_p (IFN_SAT_ADD, TREE_TYPE (lhs),
-                                         OPTIMIZE_FOR_BOTH))
-    {
-      call = gimple_build_call_internal (IFN_SAT_ADD, 2, ops[0], ops[1]);
-      gimple_call_set_lhs (call, lhs);
-      gsi_replace (gsi, call, true);
-    }
+  if (gimple_unsigned_integer_sat_sub (lhs, ops, NULL))
+    build_saturation_binary_arith_call (gsi, IFN_SAT_SUB, lhs, ops[0], ops[1]);
 }
 
 /* Recognize for unsigned x
@@ -6078,7 +6101,7 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
               break;
 
             case BIT_IOR_EXPR:
-              match_saturation_arith (&gsi, as_a<gassign *> (stmt));
+              match_unsigned_saturation_add (&gsi, as_a<gassign *> (stmt));
               /* fall-through  */
             case BIT_XOR_EXPR:
               match_uaddc_usubc (&gsi, stmt, code);
@@ -6089,6 +6112,10 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
               match_single_bit_test (&gsi, stmt);
               break;
 
+            case COND_EXPR:
+              match_unsigned_saturation_sub (&gsi, as_a<gassign *> (stmt));
+              break;
+
             default:;
             }
         }