From patchwork Sun Apr 7 07:03:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 1920475 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=Jm4fg6h9; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VC39d4YL3z1yYj for ; Sun, 7 Apr 2024 17:04:05 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CFE833858C32 for ; Sun, 7 Apr 2024 07:04:03 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) by sourceware.org (Postfix) with ESMTPS id ABB5E3858C32 for ; Sun, 7 Apr 2024 07:03:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ABB5E3858C32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org ABB5E3858C32 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712473424; cv=none; b=KkjcMKwuEDnG6Hb1GdN1zP12xIlslfwbAfzUMpbZrWJQOPbmzaEUZOuww1inJtlYm+CKrsyZ5lbMrWkPACoWZ6er+qoKPkNlLTpUQ9gd1MfD77SoFQG/Kfqua9SLvHYs/ffDkRljaQRq7/SUewU/yqXTU0Mi/VVIsIoIzUGS7e8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712473424; c=relaxed/simple; bh=aUBfPXGd35LQt6JKWASVgcMdzJafwYGWtzyzlikd0n4=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=W4tpn0oDo6UFFhdKM97mmifEeZ3+yMhaRyS3Hrwx4LXGU7N2XwlgWWFVPxCh4RXH8aHDoqwjwu3ktC61mfAHgw4Lm1Xl89fcBXalNL7nytDHUAUuTnuVTr15kdgoLXL1FGWqn1OBd66HIzuA1X5kejiz+vbYg31IARf3KjmRdb8= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1712473417; x=1744009417; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aUBfPXGd35LQt6JKWASVgcMdzJafwYGWtzyzlikd0n4=; b=Jm4fg6h9wNNCLrKqioIYnRCpGfj/jen+hwzikxK3uIkbZG8O8+e6K8kY lAmH1f6Ds2k9g93up3QYZoEkWjm0AzrSdHUoXFPDa6wFflg9eDf4O3aFf IsDRQnfBSGLhTxAujPQIt9yXWnt96QOQt8YNmAHQCYuy3JbmWiLeZhhOj U59Nr1NSR6Wkyilz310eUxVW9WNlodC9pigK+NxJm9L74r4eKqqFFxVVv 8gCXL1T7TQDv7FBmoDzXq+B4lO/pH9xXZiF94cB4IqcyntHfQ8NhunmmI A0AjpgD5ewEhOhsXeoW5w+8mSuG2afp7K2UHbfkl42vuXV0yTRsj6vw3t w==; X-CSE-ConnectionGUID: PlS/IA5XQmWIn3R6+NF0Ww== X-CSE-MsgGUID: Zpa3rhmiTpK+J9Ab4eJEuA== X-IronPort-AV: E=McAfee;i="6600,9927,11036"; a="18902418" X-IronPort-AV: E=Sophos;i="6.07,184,1708416000"; d="scan'208";a="18902418" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Apr 2024 00:03:35 -0700 X-CSE-ConnectionGUID: 5d5DWAXRSYerlK5sMKnpuw== X-CSE-MsgGUID: neuDPtzwR7qdl9vfPlHP+g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,184,1708416000"; d="scan'208";a="42740226" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmviesa002.fm.intel.com with ESMTP; 07 Apr 2024 00:03:27 -0700 Received: from pli-ubuntu.sh.intel.com (pli-ubuntu.sh.intel.com [10.239.159.47]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 2658E1006FFA; Sun, 7 Apr 2024 15:03:26 +0800 (CST) From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, yanzhang.wang@intel.com, tamar.christina@arm.com, richard.guenther@gmail.com, hongtao.liu@intel.com, Pan Li Subject: [PATCH v2] Internal-fn: Introduce new internal function SAT_ADD Date: Sun, 7 Apr 2024 15:03:24 +0800 Message-Id: <20240407070324.3829999-1-pan2.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240406120755.2692291-1-pan2.li@intel.com> References: <20240406120755.2692291-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_LOW, SCC_10_SHORT_WORD_LINES, SCC_20_SHORT_WORD_LINES, SCC_35_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_NONE, TXREP, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org From: Pan Li Update in v2: * Fix one failure for x86 bootstrap. Original log: This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: * SAT_ADD (1, 254) => 255. * SAT_ADD (1, 255) => 255. * SAT_ADD (2, 255) => 255. * SAT_ADD (255, 255) => 255. The patch also implement the SAT_ADD in the riscv backend as the sample for both the scalar and vector. Given below example: uint64_t sat_add_u64 (uint64_t x, uint64_t y) { return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x)); } Before this patch: uint64_t sat_add_uint64_t (uint64_t x, uint64_t y) { long unsigned int _1; _Bool _2; long unsigned int _3; long unsigned int _4; uint64_t _7; long unsigned int _10; __complex__ long unsigned int _11; ;; basic block 2, loop depth 0 ;; pred: ENTRY _11 = .ADD_OVERFLOW (x_5(D), y_6(D)); _1 = REALPART_EXPR <_11>; _10 = IMAGPART_EXPR <_11>; _2 = _10 != 0; _3 = (long unsigned int) _2; _4 = -_3; _7 = _1 | _4; return _7; ;; succ: EXIT } After this patch: uint64_t sat_add_uint64_t (uint64_t x, uint64_t y) { uint64_t _7; ;; basic block 2, loop depth 0 ;; pred: ENTRY _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call] return _7; ;; succ: EXIT } For vectorize, we leverage the existing vect pattern recog to find the pattern similar to scalar and let the vectorizer to perform the rest part for standard name usadd3 in vector mode. The riscv vector backend have insn "Vector Single-Width Saturating Add and Subtract" which can be leveraged when expand the usadd3 in vector mode. For example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i; for (i = 0; i < n; i++) out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i])); } Before this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]); ivtmp_58 = _80 * 8; vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0); vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0); vect__7.11_66 = vect__4.7_61 + vect__6.10_65; mask__8.12_67 = vect__4.7_61 > vect__7.11_66; vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66); .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72); vectp_x.5_60 = vectp_x.5_59 + ivtmp_58; vectp_y.8_64 = vectp_y.8_63 + ivtmp_58; vectp_out.16_75 = vectp_out.16_74 + ivtmp_58; ivtmp_79 = ivtmp_78 - _80; ... } vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v0,0(a1) vle64.v v1,0(a2) slli a4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vadd.vv v1,v0,v1 vmsgtu.vv v0,v0,v1 vmerge.vim v1,v1,-1,v0 vse64.v v1,0(a0) ... After this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]); ivtmp_46 = _62 * 8; vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0); vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0); vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53); .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54); ... } vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v1,0(a1) vle64.v v2,0(a2) slli a4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vsaddu.vv v1,v1,v2 vse64.v v1,0(a0) ... To limit the patch size for review, only unsigned version of usadd3 are involved here. The signed version will be covered in the underlying patch(es). The below test suites are passed for this patch. * The riscv fully regression tests. * The aarch64 fully regression tests. * The x86 bootstrap tests. * The x86 fully regression tests. PR target/51492 PR target/112600 gcc/ChangeLog: * config/riscv/autovec.md (usadd3): New pattern expand for unsigned SAT_ADD vector. * config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl to expand usadd3 pattern. (expand_vec_usadd): Ditto but for vector. * config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit the vsadd insn. (expand_vec_usadd): New func impl to expand usadd3 for vector. * config/riscv/riscv.cc (riscv_expand_usadd): New func impl to expand usadd3 for scalar. * config/riscv/riscv.md (usadd3): New pattern expand for unsigned SAT_ADD scalar. * config/riscv/vector.md: Allow VLS mode for vsaddu. * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD. * internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD. * match.pd: Add unsigned SAT_ADD match and simply. * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd. * tree-vect-patterns.cc (vect_sat_add_build_call): New func impl to build the IFN_SAT_ADD gimple call. (vect_recog_sat_add_pattern): New func impl to recog the pattern for unsigned SAT_ADD. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c: New test. * gcc.target/riscv/sat_arith.h: New test. * gcc.target/riscv/sat_u_add-1.c: New test. * gcc.target/riscv/sat_u_add-2.c: New test. * gcc.target/riscv/sat_u_add-3.c: New test. * gcc.target/riscv/sat_u_add-4.c: New test. * gcc.target/riscv/sat_u_add-run-1.c: New test. * gcc.target/riscv/sat_u_add-run-10.c: New test. * gcc.target/riscv/sat_u_add-run-11.c: New test. * gcc.target/riscv/sat_u_add-run-12.c: New test. * gcc.target/riscv/sat_u_add-run-2.c: New test. * gcc.target/riscv/sat_u_add-run-3.c: New test. * gcc.target/riscv/sat_u_add-run-4.c: New test. * gcc.target/riscv/sat_u_add-run-5.c: New test. * gcc.target/riscv/sat_u_add-run-6.c: New test. * gcc.target/riscv/sat_u_add-run-7.c: New test. * gcc.target/riscv/sat_u_add-run-8.c: New test. * gcc.target/riscv/sat_u_add-run-9.c: New test. * gcc.target/riscv/scalar_sat_binary.h: New test. Signed-off-by: Pan Li Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 17 ++++ gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv-v.cc | 16 ++++ gcc/config/riscv/riscv.cc | 47 +++++++++++ gcc/config/riscv/riscv.md | 11 +++ gcc/config/riscv/vector.md | 12 +-- gcc/internal-fn.cc | 1 + gcc/internal-fn.def | 3 + gcc/match.pd | 64 +++++++++++++++ gcc/optabs.def | 4 +- .../riscv/rvv/autovec/binop/vec_sat_binary.h | 33 ++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-1.c | 41 ++++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-2.c | 44 +++++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-3.c | 44 +++++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-4.c | 44 +++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-1.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-10.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-11.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-12.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-2.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-3.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-4.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-5.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-6.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-7.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-8.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-9.c | 75 ++++++++++++++++++ gcc/testsuite/gcc.target/riscv/sat_arith.h | 79 +++++++++++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-1.c | 44 +++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-2.c | 50 ++++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-3.c | 41 ++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-4.c | 38 +++++++++ .../gcc.target/riscv/sat_u_add-run-1.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-10.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-11.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-12.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-2.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-3.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-4.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-5.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-6.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-7.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-8.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-9.c | 25 ++++++ .../gcc.target/riscv/scalar_sat_binary.h | 27 +++++++ gcc/tree-vect-patterns.cc | 62 +++++++++++++++ 46 files changed, 1916 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_arith.h create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c create mode 100644 gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index 3b32369f68c..06a4c34863e 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2612,3 +2612,20 @@ (define_expand "rawmemchr" DONE; } ) + +;; ------------------------------------------------------------------------- +;; ---- [INT] Saturation ALU. +;; ------------------------------------------------------------------------- +;; Includes: +;; - add +;; ------------------------------------------------------------------------- +(define_expand "usadd3" + [(match_operand:V_VLSI 0 "register_operand") + (match_operand:V_VLSI 1 "register_operand") + (match_operand:V_VLSI 2 "register_operand")] + "TARGET_VECTOR" + { + riscv_vector::expand_vec_usadd (operands[0], operands[1], operands[2], mode); + DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index b8735593805..fefd9a1c2c4 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -132,6 +132,7 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); extern bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); +extern void riscv_expand_usadd (rtx, rtx, rtx); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); @@ -619,6 +620,7 @@ void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lround (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode); void expand_vec_lfloor (rtx, rtx, machine_mode, machine_mode); +void expand_vec_usadd (rtx, rtx, rtx, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx), enum avl_type); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 814c5febabe..eadbc63431b 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4635,6 +4635,16 @@ emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask, } } +static void +emit_vec_saddu (rtx op_dest, rtx op_1, rtx op_2, insn_type type, + machine_mode vec_mode) +{ + rtx ops[] = {op_dest, op_1, op_2}; + insn_code icode = code_for_pred (US_PLUS, vec_mode); + + emit_vlmax_insn (icode, type, ops); +} + void expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode, machine_mode vec_int_mode) @@ -4862,6 +4872,12 @@ expand_vec_lfloor (rtx op_0, rtx op_1, machine_mode vec_fp_mode, vec_int_mode); } +void +expand_vec_usadd (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) +{ + emit_vec_saddu (op_0, op_1, op_2, BINARY_OP, vec_mode); +} + /* Vectorize popcount by the Wilkes-Wheeler-Gill algorithm that libgcc uses as well. */ void diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index fe9976bfffe..519a7684cc4 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -10840,6 +10840,53 @@ riscv_vector_mode_supported_any_target_p (machine_mode) return true; } +/* Implements the unsigned saturation add standard name usadd for int mode. */ + +void +riscv_expand_usadd (rtx dest, rtx x, rtx y) +{ + machine_mode mode = GET_MODE (dest); + rtx xmode_sum = gen_reg_rtx (Xmode); + rtx xmode_lt = gen_reg_rtx (Xmode); + rtx xmode_x = gen_lowpart (Xmode, x); + rtx xmode_y = gen_lowpart (Xmode, y); + rtx xmode_dest = gen_reg_rtx (Xmode); + + /* Step-1: sum = x + y */ + if (mode == SImode && mode != Xmode) + { /* Take addw to avoid the sum truncate. */ + rtx simode_sum = gen_reg_rtx (SImode); + riscv_emit_binary (PLUS, simode_sum, x, y); + emit_move_insn (xmode_sum, gen_lowpart (Xmode, simode_sum)); + } + else + riscv_emit_binary (PLUS, xmode_sum, xmode_x, xmode_y); + + /* Step-1.1: truncate sum for HI and QI as we have no insn for add QI/HI. */ + if (mode == HImode || mode == QImode) + { + int shift_bits = GET_MODE_BITSIZE (Xmode) + - GET_MODE_BITSIZE (mode).to_constant (); + + gcc_assert (shift_bits > 0); + + riscv_emit_binary (ASHIFT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + riscv_emit_binary (LSHIFTRT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + } + + /* Step-2: lt = sum < x */ + riscv_emit_binary (LTU, xmode_lt, xmode_sum, xmode_x); + + /* Step-3: lt = -lt */ + riscv_emit_unary (NEG, xmode_lt, xmode_lt); + + /* Step-4: xmode_dest = sum | lt */ + riscv_emit_binary (IOR, xmode_dest, xmode_lt, xmode_sum); + + /* Step-5: dest = xmode_dest */ + emit_move_insn (dest, gen_lowpart (mode, xmode_dest)); +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 0346cc3859d..28d26579c3a 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -3839,6 +3839,17 @@ (define_insn "*large_load_address" [(set_attr "type" "load") (set (attr "length") (const_int 8))]) +(define_expand "usadd3" + [(match_operand:ANYI 0 "register_operand") + (match_operand:ANYI 1 "register_operand") + (match_operand:ANYI 2 "register_operand")] + "" + { + riscv_expand_usadd (operands[0], operands[1], operands[2]); + DONE; + } +) + (include "bitmanip.md") (include "crypto.md") (include "sync.md") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 8b1c24c5d79..58abc2a2f9e 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -4073,8 +4073,8 @@ (define_insn "@pred_trunc" ;; Saturating Add and Subtract (define_insn "@pred_" - [(set (match_operand:VI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") - (if_then_else:VI + [(set (match_operand:V_VLSI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") + (if_then_else:V_VLSI (unspec: [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1, vm, vm,Wc1,Wc1") (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") @@ -4083,10 +4083,10 @@ (define_insn "@pred_" (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (any_sat_int_binop:VI - (match_operand:VI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") - (match_operand:VI 4 "" "")) - (match_operand:VI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (any_sat_int_binop:V_VLSI + (match_operand:V_VLSI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") + (match_operand:V_VLSI 4 "" "")) + (match_operand:V_VLSI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "@ v.vv\t%0,%3,%4%p1 diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 5269f0ac528..e517ea7fbfb 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -4181,6 +4181,7 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_SAT_ADD: case IFN_VEC_WIDEN_PLUS: case IFN_VEC_WIDEN_PLUS_LO: case IFN_VEC_WIDEN_PLUS_HI: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 848bb9dbff3..47326b7033c 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -275,6 +275,9 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | ECF_NOTHROW, first, DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first, smulhrs, umulhrs, binary) +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST | ECF_NOTHROW, first, + ssadd, usadd, binary) + DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary) DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary) DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary) diff --git a/gcc/match.pd b/gcc/match.pd index 15a1e7350d4..6f8cdf074ed 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3043,6 +3043,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || POINTER_TYPE_P (itype)) && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype)))))) +/* Unsigned Saturation Add */ +(match (usadd_left_part_1 @0 @1) + (plus:c @0 @1) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_right_part_1 @0 @1) + (negate (convert (lt (plus:c @0 @1) @0))) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_right_part_2 @0 @1) + (negate (convert (gt @0 (plus:c @0 @1)))) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +/* Unsigned saturation add. Case 1 (branchless): + SAT_U_ADD = (X + Y) | - ((X + Y) < X) or + SAT_U_ADD = (X + Y) | - (X > (X + Y)). */ +(simplify + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_1 @0 @1)) + (if (optimize) (IFN_SAT_ADD @0 @1))) +(simplify + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_2 @0 @1)) + (if (optimize) (IFN_SAT_ADD @0 @1))) + +/* Unsigned saturation add. Case 2 (branch): + SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1 or + SAT_U_ADD = x <= (X + Y) ? (X + Y) : -1. */ +(simplify + (cond (ge (usadd_left_part_1@2 @0 @1) @0) @2 integer_minus_onep) + (if (optimize) (IFN_SAT_ADD @0 @1))) +(simplify + (cond (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep) + (if (optimize) (IFN_SAT_ADD @0 @1))) + +/* Vect recog pattern will leverage unsigned_integer_sat_add. */ +(match (unsigned_integer_sat_add @0 @1) + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_1 @0 @1)) + (if (optimize))) +(match (unsigned_integer_sat_add @0 @1) + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_2 @0 @1)) + (if (optimize))) +(match (unsigned_integer_sat_add @0 @1) + (cond (ge (usadd_left_part_1@2 @0 @1) @0) @2 integer_minus_onep) + (if (optimize))) +(match (unsigned_integer_sat_add @0 @1) + (cond (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep) + (if (optimize))) + /* x > y && x != XXX_MIN --> x > y x > y && x == XXX_MIN --> false . */ (for eqne (eq ne) diff --git a/gcc/optabs.def b/gcc/optabs.def index ad14f9328b9..3f2cb46aff8 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -111,8 +111,8 @@ OPTAB_NX(add_optab, "add$F$a3") OPTAB_NX(add_optab, "add$Q$a3") OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc) OPTAB_VX(addv_optab, "add$F$a3") -OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc) -OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc) +OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc) +OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc) OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc) OPTAB_NX(sub_optab, "sub$F$a3") OPTAB_NX(sub_optab, "sub$Q$a3") diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h new file mode 100644 index 00000000000..0976ae97830 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h @@ -0,0 +1,33 @@ +#ifndef HAVE_DEFINED_VEC_SAT_BINARY +#define HAVE_DEFINED_VEC_SAT_BINARY + +/* To leverage this header files for run test, you need to: + 1. define T as the type, for example uint8_t, + 2. defint N as the test array size, for example 16. + 3. define RUN_VEC_SAT_BINARY as run function. + 4. prepare the test_data for test cases. + */ + +int +main () +{ + unsigned i, k; + T out[N]; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + T *op_1 = test_data[i][0]; + T *op_2 = test_data[i][1]; + T *expect = test_data[i][2]; + + RUN_VEC_SAT_BINARY (T, out, op_1, op_2, N); + + for (k = 0; k < N; k++) + if (out[k] != expect[k]) + __builtin_abort (); + } + + return 0; +} + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c new file mode 100644 index 00000000000..4fb8b233ee9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint8_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint8_t) + +/* +** vec_sat_u_add_uint8_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint8_t) + +/* +** vec_sat_u_add_uint8_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint8_t, 0xffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c new file mode 100644 index 00000000000..10c112b77b0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint16_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint16_t) + +/* +** vec_sat_u_add_uint16_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint16_t) + +/* +** vec_sat_u_add_uint16_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint16_t, 0xffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c new file mode 100644 index 00000000000..281036ea0ee --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint32_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint32_t) + +/* +** vec_sat_u_add_uint32_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint32_t) + +/* +** vec_sat_u_add_uint32_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint32_t, 0xffffffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c new file mode 100644 index 00000000000..f392533f114 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint64_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint64_t) + +/* +** vec_sat_u_add_uint64_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint64_t) + +/* +** vec_sat_u_add_uint64_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint64_t, 0xffffffffffffffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c new file mode 100644 index 00000000000..1dcb333f687 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c new file mode 100644 index 00000000000..5a0e73303cf --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c new file mode 100644 index 00000000000..b3efc9243e7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffffffffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c new file mode 100644 index 00000000000..f478c244ff4 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffffffffffffffffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c new file mode 100644 index 00000000000..dbf01ac863d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c new file mode 100644 index 00000000000..20ad2736403 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c new file mode 100644 index 00000000000..2f31edc527e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c new file mode 100644 index 00000000000..4201b31eb3e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c new file mode 100644 index 00000000000..35ec9ea3455 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c new file mode 100644 index 00000000000..8b1abdb4ba8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c new file mode 100644 index 00000000000..8c72b567590 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c new file mode 100644 index 00000000000..f454f3997ca --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h new file mode 100644 index 00000000000..f233c74acfd --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h @@ -0,0 +1,79 @@ +#ifndef HAVE_SAT_ARITH +#define HAVE_SAT_ARITH + +#include + +#define DEF_SAT_U_ADD_FMT_1(T) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_1 (T x, T y) \ +{ \ + return (x + y) | (-(T)((T)(x + y) < x)); \ +} + +#define DEF_SAT_U_ADD_FMT_2(T) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_2 (T x, T y) \ +{ \ + return (T)(x + y) >= x ? (x + y) : -1; \ +} + +#define DEF_SAT_U_ADD_FMT_3(T, MAX) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_3 (T x, T y) \ +{ \ + return (T)(x + y) >= x ? (x + y) : MAX; \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_1(T) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (x + y) | (-(T)((T)(x + y) < x)); \ + } \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_2(T) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (T)(x + y) >= x ? (x + y) : -1; \ + } \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_3(T, MAX) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_3 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (T)(x + y) >= x ? (x + y) : MAX; \ + } \ +} + +#define RUN_SAT_U_ADD_FMT_1(T, x, y) sat_u_add_##T##_fmt_1(x, y) +#define RUN_SAT_U_ADD_FMT_2(T, x, y) sat_u_add_##T##_fmt_2(x, y) +#define RUN_SAT_U_ADD_FMT_3(T, x, y) sat_u_add_##T##_fmt_3(x, y) + +#define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N) + +#define RUN_VEC_SAT_U_ADD_FMT_2(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_2(out, op_1, op_2, N) + +#define RUN_VEC_SAT_U_ADD_FMT_3(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_3(out, op_1, op_2, N) + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c new file mode 100644 index 00000000000..b348d93f938 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint8_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint8_t) + +/* +** sat_u_add_uint8_t_fmt_2: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint8_t) + +/* +** sat_u_add_uint8_t_fmt_3: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint8_t, 0xffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c new file mode 100644 index 00000000000..df54b984110 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint16_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint16_t) + +/* +** sat_u_add_uint16_t_fmt_2: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint16_t) + +/* +** sat_u_add_uint16_t_fmt_3: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint16_t, 0xffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c new file mode 100644 index 00000000000..6ff2e6ac52b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint32_t_fmt_1: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint32_t) + +/* +** sat_u_add_uint32_t_fmt_2: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint32_t) + +/* +** sat_u_add_uint32_t_fmt_3: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint32_t, 0xffffffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c new file mode 100644 index 00000000000..1585f9a231f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint64_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint64_t) + +/* +** sat_u_add_uint64_t_fmt_2: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint64_t) + +/* +** sat_u_add_uint64_t_fmt_3: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint64_t, 0xffffffffffffffff) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c new file mode 100644 index 00000000000..f1972490006 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c new file mode 100644 index 00000000000..0b675666dc0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c new file mode 100644 index 00000000000..ac9809e47c0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffffffffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c new file mode 100644 index 00000000000..110e9b14d7e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffffffffffffffffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c new file mode 100644 index 00000000000..cb3879d0cde --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c new file mode 100644 index 00000000000..c9a6080ca3b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c new file mode 100644 index 00000000000..c19b7e22387 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c new file mode 100644 index 00000000000..508531c09d7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c new file mode 100644 index 00000000000..99b5c3a39f0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c new file mode 100644 index 00000000000..13f59548935 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c new file mode 100644 index 00000000000..cdbea7b1b2c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c new file mode 100644 index 00000000000..04a857aed58 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h new file mode 100644 index 00000000000..cbb2d750107 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h @@ -0,0 +1,27 @@ +#ifndef HAVE_DEFINED_SCALAR_SAT_BINARY +#define HAVE_DEFINED_SCALAR_SAT_BINARY + +/* To leverage this header files for run test, you need to: + 1. define T as the type, for example uint8_t, + 2. define RUN_SAT_BINARY as run function. + 3. prepare the test_data for test cases. + */ + +int +main () +{ + unsigned i; + T *d; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + d = test_data[i]; + + if (RUN_SAT_BINARY (T, d[0], d[1]) != d[2]) + __builtin_abort (); + } + + return 0; +} + +#endif diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 4f491c6b833..44ca182cfa3 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -4498,6 +4498,67 @@ vect_recog_mult_pattern (vec_info *vinfo, return pattern_stmt; } +static gimple * +vect_sat_add_build_call (vec_info *vinfo, gimple *last_stmt, tree *type_out, + tree op_0, tree op_1) +{ + tree itype = TREE_TYPE (op_0); + tree vtype = get_vectype_for_scalar_type (vinfo, itype); + + if (vtype == NULL_TREE) + return NULL; + + if (!direct_internal_fn_supported_p (IFN_SAT_ADD, vtype, OPTIMIZE_FOR_SPEED)) + return NULL; + + *type_out = vtype; + + gcall *call = gimple_build_call_internal (IFN_SAT_ADD, 2, op_0, op_1); + gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL)); + gimple_call_set_nothrow (call, /* nothrow_p */ true); + gimple_set_location (call, gimple_location (last_stmt)); + + vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt); + + return call; +} + +/* + * Try to detect saturation add pattern (SAT_ADD), aka below gimple: + * _7 = _4 + _6; + * _8 = _4 > _7; + * _9 = (long unsigned int) _8; + * _10 = -_9; + * _12 = _7 | _10; + * + * And then simplied to + * _12 = .SAT_ADD (_4, _6); + */ +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree)); + +static gimple * +vect_recog_sat_add_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo, + tree *type_out) +{ + gimple *last_stmt = stmt_vinfo->stmt; + + if (!is_gimple_assign (last_stmt)) + return NULL; + + tree res_ops[2]; + tree lhs = gimple_assign_lhs (last_stmt); + + if (gimple_unsigned_integer_sat_add (lhs, res_ops, NULL)) + { + gimple *call = vect_sat_add_build_call (vinfo, last_stmt, type_out, + res_ops[0], res_ops[1]); + if (call) + return call; + } + + return NULL; +} + /* Detect a signed division by a constant that wouldn't be otherwise vectorized: @@ -6998,6 +7059,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_vector_vector_shift_pattern, "vector_vector_shift" }, { vect_recog_divmod_pattern, "divmod" }, { vect_recog_mult_pattern, "mult" }, + { vect_recog_sat_add_pattern, "sat_add" }, { vect_recog_mixed_size_cond_pattern, "mixed_size_cond" }, { vect_recog_gcond_pattern, "gcond" }, { vect_recog_bool_pattern, "bool" },