From patchwork Sat Apr 6 12:07:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 1920391 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=StF/qGNA; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VBYzW318Vz1yYf for ; Sat, 6 Apr 2024 23:08:39 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 89B023858C31 for ; Sat, 6 Apr 2024 12:08:37 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) by sourceware.org (Postfix) with ESMTPS id 1D5443858D33 for ; Sat, 6 Apr 2024 12:08:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1D5443858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1D5443858D33 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.9 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712405294; cv=none; b=ttGfe8BK1PPakluZD59WDz1PT9x9Hh44JxwP+zXzQbd6MlM0VCmmry9XCqq3Q5Al+Sh/RyKMyYVPPgAhcELXlqTshNlUZqUVEMVmbWiRWt5xfTwbSYVvlpohoG4VbVQiLS9HK/3SJ7jew55TTk7x0B7X/zNHm7tTwPB+ER41Z+w= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712405294; c=relaxed/simple; bh=YoChgrVBGk6FsIkpdWdzMtgWgkSrLq3AvwBul009tWs=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=LPgEnvgvh+AS1FlOLFRv8/oCG3q56gknVedz2XY7OECnnyU1hRrLpzGWBif/3MgcuRzvTZZmBTnQ5zgTlPB1cUBco2IV5G1FSrMiVQeYLFgPzNYiWrhFB7AWkbXfK30ViLFHsHC49TbNleBa7I91HuKYEOEBXlkBXB8ViG4PrdE= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1712405288; x=1743941288; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=YoChgrVBGk6FsIkpdWdzMtgWgkSrLq3AvwBul009tWs=; b=StF/qGNAIZ49S0o2PYeomySKxhcTot+XufQMDlIY3ygv92lkSdtv4OYg wdkzqB8+ct/iXzCTUnOAPHYVyPWChpQ5Ri3e5pxL9xxTlT5r0tQvqtPRP sVAOpObTuUfiE2VO0Dg1hEXsYSY4PlzVzFowjBGYJ8oDo2HTDUiNtmbqc Dkiy/hxLZRDJwTaNL/eypsjqZpMIMI6RtoVrcUc3GOaqfYdcLQrnZoUeH 12Mxm0dCqW1OKUMY79enW/5+QrIWenBABsMS7B+sDFMHxu4Jf0T6ckuCa OtzeZvwwG6Sd5lWFJY0NWu3hS3qkXviepb4AM9HUz/yp79Wej+AfF+E9j g==; X-CSE-ConnectionGUID: C8mg1+6SStKksziOSpW1gw== X-CSE-MsgGUID: wlj6gdgnTB2zJTXSkZFniA== X-IronPort-AV: E=McAfee;i="6600,9927,11035"; a="18461762" X-IronPort-AV: E=Sophos;i="6.07,183,1708416000"; d="scan'208";a="18461762" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Apr 2024 05:08:07 -0700 X-CSE-ConnectionGUID: /bRHB2/kSqqVm3jl0BxgeQ== X-CSE-MsgGUID: AkDv5VNmRDG+ykqOowskAw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,183,1708416000"; d="scan'208";a="24052007" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmviesa004.fm.intel.com with ESMTP; 06 Apr 2024 05:07:58 -0700 Received: from pli-ubuntu.sh.intel.com (pli-ubuntu.sh.intel.com [10.239.159.47]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 9E0461005693; Sat, 6 Apr 2024 20:07:57 +0800 (CST) From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, yanzhang.wang@intel.com, tamar.christina@arm.com, richard.guenther@gmail.com, hongtao.liu@intel.com, Pan Li Subject: [PATCH v1] Internal-fn: Introduce new internal function SAT_ADD Date: Sat, 6 Apr 2024 20:07:55 +0800 Message-Id: <20240406120755.2692291-1-pan2.li@intel.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 X-Spam-Status: No, score=-9.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_LOW, SCC_10_SHORT_WORD_LINES, SCC_20_SHORT_WORD_LINES, SCC_35_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org From: Pan Li This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: * SAT_ADD (1, 254) => 255. * SAT_ADD (1, 255) => 255. * SAT_ADD (2, 255) => 255. * SAT_ADD (255, 255) => 255. The patch also implement the SAT_ADD in the riscv backend as the sample for both the scalar and vector. Given below example: uint64_t sat_add_u64 (uint64_t x, uint64_t y) { return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x)); } Before this patch: uint64_t sat_add_uint64_t (uint64_t x, uint64_t y) { long unsigned int _1; _Bool _2; long unsigned int _3; long unsigned int _4; uint64_t _7; long unsigned int _10; __complex__ long unsigned int _11; ;; basic block 2, loop depth 0 ;; pred: ENTRY _11 = .ADD_OVERFLOW (x_5(D), y_6(D)); _1 = REALPART_EXPR <_11>; _10 = IMAGPART_EXPR <_11>; _2 = _10 != 0; _3 = (long unsigned int) _2; _4 = -_3; _7 = _1 | _4; return _7; ;; succ: EXIT } After this patch: uint64_t sat_add_uint64_t (uint64_t x, uint64_t y) { uint64_t _7; ;; basic block 2, loop depth 0 ;; pred: ENTRY _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call] return _7; ;; succ: EXIT } For vectorize, we leverage the existing vect pattern recog to find the pattern similar to scalar and let the vectorizer to perform the rest part for standard name usadd3 in vector mode. The riscv vector backend have insn "Vector Single-Width Saturating Add and Subtract" which can be leveraged when expand the usadd3 in vector mode. For example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i; for (i = 0; i < n; i++) out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i])); } Before this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]); ivtmp_58 = _80 * 8; vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0); vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0); vect__7.11_66 = vect__4.7_61 + vect__6.10_65; mask__8.12_67 = vect__4.7_61 > vect__7.11_66; vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66); .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72); vectp_x.5_60 = vectp_x.5_59 + ivtmp_58; vectp_y.8_64 = vectp_y.8_63 + ivtmp_58; vectp_out.16_75 = vectp_out.16_74 + ivtmp_58; ivtmp_79 = ivtmp_78 - _80; ... } vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v0,0(a1) vle64.v v1,0(a2) slli a4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vadd.vv v1,v0,v1 vmsgtu.vv v0,v0,v1 vmerge.vim v1,v1,-1,v0 vse64.v v1,0(a0) ... After this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]); ivtmp_46 = _62 * 8; vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0); vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0); vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53); .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54); ... } vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v1,0(a1) vle64.v v2,0(a2) slli a4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vsaddu.vv v1,v1,v2 vse64.v v1,0(a0) ... To limit the patch size for review, only unsigned version of usadd3 are involved here. The signed version will be covered in the underlying patch(es). The below test suites are passed for this patch. * The riscv fully regression tests. * The aarch64 fully regression tests. * The x86 fully regression tests. PR target/51492 PR target/112600 gcc/ChangeLog: * config/riscv/autovec.md (usadd3): New pattern expand for unsigned SAT_ADD vector. * config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl to expand usadd3 pattern. (expand_vec_usadd): Ditto but for vector. * config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit the vsadd insn. (expand_vec_usadd): New func impl to expand usadd3 for vector. * config/riscv/riscv.cc (riscv_expand_usadd): New func impl to expand usadd3 for scalar. * config/riscv/riscv.md (usadd3): New pattern expand for unsigned SAT_ADD scalar. * config/riscv/vector.md: Allow VLS mode for vsaddu. * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD. * internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD. * match.pd: Add unsigned SAT_ADD match and simply. * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd. * tree-vect-patterns.cc (vect_sat_add_build_call): New func impl to build the IFN_SAT_ADD gimple call. (vect_recog_sat_add_pattern): New func impl to recog the pattern for unsigned SAT_ADD. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c: New test. * gcc.target/riscv/sat_arith.h: New test. * gcc.target/riscv/sat_u_add-1.c: New test. * gcc.target/riscv/sat_u_add-2.c: New test. * gcc.target/riscv/sat_u_add-3.c: New test. * gcc.target/riscv/sat_u_add-4.c: New test. * gcc.target/riscv/sat_u_add-run-1.c: New test. * gcc.target/riscv/sat_u_add-run-10.c: New test. * gcc.target/riscv/sat_u_add-run-11.c: New test. * gcc.target/riscv/sat_u_add-run-12.c: New test. * gcc.target/riscv/sat_u_add-run-2.c: New test. * gcc.target/riscv/sat_u_add-run-3.c: New test. * gcc.target/riscv/sat_u_add-run-4.c: New test. * gcc.target/riscv/sat_u_add-run-5.c: New test. * gcc.target/riscv/sat_u_add-run-6.c: New test. * gcc.target/riscv/sat_u_add-run-7.c: New test. * gcc.target/riscv/sat_u_add-run-8.c: New test. * gcc.target/riscv/sat_u_add-run-9.c: New test. * gcc.target/riscv/scalar_sat_binary.h: New test. Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 17 ++++ gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv-v.cc | 16 ++++ gcc/config/riscv/riscv.cc | 47 +++++++++++ gcc/config/riscv/riscv.md | 11 +++ gcc/config/riscv/vector.md | 12 +-- gcc/internal-fn.cc | 1 + gcc/internal-fn.def | 3 + gcc/match.pd | 64 +++++++++++++++ gcc/optabs.def | 4 +- .../riscv/rvv/autovec/binop/vec_sat_binary.h | 33 ++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-1.c | 41 ++++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-2.c | 44 +++++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-3.c | 44 +++++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-4.c | 44 +++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-1.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-10.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-11.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-12.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-2.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-3.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-4.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-5.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-6.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-7.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-8.c | 75 ++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-9.c | 75 ++++++++++++++++++ gcc/testsuite/gcc.target/riscv/sat_arith.h | 79 +++++++++++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-1.c | 44 +++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-2.c | 50 ++++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-3.c | 41 ++++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-4.c | 38 +++++++++ .../gcc.target/riscv/sat_u_add-run-1.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-10.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-11.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-12.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-2.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-3.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-4.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-5.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-6.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-7.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-8.c | 25 ++++++ .../gcc.target/riscv/sat_u_add-run-9.c | 25 ++++++ .../gcc.target/riscv/scalar_sat_binary.h | 27 +++++++ gcc/tree-vect-patterns.cc | 61 ++++++++++++++ 46 files changed, 1915 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_arith.h create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c create mode 100644 gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index 3b32369f68c..06a4c34863e 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2612,3 +2612,20 @@ (define_expand "rawmemchr" DONE; } ) + +;; ------------------------------------------------------------------------- +;; ---- [INT] Saturation ALU. +;; ------------------------------------------------------------------------- +;; Includes: +;; - add +;; ------------------------------------------------------------------------- +(define_expand "usadd3" + [(match_operand:V_VLSI 0 "register_operand") + (match_operand:V_VLSI 1 "register_operand") + (match_operand:V_VLSI 2 "register_operand")] + "TARGET_VECTOR" + { + riscv_vector::expand_vec_usadd (operands[0], operands[1], operands[2], mode); + DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index b8735593805..fefd9a1c2c4 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -132,6 +132,7 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); extern bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); +extern void riscv_expand_usadd (rtx, rtx, rtx); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); @@ -619,6 +620,7 @@ void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lround (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode); void expand_vec_lfloor (rtx, rtx, machine_mode, machine_mode); +void expand_vec_usadd (rtx, rtx, rtx, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx), enum avl_type); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 814c5febabe..eadbc63431b 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4635,6 +4635,16 @@ emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask, } } +static void +emit_vec_saddu (rtx op_dest, rtx op_1, rtx op_2, insn_type type, + machine_mode vec_mode) +{ + rtx ops[] = {op_dest, op_1, op_2}; + insn_code icode = code_for_pred (US_PLUS, vec_mode); + + emit_vlmax_insn (icode, type, ops); +} + void expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode, machine_mode vec_int_mode) @@ -4862,6 +4872,12 @@ expand_vec_lfloor (rtx op_0, rtx op_1, machine_mode vec_fp_mode, vec_int_mode); } +void +expand_vec_usadd (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) +{ + emit_vec_saddu (op_0, op_1, op_2, BINARY_OP, vec_mode); +} + /* Vectorize popcount by the Wilkes-Wheeler-Gill algorithm that libgcc uses as well. */ void diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index fe9976bfffe..519a7684cc4 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -10840,6 +10840,53 @@ riscv_vector_mode_supported_any_target_p (machine_mode) return true; } +/* Implements the unsigned saturation add standard name usadd for int mode. */ + +void +riscv_expand_usadd (rtx dest, rtx x, rtx y) +{ + machine_mode mode = GET_MODE (dest); + rtx xmode_sum = gen_reg_rtx (Xmode); + rtx xmode_lt = gen_reg_rtx (Xmode); + rtx xmode_x = gen_lowpart (Xmode, x); + rtx xmode_y = gen_lowpart (Xmode, y); + rtx xmode_dest = gen_reg_rtx (Xmode); + + /* Step-1: sum = x + y */ + if (mode == SImode && mode != Xmode) + { /* Take addw to avoid the sum truncate. */ + rtx simode_sum = gen_reg_rtx (SImode); + riscv_emit_binary (PLUS, simode_sum, x, y); + emit_move_insn (xmode_sum, gen_lowpart (Xmode, simode_sum)); + } + else + riscv_emit_binary (PLUS, xmode_sum, xmode_x, xmode_y); + + /* Step-1.1: truncate sum for HI and QI as we have no insn for add QI/HI. */ + if (mode == HImode || mode == QImode) + { + int shift_bits = GET_MODE_BITSIZE (Xmode) + - GET_MODE_BITSIZE (mode).to_constant (); + + gcc_assert (shift_bits > 0); + + riscv_emit_binary (ASHIFT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + riscv_emit_binary (LSHIFTRT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + } + + /* Step-2: lt = sum < x */ + riscv_emit_binary (LTU, xmode_lt, xmode_sum, xmode_x); + + /* Step-3: lt = -lt */ + riscv_emit_unary (NEG, xmode_lt, xmode_lt); + + /* Step-4: xmode_dest = sum | lt */ + riscv_emit_binary (IOR, xmode_dest, xmode_lt, xmode_sum); + + /* Step-5: dest = xmode_dest */ + emit_move_insn (dest, gen_lowpart (mode, xmode_dest)); +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 0346cc3859d..28d26579c3a 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -3839,6 +3839,17 @@ (define_insn "*large_load_address" [(set_attr "type" "load") (set (attr "length") (const_int 8))]) +(define_expand "usadd3" + [(match_operand:ANYI 0 "register_operand") + (match_operand:ANYI 1 "register_operand") + (match_operand:ANYI 2 "register_operand")] + "" + { + riscv_expand_usadd (operands[0], operands[1], operands[2]); + DONE; + } +) + (include "bitmanip.md") (include "crypto.md") (include "sync.md") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 8b1c24c5d79..58abc2a2f9e 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -4073,8 +4073,8 @@ (define_insn "@pred_trunc" ;; Saturating Add and Subtract (define_insn "@pred_" - [(set (match_operand:VI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") - (if_then_else:VI + [(set (match_operand:V_VLSI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") + (if_then_else:V_VLSI (unspec: [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1, vm, vm,Wc1,Wc1") (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") @@ -4083,10 +4083,10 @@ (define_insn "@pred_" (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (any_sat_int_binop:VI - (match_operand:VI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") - (match_operand:VI 4 "" "")) - (match_operand:VI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (any_sat_int_binop:V_VLSI + (match_operand:V_VLSI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") + (match_operand:V_VLSI 4 "" "")) + (match_operand:V_VLSI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "@ v.vv\t%0,%3,%4%p1 diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 5269f0ac528..e517ea7fbfb 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -4181,6 +4181,7 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_SAT_ADD: case IFN_VEC_WIDEN_PLUS: case IFN_VEC_WIDEN_PLUS_LO: case IFN_VEC_WIDEN_PLUS_HI: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 848bb9dbff3..47326b7033c 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -275,6 +275,9 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | ECF_NOTHROW, first, DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first, smulhrs, umulhrs, binary) +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST | ECF_NOTHROW, first, + ssadd, usadd, binary) + DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary) DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary) DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary) diff --git a/gcc/match.pd b/gcc/match.pd index 15a1e7350d4..6f8cdf074ed 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3043,6 +3043,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || POINTER_TYPE_P (itype)) && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype)))))) +/* Unsigned Saturation Add */ +(match (usadd_left_part_1 @0 @1) + (plus:c @0 @1) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_right_part_1 @0 @1) + (negate (convert (lt (plus:c @0 @1) @0))) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +(match (usadd_right_part_2 @0 @1) + (negate (convert (gt @0 (plus:c @0 @1)))) + (if (INTEGRAL_TYPE_P (type) + && TYPE_UNSIGNED (TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@0)) + && types_match (type, TREE_TYPE (@1))))) + +/* Unsigned saturation add. Case 1 (branchless): + SAT_U_ADD = (X + Y) | - ((X + Y) < X) or + SAT_U_ADD = (X + Y) | - (X > (X + Y)). */ +(simplify + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_1 @0 @1)) + (if (optimize) (IFN_SAT_ADD @0 @1))) +(simplify + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_2 @0 @1)) + (if (optimize) (IFN_SAT_ADD @0 @1))) + +/* Unsigned saturation add. Case 2 (branch): + SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1 or + SAT_U_ADD = x <= (X + Y) ? (X + Y) : -1. */ +(simplify + (cond (ge (usadd_left_part_1@2 @0 @1) @0) @2 integer_minus_onep) + (if (optimize) (IFN_SAT_ADD @0 @1))) +(simplify + (cond (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep) + (if (optimize) (IFN_SAT_ADD @0 @1))) + +/* Vect recog pattern will leverage unsigned_integer_sat_add. */ +(match (unsigned_integer_sat_add @0 @1) + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_1 @0 @1)) + (if (optimize))) +(match (unsigned_integer_sat_add @0 @1) + (bit_ior:c + (usadd_left_part_1 @0 @1) + (usadd_right_part_2 @0 @1)) + (if (optimize))) +(match (unsigned_integer_sat_add @0 @1) + (cond (ge (usadd_left_part_1@2 @0 @1) @0) @2 integer_minus_onep) + (if (optimize))) +(match (unsigned_integer_sat_add @0 @1) + (cond (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep) + (if (optimize))) + /* x > y && x != XXX_MIN --> x > y x > y && x == XXX_MIN --> false . */ (for eqne (eq ne) diff --git a/gcc/optabs.def b/gcc/optabs.def index ad14f9328b9..3f2cb46aff8 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -111,8 +111,8 @@ OPTAB_NX(add_optab, "add$F$a3") OPTAB_NX(add_optab, "add$Q$a3") OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc) OPTAB_VX(addv_optab, "add$F$a3") -OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc) -OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc) +OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc) +OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc) OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc) OPTAB_NX(sub_optab, "sub$F$a3") OPTAB_NX(sub_optab, "sub$Q$a3") diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h new file mode 100644 index 00000000000..0976ae97830 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h @@ -0,0 +1,33 @@ +#ifndef HAVE_DEFINED_VEC_SAT_BINARY +#define HAVE_DEFINED_VEC_SAT_BINARY + +/* To leverage this header files for run test, you need to: + 1. define T as the type, for example uint8_t, + 2. defint N as the test array size, for example 16. + 3. define RUN_VEC_SAT_BINARY as run function. + 4. prepare the test_data for test cases. + */ + +int +main () +{ + unsigned i, k; + T out[N]; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + T *op_1 = test_data[i][0]; + T *op_2 = test_data[i][1]; + T *expect = test_data[i][2]; + + RUN_VEC_SAT_BINARY (T, out, op_1, op_2, N); + + for (k = 0; k < N; k++) + if (out[k] != expect[k]) + __builtin_abort (); + } + + return 0; +} + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c new file mode 100644 index 00000000000..4fb8b233ee9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint8_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint8_t) + +/* +** vec_sat_u_add_uint8_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint8_t) + +/* +** vec_sat_u_add_uint8_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint8_t, 0xffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c new file mode 100644 index 00000000000..10c112b77b0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint16_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint16_t) + +/* +** vec_sat_u_add_uint16_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint16_t) + +/* +** vec_sat_u_add_uint16_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint16_t, 0xffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c new file mode 100644 index 00000000000..281036ea0ee --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint32_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint32_t) + +/* +** vec_sat_u_add_uint32_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint32_t) + +/* +** vec_sat_u_add_uint32_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint32_t, 0xffffffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c new file mode 100644 index 00000000000..f392533f114 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint64_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint64_t) + +/* +** vec_sat_u_add_uint64_t_fmt_2: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_2(uint64_t) + +/* +** vec_sat_u_add_uint64_t_fmt_3: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_3(uint64_t, 0xffffffffffffffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 12 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c new file mode 100644 index 00000000000..1dcb333f687 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c new file mode 100644 index 00000000000..5a0e73303cf --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c new file mode 100644 index 00000000000..b3efc9243e7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffffffffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c new file mode 100644 index 00000000000..f478c244ff4 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffffffffffffffffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c new file mode 100644 index 00000000000..dbf01ac863d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c new file mode 100644 index 00000000000..20ad2736403 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c new file mode 100644 index 00000000000..2f31edc527e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c new file mode 100644 index 00000000000..4201b31eb3e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c new file mode 100644 index 00000000000..35ec9ea3455 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c new file mode 100644 index 00000000000..8b1abdb4ba8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c new file mode 100644 index 00000000000..8c72b567590 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_2 + +DEF_VEC_SAT_U_ADD_FMT_2(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c new file mode 100644 index 00000000000..f454f3997ca --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_3 + +DEF_VEC_SAT_U_ADD_FMT_3(T, 0xffu) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h new file mode 100644 index 00000000000..f233c74acfd --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h @@ -0,0 +1,79 @@ +#ifndef HAVE_SAT_ARITH +#define HAVE_SAT_ARITH + +#include + +#define DEF_SAT_U_ADD_FMT_1(T) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_1 (T x, T y) \ +{ \ + return (x + y) | (-(T)((T)(x + y) < x)); \ +} + +#define DEF_SAT_U_ADD_FMT_2(T) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_2 (T x, T y) \ +{ \ + return (T)(x + y) >= x ? (x + y) : -1; \ +} + +#define DEF_SAT_U_ADD_FMT_3(T, MAX) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_3 (T x, T y) \ +{ \ + return (T)(x + y) >= x ? (x + y) : MAX; \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_1(T) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (x + y) | (-(T)((T)(x + y) < x)); \ + } \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_2(T) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (T)(x + y) >= x ? (x + y) : -1; \ + } \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_3(T, MAX) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_3 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (T)(x + y) >= x ? (x + y) : MAX; \ + } \ +} + +#define RUN_SAT_U_ADD_FMT_1(T, x, y) sat_u_add_##T##_fmt_1(x, y) +#define RUN_SAT_U_ADD_FMT_2(T, x, y) sat_u_add_##T##_fmt_2(x, y) +#define RUN_SAT_U_ADD_FMT_3(T, x, y) sat_u_add_##T##_fmt_3(x, y) + +#define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N) + +#define RUN_VEC_SAT_U_ADD_FMT_2(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_2(out, op_1, op_2, N) + +#define RUN_VEC_SAT_U_ADD_FMT_3(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_3(out, op_1, op_2, N) + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c new file mode 100644 index 00000000000..b348d93f938 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint8_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint8_t) + +/* +** sat_u_add_uint8_t_fmt_2: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint8_t) + +/* +** sat_u_add_uint8_t_fmt_3: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint8_t, 0xffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c new file mode 100644 index 00000000000..df54b984110 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint16_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint16_t) + +/* +** sat_u_add_uint16_t_fmt_2: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint16_t) + +/* +** sat_u_add_uint16_t_fmt_3: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint16_t, 0xffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c new file mode 100644 index 00000000000..6ff2e6ac52b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint32_t_fmt_1: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint32_t) + +/* +** sat_u_add_uint32_t_fmt_2: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint32_t) + +/* +** sat_u_add_uint32_t_fmt_3: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint32_t, 0xffffffffu) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c new file mode 100644 index 00000000000..1585f9a231f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint64_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint64_t) + +/* +** sat_u_add_uint64_t_fmt_2: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_2(uint64_t) + +/* +** sat_u_add_uint64_t_fmt_3: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_3(uint64_t, 0xffffffffffffffff) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 6 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c new file mode 100644 index 00000000000..f1972490006 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c new file mode 100644 index 00000000000..0b675666dc0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c new file mode 100644 index 00000000000..ac9809e47c0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffffffffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c new file mode 100644 index 00000000000..110e9b14d7e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffffffffffffffffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c new file mode 100644 index 00000000000..cb3879d0cde --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c new file mode 100644 index 00000000000..c9a6080ca3b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c new file mode 100644 index 00000000000..c19b7e22387 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c new file mode 100644 index 00000000000..508531c09d7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c new file mode 100644 index 00000000000..99b5c3a39f0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c new file mode 100644 index 00000000000..13f59548935 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c new file mode 100644 index 00000000000..cdbea7b1b2c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_2 + +DEF_SAT_U_ADD_FMT_2(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c new file mode 100644 index 00000000000..04a857aed58 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_3 + +DEF_SAT_U_ADD_FMT_3(T, 0xffu) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h new file mode 100644 index 00000000000..cbb2d750107 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h @@ -0,0 +1,27 @@ +#ifndef HAVE_DEFINED_SCALAR_SAT_BINARY +#define HAVE_DEFINED_SCALAR_SAT_BINARY + +/* To leverage this header files for run test, you need to: + 1. define T as the type, for example uint8_t, + 2. define RUN_SAT_BINARY as run function. + 3. prepare the test_data for test cases. + */ + +int +main () +{ + unsigned i; + T *d; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + d = test_data[i]; + + if (RUN_SAT_BINARY (T, d[0], d[1]) != d[2]) + __builtin_abort (); + } + + return 0; +} + +#endif diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 4f491c6b833..51b53dbb16f 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-vector-builder.h" #include "vec-perm-indices.h" #include "gimple-range.h" +#include "gimple-match-auto.h" /* TODO: Note the vectorizer still builds COND_EXPRs with GENERIC compares @@ -4498,6 +4499,65 @@ vect_recog_mult_pattern (vec_info *vinfo, return pattern_stmt; } +static gimple * +vect_sat_add_build_call (vec_info *vinfo, gimple *last_stmt, tree *type_out, + tree op_0, tree op_1) +{ + tree itype = TREE_TYPE (op_0); + tree vtype = get_vectype_for_scalar_type (vinfo, itype); + + if (vtype == NULL_TREE) + return NULL; + + if (!direct_internal_fn_supported_p (IFN_SAT_ADD, vtype, OPTIMIZE_FOR_SPEED)) + return NULL; + + *type_out = vtype; + + gcall *call = gimple_build_call_internal (IFN_SAT_ADD, 2, op_0, op_1); + gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL)); + gimple_call_set_nothrow (call, /* nothrow_p */ true); + gimple_set_location (call, gimple_location (last_stmt)); + + vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt); + + return call; +} + +/* + * Try to detect saturation add pattern (SAT_ADD), aka below gimple: + * _7 = _4 + _6; + * _8 = _4 > _7; + * _9 = (long unsigned int) _8; + * _10 = -_9; + * _12 = _7 | _10; + * + * And then simplied to + * _12 = .SAT_ADD (_4, _6); + */ +static gimple * +vect_recog_sat_add_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo, + tree *type_out) +{ + gimple *last_stmt = stmt_vinfo->stmt; + + if (!is_gimple_assign (last_stmt)) + return NULL; + + tree res_ops[2]; + tree lhs = gimple_assign_lhs (last_stmt); + + if (gimple_unsigned_integer_sat_add (lhs, res_ops, NULL)) + { + gimple *call = vect_sat_add_build_call (vinfo, last_stmt, type_out, + res_ops[0], res_ops[1]); + if (call) + return call; + } + + return NULL; +} + /* Detect a signed division by a constant that wouldn't be otherwise vectorized: @@ -6998,6 +7058,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_vector_vector_shift_pattern, "vector_vector_shift" }, { vect_recog_divmod_pattern, "divmod" }, { vect_recog_mult_pattern, "mult" }, + { vect_recog_sat_add_pattern, "sat_add" }, { vect_recog_mixed_size_cond_pattern, "mixed_size_cond" }, { vect_recog_gcond_pattern, "gcond" }, { vect_recog_bool_pattern, "bool" }, From patchwork Mon May 6 14:49:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 1932050 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=nc9VMW3h; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VY47n0tvbz1xnT for ; Tue, 7 May 2024 00:49:57 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4B874385B529 for ; Mon, 6 May 2024 14:49:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) by sourceware.org (Postfix) with ESMTPS id 6871B3858D1E for ; Mon, 6 May 2024 14:49:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6871B3858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6871B3858D1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715006977; cv=none; b=KIAHyexZRrlw6IsyWFf9KVQ5T34bXno/AdqqXlMRaO/lAliH0n8OctbYB3rd7onreWz53GcEepDcUIwPS2gLlzn5PLib0qeof+I7J+B00EGBXowLrIz69h3iLue+wwEwh+4lBECbSLQgf+MGqAsvFPW6ksH7CovaUUaRhnGjhBY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715006977; c=relaxed/simple; bh=GPeL/yIl4hXgwZQKyFiysmhtPqnIIScvUHFtJqCIqWo=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=fiTzSg/o9dEDONJtqsAuvVSEO/ZF+9Lxtd2ZXImgxeFJNonMDNukZD+dHWosi2ueifprUswYQTUMzFZK7mIM2NcwuFEa39Y6Jhe2l6N3JOz1OV31Ik6e5M4AUJoJAUQl7xg4yFjRUImoD2khTLdPzk2GG6plQsdnr2K+tt48KRQ= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1715006975; x=1746542975; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GPeL/yIl4hXgwZQKyFiysmhtPqnIIScvUHFtJqCIqWo=; b=nc9VMW3hSINxN3E4XoOYBx/OHmTtftiDCWSA8rrOBzZXKms02W1gOsoq yEnGeptZNhdC00GmTxYRgp1Wruinanv5tVRMWECqpg1xqduAvyUkC7Juy UZi6bx0ySBxSH7z2Df8R2NLVsFheCyRLuNPbAYOUga3zw+RGnsp+OWaZc EHdXSEaxxYuMaK4zHkGCdN3I+UXnwVGG4wt+9t7QM7PLlSC6Dxok5gpGD KS/wxMiJdb07IsLNONL7BeIptiBT8dj7tUzSNfqw0EkjmWmsJY7IvUPQ9 BT4w6E0i19yNrJcBnwvReHuoyMLhMOpb8xGdeuELYXNXfhpBe66IT9t/N w==; X-CSE-ConnectionGUID: UItefp5hQA+lF/eLXWJAvg== X-CSE-MsgGUID: opJMzzLLRvS5EaNAkTKPeQ== X-IronPort-AV: E=McAfee;i="6600,9927,11065"; a="36140674" X-IronPort-AV: E=Sophos;i="6.07,258,1708416000"; d="scan'208";a="36140674" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 May 2024 07:49:33 -0700 X-CSE-ConnectionGUID: en9e2zHbTSut36ZMbpY5zQ== X-CSE-MsgGUID: mKZsqyEuT/ydfrmqOi7d7A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,258,1708416000"; d="scan'208";a="32991552" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa004.jf.intel.com with ESMTP; 06 May 2024 07:49:30 -0700 Received: from pli-ubuntu.sh.intel.com (pli-ubuntu.sh.intel.com [10.239.159.47]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 36D461007344; Mon, 6 May 2024 22:49:29 +0800 (CST) From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, tamar.christina@arm.com, richard.guenther@gmail.com, hongtao.liu@intel.com, Pan Li Subject: [PATCH v4 2/3] VECT: Support new IFN SAT_ADD for unsigned vector int Date: Mon, 6 May 2024 22:49:27 +0800 Message-Id: <20240506144927.726990-1-pan2.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240406120755.2692291-1-pan2.li@intel.com> References: <20240406120755.2692291-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org From: Pan Li This patch depends on below scalar enabling patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html For vectorize, we leverage the existing vect pattern recog to find the pattern similar to scalar and let the vectorizer to perform the rest part for standard name usadd3 in vector mode. The riscv vector backend have insn "Vector Single-Width Saturating Add and Subtract" which can be leveraged when expand the usadd3 in vector mode. For example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i; for (i = 0; i < n; i++) out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i])); } Before this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]); ivtmp_58 = _80 * 8; vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0); vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0); vect__7.11_66 = vect__4.7_61 + vect__6.10_65; mask__8.12_67 = vect__4.7_61 > vect__7.11_66; vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66); .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72); vectp_x.5_60 = vectp_x.5_59 + ivtmp_58; vectp_y.8_64 = vectp_y.8_63 + ivtmp_58; vectp_out.16_75 = vectp_out.16_74 + ivtmp_58; ivtmp_79 = ivtmp_78 - _80; ... } After this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]); ivtmp_46 = _62 * 8; vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0); vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0); vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53); .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54); ... } The below test suites are passed for this patch. * The riscv fully regression tests. * The aarch64 fully regression tests. * The x86 bootstrap tests. * The x86 fully regression tests. PR target/51492 PR target/112600 gcc/ChangeLog: * tree-vect-patterns.cc (gimple_unsigned_integer_sat_add): New func decl generated by match.pd match. (vect_recog_sat_add_pattern): New func impl to recog the pattern for unsigned SAT_ADD. Signed-off-by: Pan Li --- gcc/tree-vect-patterns.cc | 51 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 87c2acff386..8ffcaf71d5c 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -4487,6 +4487,56 @@ vect_recog_mult_pattern (vec_info *vinfo, return pattern_stmt; } +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree)); + +/* + * Try to detect saturation add pattern (SAT_ADD), aka below gimple: + * _7 = _4 + _6; + * _8 = _4 > _7; + * _9 = (long unsigned int) _8; + * _10 = -_9; + * _12 = _7 | _10; + * + * And then simplied to + * _12 = .SAT_ADD (_4, _6); + */ + +static gimple * +vect_recog_sat_add_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo, + tree *type_out) +{ + gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo); + + if (!is_gimple_assign (last_stmt)) + return NULL; + + tree res_ops[2]; + tree lhs = gimple_assign_lhs (last_stmt); + + if (gimple_unsigned_integer_sat_add (lhs, res_ops, NULL)) + { + tree itype = TREE_TYPE (res_ops[0]); + tree vtype = get_vectype_for_scalar_type (vinfo, itype); + + if (vtype != NULL_TREE && direct_internal_fn_supported_p ( + IFN_SAT_ADD, vtype, OPTIMIZE_FOR_SPEED)) + { + *type_out = vtype; + gcall *call = gimple_build_call_internal (IFN_SAT_ADD, 2, res_ops[0], + res_ops[1]); + + gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL)); + gimple_call_set_nothrow (call, /* nothrow_p */ false); + gimple_set_location (call, gimple_location (last_stmt)); + + vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt); + return call; + } + } + + return NULL; +} + /* Detect a signed division by a constant that wouldn't be otherwise vectorized: @@ -6987,6 +7037,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_vector_vector_shift_pattern, "vector_vector_shift" }, { vect_recog_divmod_pattern, "divmod" }, { vect_recog_mult_pattern, "mult" }, + { vect_recog_sat_add_pattern, "sat_add" }, { vect_recog_mixed_size_cond_pattern, "mixed_size_cond" }, { vect_recog_gcond_pattern, "gcond" }, { vect_recog_bool_pattern, "bool" }, From patchwork Mon May 6 14:50:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 1932051 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=PACixDOS; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VY48s3sTvz1ydW for ; Tue, 7 May 2024 00:50:53 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1ABD1385EC56 for ; Mon, 6 May 2024 14:50:51 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by sourceware.org (Postfix) with ESMTPS id 95AFC385829B for ; Mon, 6 May 2024 14:50:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 95AFC385829B Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 95AFC385829B Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.9 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715007029; cv=none; b=JH2KByec+MXEs0rGZ5KiZg2hX4A9L+149fYLVdvlypjxqHr3Rl4RhKv4ahgvfqwkfiXzTQoVpo7ad4Dv8EBI2lWv17tOuF/9Q8lezBu/PPFOGunK5dx/FuShgSORzPt3LhQ/Z1FDxZIKgylUqpbX6bxMTe14t1m61m7ZkTEfUC4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715007029; c=relaxed/simple; bh=3azZsoZbkB8VjYplEzQN3sPsh0qoDOwAUJLsslASkBA=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=O/RZhHBAOsZ0JZz8GWcEJVkCF6lTI1VDyRETyVbDruK01WXgDg1Pi9fSC8Qv6mBqOHs6B5W8rozUBWbk+psCTpIuRtB8hC4XKFNriOibqEu7bz/oDgcm1YHb59e4E9Z/2sygxTfGaR9eqzeJQHFbmLFx7x3iK91KykH3XMJ6PeQ= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1715007025; x=1746543025; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3azZsoZbkB8VjYplEzQN3sPsh0qoDOwAUJLsslASkBA=; b=PACixDOSITvHCVfJjst3gKPYlbeCjQDGfei2MltQRv4WW67Tr3vUFb0R y9zUZzPSWEEC3do3giqR41eHx665oRPwEDRdZpr+rYdsDeCp1V1zypYE0 F8k/tMcD3IjT+ss9LNSCIl1Vcg6eYI3qRPGjOo8SB9+atKj0C1x/ROt59 BvbPN79FtA8hJ3Eif71/6Eks9qeEovQtVlg6uqN544f2WQ9Wf9a1Eba/H PQLXUmBCCYxLh2INt7lWp7TukNrdsaIPGdCdrhAPq/86bUpW5oJCXLADK ukUFnHMlGDNiPuo+hSu09pEt53HoUSSoHJgewkl1n+2P+7VkFfS2GvNYe A==; X-CSE-ConnectionGUID: IuhTXFD4SBmhLCe6G1zg7A== X-CSE-MsgGUID: YDPYD/xpTF+UZzQB5yteVg== X-IronPort-AV: E=McAfee;i="6600,9927,11065"; a="33261192" X-IronPort-AV: E=Sophos;i="6.07,258,1708416000"; d="scan'208";a="33261192" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 May 2024 07:50:15 -0700 X-CSE-ConnectionGUID: M1KJk4zJRcWPJmHW5tDrmg== X-CSE-MsgGUID: vfNNrNNLTVKj3W/a8DQt6A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,258,1708416000"; d="scan'208";a="32675393" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmviesa003.fm.intel.com with ESMTP; 06 May 2024 07:50:10 -0700 Received: from pli-ubuntu.sh.intel.com (pli-ubuntu.sh.intel.com [10.239.159.47]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 72B161007344; Mon, 6 May 2024 22:50:09 +0800 (CST) From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, tamar.christina@arm.com, richard.guenther@gmail.com, hongtao.liu@intel.com, Pan Li Subject: [PATCH v4 3/3] RISC-V: Implement IFN SAT_ADD for both the scalar and vector Date: Mon, 6 May 2024 22:50:07 +0800 Message-Id: <20240506145007.727974-1-pan2.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240406120755.2692291-1-pan2.li@intel.com> References: <20240406120755.2692291-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-8.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_LOTSOFHASH, KAM_SHORT, SCC_10_SHORT_WORD_LINES, SCC_20_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_NONE, TXREP, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org From: Pan Li This patch depends on below middle-end enabling patches for scalar and vector. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650823.html The patch also implement the SAT_ADD in the riscv backend as the sample for both the scalar and vector. Given below vector as example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i; for (i = 0; i < n; i++) out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i])); } Before this patch: vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v0,0(a1) vle64.v v1,0(a2) slli a4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vadd.vv v1,v0,v1 vmsgtu.vv v0,v0,v1 vmerge.vim v1,v1,-1,v0 vse64.v v1,0(a0) ... After this patch: vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v1,0(a1) vle64.v v2,0(a2) slli a4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vsaddu.vv v1,v1,v2 <= Vector Single-Width Saturating Add vse64.v v1,0(a0) ... The below test suites are passed for this patch. * The riscv fully regression tests. * The aarch64 fully regression tests. * The x86 bootstrap tests. * The x86 fully regression tests. PR target/51492 PR target/112600 gcc/ChangeLog: * config/riscv/autovec.md (usadd3): New pattern expand for the unsigned SAT_ADD in vector mode. * config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl to expand usadd3 pattern. (expand_vec_usadd): Ditto but for vector. * config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit the vsadd insn. (expand_vec_usadd): New func impl to expand usadd3 for vector. * config/riscv/riscv.cc (riscv_expand_usadd): New func impl to expand usadd3 for scalar. * config/riscv/riscv.md (usadd3): New pattern expand for the unsigned SAT_ADD in scalar mode. * config/riscv/vector.md: Allow VLS mode for vsaddu. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c: New test. * gcc.target/riscv/sat_arith.h: New test. * gcc.target/riscv/sat_u_add-1.c: New test. * gcc.target/riscv/sat_u_add-2.c: New test. * gcc.target/riscv/sat_u_add-3.c: New test. * gcc.target/riscv/sat_u_add-4.c: New test. * gcc.target/riscv/sat_u_add-run-1.c: New test. * gcc.target/riscv/sat_u_add-run-2.c: New test. * gcc.target/riscv/sat_u_add-run-3.c: New test. * gcc.target/riscv/sat_u_add-run-4.c: New test. * gcc.target/riscv/scalar_sat_binary.h: New test. Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 17 +++++ gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv-v.cc | 16 ++++ gcc/config/riscv/riscv.cc | 47 ++++++++++++ gcc/config/riscv/riscv.md | 11 +++ gcc/config/riscv/vector.md | 12 +-- .../riscv/rvv/autovec/binop/vec_sat_binary.h | 33 ++++++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-1.c | 19 +++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-2.c | 20 +++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-3.c | 20 +++++ .../riscv/rvv/autovec/binop/vec_sat_u_add-4.c | 20 +++++ .../rvv/autovec/binop/vec_sat_u_add-run-1.c | 75 +++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-2.c | 75 +++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-3.c | 75 +++++++++++++++++++ .../rvv/autovec/binop/vec_sat_u_add-run-4.c | 75 +++++++++++++++++++ gcc/testsuite/gcc.target/riscv/sat_arith.h | 31 ++++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-1.c | 19 +++++ gcc/testsuite/gcc.target/riscv/sat_u_add-2.c | 21 ++++++ gcc/testsuite/gcc.target/riscv/sat_u_add-3.c | 18 +++++ gcc/testsuite/gcc.target/riscv/sat_u_add-4.c | 17 +++++ .../gcc.target/riscv/sat_u_add-run-1.c | 25 +++++++ .../gcc.target/riscv/sat_u_add-run-2.c | 25 +++++++ .../gcc.target/riscv/sat_u_add-run-3.c | 25 +++++++ .../gcc.target/riscv/sat_u_add-run-4.c | 25 +++++++ .../gcc.target/riscv/scalar_sat_binary.h | 27 +++++++ 25 files changed, 744 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_arith.h create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index aa1ae0fe075..7ceeb8d64f6 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2612,3 +2612,20 @@ (define_expand "rawmemchr" DONE; } ) + +;; ------------------------------------------------------------------------- +;; ---- [INT] Saturation ALU. +;; ------------------------------------------------------------------------- +;; Includes: +;; - add +;; ------------------------------------------------------------------------- +(define_expand "usadd3" + [(match_operand:V_VLSI 0 "register_operand") + (match_operand:V_VLSI 1 "register_operand") + (match_operand:V_VLSI 2 "register_operand")] + "TARGET_VECTOR" + { + riscv_vector::expand_vec_usadd (operands[0], operands[1], operands[2], mode); + DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index e5aebf3fc3d..0d95ecb6508 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -133,6 +133,7 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); extern bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); +extern void riscv_expand_usadd (rtx, rtx, rtx); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); @@ -621,6 +622,7 @@ void expand_vec_lrint (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lround (rtx, rtx, machine_mode, machine_mode, machine_mode); void expand_vec_lceil (rtx, rtx, machine_mode, machine_mode); void expand_vec_lfloor (rtx, rtx, machine_mode, machine_mode); +void expand_vec_usadd (rtx, rtx, rtx, machine_mode); #endif bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode, bool, void (*)(rtx *, rtx), enum avl_type); diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index c9e0feebca6..c34111f89b8 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -4635,6 +4635,16 @@ emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask, } } +static void +emit_vec_saddu (rtx op_dest, rtx op_1, rtx op_2, insn_type type, + machine_mode vec_mode) +{ + rtx ops[] = {op_dest, op_1, op_2}; + insn_code icode = code_for_pred (US_PLUS, vec_mode); + + emit_vlmax_insn (icode, type, ops); +} + void expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode, machine_mode vec_int_mode) @@ -4862,6 +4872,12 @@ expand_vec_lfloor (rtx op_0, rtx op_1, machine_mode vec_fp_mode, vec_int_mode); } +void +expand_vec_usadd (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) +{ + emit_vec_saddu (op_0, op_1, op_2, BINARY_OP, vec_mode); +} + /* Vectorize popcount by the Wilkes-Wheeler-Gill algorithm that libgcc uses as well. */ void diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 44945d47fd6..4445d77d5a9 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -11127,6 +11127,53 @@ riscv_get_raw_result_mode (int regno) return default_get_reg_raw_mode (regno); } +/* Implements the unsigned saturation add standard name usadd for int mode. */ + +void +riscv_expand_usadd (rtx dest, rtx x, rtx y) +{ + machine_mode mode = GET_MODE (dest); + rtx xmode_sum = gen_reg_rtx (Xmode); + rtx xmode_lt = gen_reg_rtx (Xmode); + rtx xmode_x = gen_lowpart (Xmode, x); + rtx xmode_y = gen_lowpart (Xmode, y); + rtx xmode_dest = gen_reg_rtx (Xmode); + + /* Step-1: sum = x + y */ + if (mode == SImode && mode != Xmode) + { /* Take addw to avoid the sum truncate. */ + rtx simode_sum = gen_reg_rtx (SImode); + riscv_emit_binary (PLUS, simode_sum, x, y); + emit_move_insn (xmode_sum, gen_lowpart (Xmode, simode_sum)); + } + else + riscv_emit_binary (PLUS, xmode_sum, xmode_x, xmode_y); + + /* Step-1.1: truncate sum for HI and QI as we have no insn for add QI/HI. */ + if (mode == HImode || mode == QImode) + { + int shift_bits = GET_MODE_BITSIZE (Xmode) + - GET_MODE_BITSIZE (mode).to_constant (); + + gcc_assert (shift_bits > 0); + + riscv_emit_binary (ASHIFT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + riscv_emit_binary (LSHIFTRT, xmode_sum, xmode_sum, GEN_INT (shift_bits)); + } + + /* Step-2: lt = sum < x */ + riscv_emit_binary (LTU, xmode_lt, xmode_sum, xmode_x); + + /* Step-3: lt = -lt */ + riscv_emit_unary (NEG, xmode_lt, xmode_lt); + + /* Step-4: xmode_dest = sum | lt */ + riscv_emit_binary (IOR, xmode_dest, xmode_lt, xmode_sum); + + /* Step-5: dest = xmode_dest */ + emit_move_insn (dest, gen_lowpart (mode, xmode_dest)); +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index d4676507b45..b8639e9bc15 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -4006,6 +4006,17 @@ (define_insn "*large_load_address" [(set_attr "type" "load") (set (attr "length") (const_int 8))]) +(define_expand "usadd3" + [(match_operand:ANYI 0 "register_operand") + (match_operand:ANYI 1 "register_operand") + (match_operand:ANYI 2 "register_operand")] + "" + { + riscv_expand_usadd (operands[0], operands[1], operands[2]); + DONE; + } +) + (include "bitmanip.md") (include "crypto.md") (include "sync.md") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 228d0f9a766..f8ed61f4a13 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -4062,8 +4062,8 @@ (define_insn "@pred_trunc" ;; Saturating Add and Subtract (define_insn "@pred_" - [(set (match_operand:VI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") - (if_then_else:VI + [(set (match_operand:V_VLSI 0 "register_operand" "=vd, vd, vr, vr, vd, vd, vr, vr") + (if_then_else:V_VLSI (unspec: [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1, vm, vm,Wc1,Wc1") (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") @@ -4072,10 +4072,10 @@ (define_insn "@pred_" (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (any_sat_int_binop:VI - (match_operand:VI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") - (match_operand:VI 4 "" "")) - (match_operand:VI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (any_sat_int_binop:V_VLSI + (match_operand:V_VLSI 3 "" " vr, vr, vr, vr, vr, vr, vr, vr") + (match_operand:V_VLSI 4 "" "")) + (match_operand:V_VLSI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "@ v.vv\t%0,%3,%4%p1 diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h new file mode 100644 index 00000000000..0976ae97830 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h @@ -0,0 +1,33 @@ +#ifndef HAVE_DEFINED_VEC_SAT_BINARY +#define HAVE_DEFINED_VEC_SAT_BINARY + +/* To leverage this header files for run test, you need to: + 1. define T as the type, for example uint8_t, + 2. defint N as the test array size, for example 16. + 3. define RUN_VEC_SAT_BINARY as run function. + 4. prepare the test_data for test cases. + */ + +int +main () +{ + unsigned i, k; + T out[N]; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + T *op_1 = test_data[i][0]; + T *op_2 = test_data[i][1]; + T *expect = test_data[i][2]; + + RUN_VEC_SAT_BINARY (T, out, op_1, op_2, N); + + for (k = 0; k < N; k++) + if (out[k] != expect[k]) + __builtin_abort (); + } + + return 0; +} + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c new file mode 100644 index 00000000000..dbbfa00afe2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint8_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e8,\s*m1,\s*ta,\s*ma +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle8\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint8_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c new file mode 100644 index 00000000000..1253fdb5f60 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint16_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma +** ... +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint16_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c new file mode 100644 index 00000000000..74bba9cadd1 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint32_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint32_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c new file mode 100644 index 00000000000..f3692b4cc25 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "../../../sat_arith.h" + +/* +** vec_sat_u_add_uint64_t_fmt_1: +** ... +** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\) +** vsaddu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** ... +*/ +DEF_VEC_SAT_U_ADD_FMT_1(uint64_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 4 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c new file mode 100644 index 00000000000..1dcb333f687 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint8_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + { + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 254, 255, 9, + }, + { + 0, 1, 1, 254, + 254, 254, 254, 255, + 255, 255, 255, 255, + 255, 255, 255, 9, + }, + { + 0, 1, 2, 254, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c new file mode 100644 index 00000000000..dbf01ac863d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint16_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + { + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 65534, 65535, 9, + }, + { + 0, 1, 1, 65534, + 65534, 65534, 65534, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 9, + }, + { + 0, 1, 2, 65534, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c new file mode 100644 index 00000000000..20ad2736403 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint32_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + { + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 4294967294, 4294967295, 9, + }, + { + 0, 1, 1, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 9, + }, + { + 0, 1, 2, 4294967294, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c new file mode 100644 index 00000000000..2f31edc527e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c @@ -0,0 +1,75 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "../../../sat_arith.h" + +#define T uint64_t +#define N 16 +#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_ADD_FMT_1 + +DEF_VEC_SAT_U_ADD_FMT_1(T) + +T test_data[][3][N] = { + { + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_0 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* arg_1 */ + { + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, /* expect */ + }, + { + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + { + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + }, + }, + { + { + 0, 0, 1, 0, + 1, 2, 3, 0, + 1, 2, 3, 4, + 5, 18446744073709551614u, 18446744073709551615u, 9, + }, + { + 0, 1, 1, 18446744073709551614u, + 18446744073709551614u, 18446744073709551614u, 18446744073709551614u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 9, + }, + { + 0, 1, 2, 18446744073709551614u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, + 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, 18, + }, + }, +}; + +#include "vec_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h new file mode 100644 index 00000000000..2ef9fd825f3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h @@ -0,0 +1,31 @@ +#ifndef HAVE_SAT_ARITH +#define HAVE_SAT_ARITH + +#include + +#define DEF_SAT_U_ADD_FMT_1(T) \ +T __attribute__((noinline)) \ +sat_u_add_##T##_fmt_1 (T x, T y) \ +{ \ + return (x + y) | (-(T)((T)(x + y) < x)); \ +} + +#define DEF_VEC_SAT_U_ADD_FMT_1(T) \ +void __attribute__((noinline)) \ +vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ +{ \ + unsigned i; \ + for (i = 0; i < limit; i++) \ + { \ + T x = op_1[i]; \ + T y = op_2[i]; \ + out[i] = (x + y) | (-(T)((T)(x + y) < x)); \ + } \ +} + +#define RUN_SAT_U_ADD_FMT_1(T, x, y) sat_u_add_##T##_fmt_1(x, y) + +#define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \ + vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N) + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c new file mode 100644 index 00000000000..609e1ea343b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-1.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint8_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** andi\s+a0,\s*a0,\s*0xff +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint8_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c new file mode 100644 index 00000000000..d30436c782a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-2.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint16_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** slli\s+a0,\s*a0,\s*48 +** srli\s+a0,\s*a0,\s*48 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint16_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c new file mode 100644 index 00000000000..12347c607bd --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-3.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint32_t_fmt_1: +** addw\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** sext.w\s+a0,\s*a0 +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint32_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c new file mode 100644 index 00000000000..f2c6b74d917 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-4.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "sat_arith.h" + +/* +** sat_u_add_uint64_t_fmt_1: +** add\s+[atx][0-9]+,\s*a0,\s*a1 +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+ +** neg\s+[atx][0-9]+,\s*[atx][0-9]+ +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+ +** ret +*/ +DEF_SAT_U_ADD_FMT_1(uint64_t) + +/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c new file mode 100644 index 00000000000..f1972490006 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-1.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint8_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 254, 254, }, + { 1, 254, 255, }, + { 2, 254, 255, }, + { 0, 255, 255, }, + { 1, 255, 255, }, + { 2, 255, 255, }, + { 255, 255, 255, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c new file mode 100644 index 00000000000..cb3879d0cde --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-2.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint16_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 65534, 65534, }, + { 1, 65534, 65535, }, + { 2, 65534, 65535, }, + { 0, 65535, 65535, }, + { 1, 65535, 65535, }, + { 2, 65535, 65535, }, + { 65535, 65535, 65535, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c new file mode 100644 index 00000000000..c9a6080ca3b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-3.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint32_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 4294967294, 4294967294, }, + { 1, 4294967294, 4294967295, }, + { 2, 4294967294, 4294967295, }, + { 0, 4294967295, 4294967295, }, + { 1, 4294967295, 4294967295, }, + { 2, 4294967295, 4294967295, }, + { 4294967295, 4294967295, 4294967295, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c new file mode 100644 index 00000000000..c19b7e22387 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-run-4.c @@ -0,0 +1,25 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99" } */ + +#include "sat_arith.h" + +#define T uint64_t +#define RUN_SAT_BINARY RUN_SAT_U_ADD_FMT_1 + +DEF_SAT_U_ADD_FMT_1(T) + +T test_data[][3] = { + /* arg_0, arg_1, expect */ + { 0, 0, 0, }, + { 0, 1, 1, }, + { 1, 1, 2, }, + { 0, 18446744073709551614u, 18446744073709551614u, }, + { 1, 18446744073709551614u, 18446744073709551615u, }, + { 2, 18446744073709551614u, 18446744073709551615u, }, + { 0, 18446744073709551615u, 18446744073709551615u, }, + { 1, 18446744073709551615u, 18446744073709551615u, }, + { 2, 18446744073709551615u, 18446744073709551615u, }, + { 18446744073709551615u, 18446744073709551615u, 18446744073709551615u, }, +}; + +#include "scalar_sat_binary.h" diff --git a/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h new file mode 100644 index 00000000000..cbb2d750107 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/scalar_sat_binary.h @@ -0,0 +1,27 @@ +#ifndef HAVE_DEFINED_SCALAR_SAT_BINARY +#define HAVE_DEFINED_SCALAR_SAT_BINARY + +/* To leverage this header files for run test, you need to: + 1. define T as the type, for example uint8_t, + 2. define RUN_SAT_BINARY as run function. + 3. prepare the test_data for test cases. + */ + +int +main () +{ + unsigned i; + T *d; + + for (i = 0; i < sizeof (test_data) / sizeof (test_data[0]); i++) + { + d = test_data[i]; + + if (RUN_SAT_BINARY (T, d[0], d[1]) != d[2]) + __builtin_abort (); + } + + return 0; +} + +#endif