From patchwork Wed Jul 3 14:17:13 2024
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1956287
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com,
 rdapp.gcc@gmail.com, Pan Li
Subject: [PATCH v1] RISC-V: Bugfix vfmv insn honor zvfhmin for FP16 SEW [PR115763]
Date: Wed, 3 Jul 2024 22:17:13 +0800
Message-Id: <20240703141713.1425590-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1

From: Pan Li

According to the ISA, the zvfhmin sub-extension contains only conversion
insns.  Thus, a vfmv insn that acts on FP16 should not be emitted when
only the zvfhmin option is given.

This patch fixes that by splitting the pred_broadcast define_insn into a
zvfh part and a zvfhmin part.  Given the example below:

void test (_Float16 *dest, _Float16 bias) {
  dest[0] = bias;
  dest[1] = bias;
}

when compiled with -march=rv64gcv_zfh_zvfhmin

Before this patch:
test:
  vsetivli zero,2,e16,mf4,ta,ma
  vfmv.v.f v1,fa0 // should not leverage vfmv for zvfhmin
  vse16.v  v1,0(a0)
  ret

After this patch:
test:
  addi     sp,sp,-16
  fsh      fa0,14(sp)
  addi     a5,sp,14
  vsetivli zero,2,e16,mf4,ta,ma
  vlse16.v v1,0(a5),zero
  vse16.v  v1,0(a0)
  addi     sp,sp,16
  jr       ra

	PR target/115763

gcc/ChangeLog:

	* config/riscv/vector.md (*pred_broadcast): Split into zvfh and
	zvfhmin part.
	(*pred_broadcast_zvfh): New define_insn for zvfh part.
	(*pred_broadcast_zvfhmin): Ditto but for zvfhmin.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar_move-5.c: Adjust asm check.
	* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.
	* gcc.target/riscv/rvv/base/scalar_move-7.c: Ditto.
	* gcc.target/riscv/rvv/base/scalar_move-8.c: Ditto.
	* gcc.target/riscv/rvv/base/pr115763-1.c: New test.
	* gcc.target/riscv/rvv/base/pr115763-2.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/vector.md                    | 49 +++++++++++++------
 .../gcc.target/riscv/rvv/base/pr115763-1.c    |  9 ++++
 .../gcc.target/riscv/rvv/base/pr115763-2.c    | 10 ++++
 .../gcc.target/riscv/rvv/base/scalar_move-5.c |  4 +-
 .../gcc.target/riscv/rvv/base/scalar_move-6.c |  6 +--
 .../gcc.target/riscv/rvv/base/scalar_move-7.c |  6 +--
 .../gcc.target/riscv/rvv/base/scalar_move-8.c |  6 +--
 7 files changed, 64 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-2.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index fe18ee5b5f7..d9474262d54 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -2080,31 +2080,50 @@ (define_insn_and_split "*pred_broadcast"
   [(set_attr "type" "vimov,vimov,vlds,vlds,vlds,vlds,vimovxv,vimovxv")
    (set_attr "mode" "")])
 
-(define_insn "*pred_broadcast"
-  [(set (match_operand:V_VLSF_ZVFHMIN 0 "register_operand" "=vr, vr, vr, vr, vr, vr, vr, vr")
-    (if_then_else:V_VLSF_ZVFHMIN
+(define_insn "*pred_broadcast_zvfh"
+  [(set (match_operand:V_VLSF 0 "register_operand" "=vr, vr, vr, vr")
+    (if_then_else:V_VLSF
       (unspec:
-        [(match_operand: 1 "vector_broadcast_mask_operand" "Wc1,Wc1, vm, vm,Wc1,Wc1,Wb1,Wb1")
-         (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK")
-         (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i")
-         (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i")
-         (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i")
+        [(match_operand: 1 "vector_broadcast_mask_operand" "Wc1, Wc1, Wb1, Wb1")
+         (match_operand 4 "vector_length_operand" " rK, rK, rK, rK")
+         (match_operand 5 "const_int_operand" " i, i, i, i")
+         (match_operand 6 "const_int_operand" " i, i, i, i")
+         (match_operand 7 "const_int_operand" " i, i, i, i")
          (reg:SI VL_REGNUM)
          (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-      (vec_duplicate:V_VLSF_ZVFHMIN
-        (match_operand: 3 "direct_broadcast_operand" " f, f,Wdm,Wdm,Wdm,Wdm, f, f"))
-      (match_operand:V_VLSF_ZVFHMIN 2 "vector_merge_operand" "vu, 0, vu, 0, vu, 0, vu, 0")))]
+      (vec_duplicate:V_VLSF
+        (match_operand: 3 "direct_broadcast_operand" " f, f, f, f"))
+      (match_operand:V_VLSF 2 "vector_merge_operand" " vu, 0, vu, 0")))]
   "TARGET_VECTOR"
   "@
    vfmv.v.f\t%0,%3
    vfmv.v.f\t%0,%3
+   vfmv.s.f\t%0,%3
+   vfmv.s.f\t%0,%3"
+  [(set_attr "type" "vfmov,vfmov,vfmovfv,vfmovfv")
+   (set_attr "mode" "")])
+
+(define_insn "*pred_broadcast_zvfhmin"
+  [(set (match_operand:V_VLSF_ZVFHMIN 0 "register_operand" "=vr, vr, vr, vr")
+    (if_then_else:V_VLSF_ZVFHMIN
+      (unspec:
+        [(match_operand: 1 "vector_broadcast_mask_operand" " vm, vm, Wc1, Wc1")
+         (match_operand 4 "vector_length_operand" " rK, rK, rK, rK")
+         (match_operand 5 "const_int_operand" " i, i, i, i")
+         (match_operand 6 "const_int_operand" " i, i, i, i")
+         (match_operand 7 "const_int_operand" " i, i, i, i")
+         (reg:SI VL_REGNUM)
+         (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+      (vec_duplicate:V_VLSF_ZVFHMIN
+        (match_operand: 3 "direct_broadcast_operand" "Wdm, Wdm, Wdm, Wdm"))
+      (match_operand:V_VLSF_ZVFHMIN 2 "vector_merge_operand" " vu, 0, vu, 0")))]
+  "TARGET_VECTOR"
+  "@
    vlse.v\t%0,%3,zero,%1.t
    vlse.v\t%0,%3,zero,%1.t
    vlse.v\t%0,%3,zero
-   vlse.v\t%0,%3,zero
-   vfmv.s.f\t%0,%3
-   vfmv.s.f\t%0,%3"
-  [(set_attr "type" "vfmov,vfmov,vlds,vlds,vlds,vlds,vfmovfv,vfmovfv")
+   vlse.v\t%0,%3,zero"
+  [(set_attr "type" "vlds,vlds,vlds,vlds")
    (set_attr "mode" "")])
 
 (define_insn "*pred_broadcast_extended_scalar"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-1.c
new file mode 100644
index 00000000000..3b0b0046041
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zfh_zvfh -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model" } */
+
+void test (_Float16 *dest, _Float16 bias) {
+  dest[0] = bias;
+  dest[1] = bias;
+}
+
+/* { dg-final { scan-assembler-times {vfmv\.v\.f\s+v[0-9]+,\s*fa[0-9]+} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-2.c
new file mode 100644
index 00000000000..f4d53e72022
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115763-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zfh_zvfhmin -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model" } */
+
+void test (_Float16 *dest, _Float16 bias) {
+  dest[0] = bias;
+  dest[1] = bias;
+}
+
+/* { dg-final { scan-assembler-times {fsh\s+fa[0-9]+,[0-9]+\(sp\)} 1 } } */
+/* { dg-final { scan-assembler-not {vfmv\.v\.f\s+v[0-9]+,\s*fa[0-9]+} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-5.c
index 2e897a4896f..04dec7bc8dc 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-5.c
@@ -21,9 +21,9 @@ void foo (void *base, void *out, size_t vl)
 
 /*
 ** foo2:
-** addi\t[a-x0-9]+,\s*[a-x0-9]+,100
+** fld\tfa[0-9]+,\s*100\(a0\)
 ** vsetvli\tzero,a2,e64,m2,t[au],m[au]
-** vlse64.v\tv[0-9]+,0\([a-x0-9]+\),zero
+** vfmv\.v\.f\tv[0-9]+,\s*fa[0-9]+
 ** vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
 ** ret
 */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-6.c
index 326cfd8e2ff..0ebb92eda42 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-6.c
@@ -21,9 +21,9 @@ void foo (void *base, void *out, size_t vl)
 
 /*
 ** foo2:
-** addi\t[a-x0-9]+,\s*[a-x0-9]+,100
+** fld\tfa[0-9]+,\s*100\(a0\)
 ** vsetvli\tzero,a2,e64,m2,t[au],m[au]
-** vlse64.v\tv[0-9]+,0\([a-x0-9]+\),zero
+** vfmv\.v\.f\tv[0-9]+,\s*fa[0-9]+
 ** vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
 ** ret
 */
@@ -52,7 +52,7 @@ void foo3 (void *base, void *out, size_t vl)
 /*
 ** foo4:
 ** ...
-** vlse64.v\tv[0-9]+,0\([a-x0-9]+\),zero
+** vfmv\.v\.f\tv[0-9]+,\s*fa[0-9]+
 ** ...
 ** ret
 */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-7.c
index b218f2d0ba4..512fa62858a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-7.c
@@ -21,9 +21,9 @@ void foo (void *base, void *out, size_t vl)
 
 /*
 ** foo2:
-** addi\t[a-x0-9]+,\s*[a-x0-9]+,100
+** fld\tfa[0-9]+,\s*100\(a0\)
 ** vsetvli\tzero,a2,e64,m2,t[au],m[au]
-** vlse64.v\tv[0-9]+,0\([a-x0-9]+\),zero
+** vfmv\.v\.f\tv[0-9]+,\s*fa[0-9]+
 ** vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
 ** ret
 */
@@ -52,7 +52,7 @@ void foo3 (void *base, void *out, size_t vl)
 /*
 ** foo4:
 ** ...
-** vlse64.v\tv[0-9]+,0\([a-x0-9]+\),zero
+** vfmv\.v\.f\tv[0-9]+,\s*fa[0-9]+
 ** ...
 ** ret
 */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-8.c
index 4438e793dbc..d9d10f3702a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-8.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-8.c
@@ -21,9 +21,9 @@ void foo (void *base, void *out, size_t vl)
 
 /*
 ** foo2:
-** addi\t[a-x0-9]+,\s*[a-x0-9]+,100
+** fld\tfa[0-9]+,\s*100\(a0\)
 ** vsetvli\tzero,a2,e64,m2,t[au],m[au]
-** vlse64.v\tv[0-9]+,0\([a-x0-9]+\),zero
+** vfmv\.v\.f\tv[0-9]+,\s*fa[0-9]+
 ** vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
 ** ret
 */
@@ -52,7 +52,7 @@ void foo3 (void *base, void *out, size_t vl)
 /*
 ** foo4:
 ** ...
-** vlse64.v\tv[0-9]+,0\([a-x0-9]+\),zero
+** vfmv\.v\.f\tv[0-9]+,\s*fa[0-9]+
 ** ...
 ** ret
 */
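
Illustration only, not part of the patch: the zvfhmin sequence in the commit message spills the FP16 scalar to a stack slot with fsh and then reloads it with vlse16.v using a zero stride, so every element reads the same halfword. A minimal host-side C model of that behaviour might look as follows. The names model_vlse16 and model_broadcast_zvfhmin, the 8-element stand-in for a vector register, and the use of raw uint16_t bit patterns in place of _Float16 are all assumptions made for this sketch; none of it is GCC code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a strided 16-bit vector load (vlse16.v):
   element i is read from base + i * stride_bytes.  With a stride of 0
   every element reads the same halfword.  */
void
model_vlse16 (uint16_t *vreg, const unsigned char *base,
              ptrdiff_t stride_bytes, size_t vl)
{
  for (size_t i = 0; i < vl; i++)
    memcpy (&vreg[i], base + (ptrdiff_t) i * stride_bytes, sizeof vreg[i]);
}

/* Hypothetical model of the zvfhmin broadcast sequence:
   spill the raw FP16 bits to a stack slot, reload them with a zero
   stride, then store vl elements to dest.  */
void
model_broadcast_zvfhmin (uint16_t *dest, uint16_t bias_bits, size_t vl)
{
  uint16_t slot = bias_bits;   /* models: fsh fa0,14(sp) */
  uint16_t vreg[8];            /* stand-in for v1; the size is an assumption */

  assert (vl <= 8);
  /* models: vlse16.v v1,0(a5),zero */
  model_vlse16 (vreg, (const unsigned char *) &slot, 0, vl);
  /* models: vse16.v v1,0(a0) */
  memcpy (dest, vreg, vl * sizeof vreg[0]);
}
```

Calling model_broadcast_zvfhmin (out, 0x3c00, 2) writes 0x3c00 (the IEEE binary16 encoding of 1.0) to out[0] and out[1] and leaves later elements untouched, mirroring the two-element store in the example from the commit message.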