From patchwork Wed Nov 15 03:48:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 1864003
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com, kito.cheng@gmail.com
Subject: [PATCH v1] RISC-V: Refine the mask generation for vec_init case 2
Date: Wed, 15 Nov 2023 11:48:01 +0800
Message-Id: <20231115034801.979185-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1
List-Id: Gcc-patches mailing list

From: Pan Li

We use the vec_init element int mode when generating the mask for
case 2, but the mask does not need as many bits as the element. The
wider mode can introduce unnecessary insns. For example, given the
code below:

typedef int64_t v16di __attribute__ ((vector_size (16 * 8)));

void __attribute__ ((noinline, noclone))
foo (int64_t *out, int64_t x, int64_t y)
{
  v16di v = {y, x, y, x, y, x, y, x, y, x, y, x, y, x, y, x};
  *(v16di *) out = v;
}

We use VDImode when generating the 0b0101010101010101 mask, but
VHImode is good enough here. This patch refines the mask generation
to avoid:

1. Unnecessary scalar insns to generate a big constant mask.
2. Unnecessary vector insn to move the mask into v0.
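
As a rough illustration of why 16 bits are enough for the example above,
the mask construction can be sketched in plain C. This is only a sketch
modelled on the loop in rvv_builder::get_merge_scalar_mask; the helper
name merge_scalar_mask, the mode_bits parameter (a stand-in for
gen_int_mode truncation) and the choice of filling all 64 host bits are
assumptions made for illustration, not GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: build the repeating merge mask
   that selects the element at INDEX_IN_PATTERN in a sequence repeating
   every NPATTERNS elements, then keep only the low MODE_BITS bits.  */
static uint64_t
merge_scalar_mask (unsigned index_in_pattern, unsigned npatterns,
                   unsigned mode_bits)
{
  assert (npatterns > 0 && npatterns <= 64);
  assert (index_in_pattern < npatterns);
  assert (mode_bits >= 1 && mode_bits <= 64);

  uint64_t mask = 0;
  uint64_t base_mask = UINT64_C (1) << index_in_pattern;
  /* Assumption: replicate the pattern across all 64 host bits.  */
  unsigned limit = 64 / npatterns;

  for (unsigned i = 0; i < limit; i++)
    mask |= base_mask << (i * npatterns);

  /* Stand-in for truncating the constant to the chosen integer mode.  */
  if (mode_bits < 64)
    mask &= (UINT64_C (1) << mode_bits) - 1;

  return mask;
}
```

For the {y, x, y, x, ...} example, merge_scalar_mask (1, 2, 64) gives
0xAAAAAAAAAAAAAAAA while merge_scalar_mask (1, 2, 16) gives 0xAAAA, which
is the 0101... pattern when written with element 0 first. Read as a signed
16-bit value, 0xAAAA is -21846, that is -20480 - 1366.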

Before this patch:

foo:
  li a5,-1431654400
  li a4,-1431654400      <== unnecessary insn
  addi a5,a5,-1365       <== unnecessary insn
  addi a4,a4,-1366
  slli a5,a5,32          <== unnecessary insn
  add a5,a5,a4           <== unnecessary insn
  vsetivli zero,16,e64,m8,ta,ma
  vmv.v.x v8,a2
  vmv.s.x v16,a5
  vmv1r.v v0,v16         <== unnecessary insn
  vmerge.vxm v8,v8,a1,v0
  vse64.v v8,0(a0)
  ret

After this patch:

foo:
  li a5,-20480
  addiw a5,a5,-1366
  vsetivli zero,16,e64,m8,ta,ma
  vmv.s.x v0,a5
  vmv.v.x v8,a2
  vmerge.vxm v8,v8,a1,v0
  vs8r.v v8,0(a0)
  ret

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask): Add
	inner_mode mask arg for mask int mode.
	(get_repeating_sequence_dup_machine_mode): Add mask_bit_mode arg to
	get the good enough vector int mode on precision.
	(expand_vector_init_merge_repeating_sequence): Pass required args to
	above func.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-10.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-11.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-6.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-7.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-8.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-9.c: New test.
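
The mode selection described in the ChangeLog entry for
get_repeating_sequence_dup_machine_mode can be sketched in plain C as
below. This is a simplified model under stated assumptions, not the GCC
implementation: struct dup_mode and pick_dup_mode are hypothetical names,
min_vlen stands in for TARGET_MIN_VLEN, and the result is reported as an
element width plus element count instead of a machine_mode.

```c
#include <assert.h>

/* Hypothetical result type for this sketch; GCC returns a machine_mode.  */
struct dup_mode
{
  unsigned scalar_bits; /* 8, 16, 32 or 64, i.e. QI, HI, SI or DI.  */
  unsigned nunits;      /* Number of scalar elements in the vector mode.  */
};

/* Hypothetical model of the selection: use the smaller of the mask
   precision and the element width as the scalar size, then derive the
   element count for the vector that carries the mask.  */
static struct dup_mode
pick_dup_mode (unsigned mask_precision, unsigned inner_bits,
               unsigned min_vlen)
{
  struct dup_mode m;
  unsigned divisor;

  m.scalar_bits = mask_precision > inner_bits ? inner_bits : mask_precision;

  switch (m.scalar_bits)
    {
    case 8:  divisor = 8; break; /* Fractional LMUL 1/8.  */
    case 16: divisor = 4; break; /* Fractional LMUL 1/4.  */
    case 32: divisor = 2; break; /* Fractional LMUL 1/2.  */
    case 64: divisor = 1; break; /* LMUL 1.  */
    default:
      assert (0 && "unsupported mask scalar size");
      divisor = 1;
      break;
    }

  unsigned minimal_bits = min_vlen / divisor;

  assert (mask_precision % m.scalar_bits == 0);

  /* A mask wider than one scalar needs several scalars; otherwise fall
     back to the smallest vector of that scalar size.  */
  m.nunits = mask_precision > m.scalar_bits
             ? mask_precision / m.scalar_bits
             : minimal_bits / m.scalar_bits;
  return m;
}
```

Assuming the mask mode of the 16-element example has a precision of 16
bits and the minimum VLEN is 128, pick_dup_mode (16, 64, 128) selects a
16-bit scalar with 2 elements, so the mask constant only has to be
materialized as a 16-bit value.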

Signed-off-by: Pan Li
Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-v.cc                   | 54 ++++++++++++++-----
 .../vls-vlmax/init-repeat-sequence-10.c       | 28 ++++++++++
 .../vls-vlmax/init-repeat-sequence-11.c       | 26 +++++++++
 .../vls-vlmax/init-repeat-sequence-6.c        | 27 ++++++++++
 .../vls-vlmax/init-repeat-sequence-7.c        | 25 +++++++++
 .../vls-vlmax/init-repeat-sequence-8.c        | 27 ++++++++++
 .../vls-vlmax/init-repeat-sequence-9.c        | 25 +++++++++
 7 files changed, 200 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-9.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 265a298f447..ffb645eccf3 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -416,7 +416,7 @@ public:
   bool repeating_sequence_use_merge_profitable_p ();
   bool combine_sequence_use_slideup_profitable_p ();
   bool combine_sequence_use_merge_profitable_p ();
-  rtx get_merge_scalar_mask (unsigned int) const;
+  rtx get_merge_scalar_mask (unsigned int, machine_mode) const;
 
   bool single_step_npatterns_p () const;
   bool npatterns_all_equal_p () const;
@@ -592,7 +592,8 @@ rvv_builder::get_merged_repeating_sequence ()
      To merge "b", the mask should be 0101....
 */
 rtx
-rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
+rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern,
+                                    machine_mode inner_mode) const
 {
   unsigned HOST_WIDE_INT mask = 0;
   unsigned HOST_WIDE_INT base_mask = (1ULL << index_in_pattern);
@@ -611,7 +612,7 @@ rvv_builder::get_merge_scalar_mask (unsigned int index_in_pattern) const
   for (int i = 0; i < limit; i++)
     mask |= base_mask << (i * npatterns ());
 
-  return gen_int_mode (mask, inner_int_mode ());
+  return gen_int_mode (mask, inner_mode);
 }
 
 /* Return true if the variable-length vector is single step.
@@ -919,17 +920,45 @@ emit_vlmax_decompress_insn (rtx target, rtx op0, rtx op1, rtx mask)
 /* Emit merge instruction.  */
 
 static machine_mode
-get_repeating_sequence_dup_machine_mode (const rvv_builder &builder)
+get_repeating_sequence_dup_machine_mode (const rvv_builder &builder,
+                                         machine_mode mask_bit_mode)
 {
-  poly_uint64 dup_nunits = GET_MODE_NUNITS (builder.mode ());
+  unsigned mask_precision = GET_MODE_PRECISION (mask_bit_mode).to_constant ();
+  unsigned mask_scalar_size = mask_precision > builder.inner_bits_size ()
+    ? builder.inner_bits_size () : mask_precision;
 
-  if (known_ge (GET_MODE_SIZE (builder.mode ()), BYTES_PER_RISCV_VECTOR))
+  scalar_mode inner_mode;
+  unsigned minimal_bits_size;
+
+  switch (mask_scalar_size)
     {
-      dup_nunits = exact_div (BYTES_PER_RISCV_VECTOR,
-                              builder.inner_bytes_size ());
+    case 8:
+      inner_mode = QImode;
+      minimal_bits_size = TARGET_MIN_VLEN / 8; /* AKA RVVMF8.  */
+      break;
+    case 16:
+      inner_mode = HImode;
+      minimal_bits_size = TARGET_MIN_VLEN / 4; /* AKA RVVMF4.  */
+      break;
+    case 32:
+      inner_mode = SImode;
+      minimal_bits_size = TARGET_MIN_VLEN / 2; /* AKA RVVMF2.  */
+      break;
+    case 64:
+      inner_mode = DImode;
+      minimal_bits_size = TARGET_MIN_VLEN / 1; /* AKA RVVM1.  */
+      break;
+    default:
+      gcc_unreachable ();
+      break;
     }
 
-  return get_vector_mode (builder.inner_int_mode (), dup_nunits).require ();
+  gcc_assert (mask_precision % mask_scalar_size == 0);
+
+  uint64_t dup_nunit = mask_precision > mask_scalar_size
+    ? mask_precision / mask_scalar_size : minimal_bits_size / mask_scalar_size;
+
+  return get_vector_mode (inner_mode, dup_nunit).require ();
 }
 
 /* Expand series const vector.  */
@@ -2130,9 +2159,9 @@ expand_vector_init_merge_repeating_sequence (rtx target,
      since we don't have such instruction in RVV.
      Instead, we should use INT mode (QI/HI/SI/DI) with integer move
      instruction to generate the mask data we want.  */
-  machine_mode mask_int_mode
-    = get_repeating_sequence_dup_machine_mode (builder);
   machine_mode mask_bit_mode = get_mask_mode (builder.mode ());
+  machine_mode mask_int_mode
+    = get_repeating_sequence_dup_machine_mode (builder, mask_bit_mode);
   uint64_t full_nelts = builder.full_nelts ().to_constant ();
 
   /* Step 1: Broadcast the first pattern.  */
@@ -2143,7 +2172,8 @@ expand_vector_init_merge_repeating_sequence (rtx target,
   for (unsigned int i = 1; i < builder.npatterns (); i++)
     {
       /* Step 2-1: Generate mask register v0 for each merge.  */
-      rtx merge_mask = builder.get_merge_scalar_mask (i);
+      rtx merge_mask
+        = builder.get_merge_scalar_mask (i, GET_MODE_INNER (mask_int_mode));
       rtx mask = gen_reg_rtx (mask_bit_mode);
       rtx dup = gen_reg_rtx (mask_int_mode);
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-10.c
new file mode 100644
index 00000000000..ccce5052dc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-10.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvl1024b -mabi=lp64d -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include
+
+typedef int64_t vnx32di __attribute__ ((vector_size (32 * 8)));
+
+/*
+** f_vnx32di:
+**	vsetvli\s+[axt][0-9]+,\s*zero,\s*e64,\s*m2,\s*ta,\s*ma
+**	...
+**	vmv\.v\.x\s+v[0-9]+,\s*[axt][0-9]+
+**	...
+**	vmv\.s\.x\s+v0,\s*[axt][0-9]+
+**	vmerge\.vxm\s+v[0-9]+,\s*v[0-9]+,\s*[axt][0-9]+,\s*v0
+**	vs2r\.v\s+v[0-9]+,\s*0\([axt][0-9]+\)
+**	ret
+*/
+__attribute__ ((noipa)) void
+f_vnx32di (int64_t a, int64_t b, int64_t *out)
+{
+  vnx32di v = {
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+  };
+  *(vnx32di *) out = v;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-11.c
new file mode 100644
index 00000000000..a62eee8a5ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-11.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvl1024b -mabi=lp64d -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+typedef double vnx32df __attribute__ ((vector_size (32 * 8)));
+
+/*
+** f_vnx32df:
+**	vsetvli\s+[axt][0-9]+\s*,zero,\s*e64,\s*m2,\s*ta,\s*ma
+**	...
+**	vfmv\.v\.f\s+v[0-9]+,\s*[af]+[0-9]+
+**	...
+**	vmv\.s\.x\s+v0,\s*[axt][0-9]+
+**	vfmerge\.vfm\s+v[0-9]+,\s*v[0-9]+,\s*[af]+[0-9]+,\s*v0
+**	vs2r\.v\s+v[0-9]+,\s*0\([axt][0-9]+\)
+**	ret
+*/
+__attribute__ ((noipa)) void
+f_vnx32df (double a, double b, double *out)
+{
+  vnx32df v = {
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+  };
+  *(vnx32df *) out = v;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-6.c
new file mode 100644
index 00000000000..4f8a78b3161
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-6.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include
+
+typedef int64_t vnx16di __attribute__ ((vector_size (16 * 8)));
+
+/*
+** f_vnx16di:
+**	vsetivli\s+zero,\s*16,\s*e64,\s*m8,\s*ta,\s*ma
+**	...
+**	vmv\.v\.x\s+v[0-9]+,\s*[axt][0-9]+
+**	...
+**	vmv\.s\.x\s+v0,\s*[axt][0-9]+
+**	vmerge\.vxm\s+v[0-9]+,\s*v[0-9]+,\s*[axt][0-9]+,\s*v0
+**	vs8r\.v\s+v[0-9]+,\s*0\([axt][0-9]+\)
+**	ret
+*/
+__attribute__ ((noipa)) void
+f_vnx16di (int64_t a, int64_t b, int64_t *out)
+{
+  vnx16di v = {
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+  };
+  *(vnx16di *) out = v;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-7.c
new file mode 100644
index 00000000000..f0d14db8fa8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-7.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+typedef double vnx16df __attribute__ ((vector_size (16 * 8)));
+
+/*
+** f_vnx16df:
+**	vsetivli\s+zero,\s*16,\s*e64,\s*m8,\s*ta,\s*ma
+**	...
+**	vfmv\.v\.f\s+v[0-9]+,\s*[af]+[0-9]+
+**	...
+**	vmv\.s\.x\s+v0,\s*[axt][0-9]+
+**	vfmerge\.vfm\s+v[0-9]+,\s*v[0-9]+,\s*[af]+[0-9]+,\s*v0
+**	vs8r\.v\s+v[0-9]+,\s*0\([axt][0-9]+\)
+**	ret
+*/
+__attribute__ ((noipa)) void
+f_vnx16df (double a, double b, double *out)
+{
+  vnx16df v = {
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+  };
+  *(vnx16df *) out = v;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-8.c
new file mode 100644
index 00000000000..fd986e6b649
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-8.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvl1024b -mabi=lp64d -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include
+
+typedef int64_t vnx16di __attribute__ ((vector_size (16 * 8)));
+
+/*
+** f_vnx16di:
+**	vsetivli\s+zero,\s*16,\s*e64,\s*m1,\s*ta,\s*ma
+**	...
+**	vmv\.v\.x\s+v[0-9]+,\s*[axt][0-9]+
+**	...
+**	vmv\.s\.x\s+v0,\s*[axt][0-9]+
+**	vmerge\.vxm\s+v[0-9]+,\s*v[0-9]+,\s*[axt][0-9]+,\s*v0
+**	vs1r\.v\s+v[0-9]+,\s*0\([axt][0-9]+\)
+**	ret
+*/
+__attribute__ ((noipa)) void
+f_vnx16di (int64_t a, int64_t b, int64_t *out)
+{
+  vnx16di v = {
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+  };
+  *(vnx16di *) out = v;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-9.c
new file mode 100644
index 00000000000..753221ffdbf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-9.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvl1024b -mabi=lp64d -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+typedef double vnx16df __attribute__ ((vector_size (16 * 8)));
+
+/*
+** f_vnx16df:
+**	vsetivli\s+zero,\s*16,\s*e64,\s*m1,\s*ta,\s*ma
+**	...
+**	vfmv\.v\.f\s+v[0-9]+,\s*[af]+[0-9]+
+**	...
+**	vmv\.s\.x\s+v0,\s*[axt][0-9]+
+**	vfmerge\.vfm\s+v[0-9]+,\s*v[0-9]+,\s*[af]+[0-9]+,\s*v0
+**	vs1r\.v\s+v[0-9]+,\s*0\([axt][0-9]+\)
+**	ret
+*/
+__attribute__ ((noipa)) void
+f_vnx16df (double a, double b, double *out)
+{
+  vnx16df v = {
+    a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b,
+  };
+  *(vnx16df *) out = v;
+}