From patchwork Tue Aug 27 00:37:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Patrick O'Neill
X-Patchwork-Id: 1977067
From: Patrick O'Neill
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com,
 jeffreyalaw@gmail.com, gnu-toolchain@rivosinc.com, Patrick O'Neill
Subject: [PATCH v2 7/9] RISC-V: Move helper functions above
 expand_const_vector
Date: Mon, 26 Aug 2024 17:37:01 -0700
Message-ID: <20240827003710.1513605-8-patrick@rivosinc.com>
X-Mailer: git-send-email 2.43.2
In-Reply-To: <20240827003710.1513605-1-patrick@rivosinc.com>
References: <20240827003710.1513605-1-patrick@rivosinc.com>
List-Id: Gcc-patches mailing list

These subroutines will be used in expand_const_vector in a future patch.
Relocate so expand_const_vector can use them.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_vector_init_insert_elems): Relocate.
	(expand_vector_init_trailing_same_elem): Ditto.

Signed-off-by: Patrick O'Neill
---
Ack'd here:
https://inbox.sourceware.org/gcc-patches/0a08cbce-1568-4197-8df3-33966e440870@gmail.com/
---
 gcc/config/riscv/riscv-v.cc | 132 ++++++++++++++++++------------------
 1 file changed, 66 insertions(+), 66 deletions(-)

-- 
2.34.1

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index cb2380ad664..9b6c3a21e2d 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1104,6 +1104,72 @@ expand_vec_series (rtx dest, rtx base, rtx step, rtx vid)
   emit_move_insn (dest, result);
 }
 
+/* Subroutine of riscv_vector_expand_vector_init.
+   Works as follows:
+   (a) Initialize TARGET by broadcasting element NELTS_REQD - 1 of BUILDER.
+   (b) Skip leading elements from BUILDER, which are the same as
+       element NELTS_REQD - 1.
+   (c) Insert earlier elements in reverse order in TARGET using vslide1down. */
+
+static void
+expand_vector_init_insert_elems (rtx target, const rvv_builder &builder,
+				 int nelts_reqd)
+{
+  machine_mode mode = GET_MODE (target);
+  rtx dup = expand_vector_broadcast (mode, builder.elt (0));
+  emit_move_insn (target, dup);
+  int ndups = builder.count_dups (0, nelts_reqd - 1, 1);
+  for (int i = ndups; i < nelts_reqd; i++)
+    {
+      unsigned int unspec
+	= FLOAT_MODE_P (mode) ? UNSPEC_VFSLIDE1DOWN : UNSPEC_VSLIDE1DOWN;
+      insn_code icode = code_for_pred_slide (unspec, mode);
+      rtx ops[] = {target, target, builder.elt (i)};
+      emit_vlmax_insn (icode, BINARY_OP, ops);
+    }
+}
+
+/* Subroutine of expand_vec_init to handle case
+   when all trailing elements of builder are same.
+   This works as follows:
+   (a) Use expand_insn interface to broadcast last vector element in TARGET.
+   (b) Insert remaining elements in TARGET using insr.
+
+   ??? The heuristic used is to do above if number of same trailing elements
+   is greater than leading_ndups, loosely based on
+   heuristic from mostly_zeros_p. May need fine-tuning. */
+
+static bool
+expand_vector_init_trailing_same_elem (rtx target,
+				       const rtx_vector_builder &builder,
+				       int nelts_reqd)
+{
+  int leading_ndups = builder.count_dups (0, nelts_reqd - 1, 1);
+  int trailing_ndups = builder.count_dups (nelts_reqd - 1, -1, -1);
+  machine_mode mode = GET_MODE (target);
+
+  if (trailing_ndups > leading_ndups)
+    {
+      rtx dup = expand_vector_broadcast (mode, builder.elt (nelts_reqd - 1));
+      for (int i = nelts_reqd - trailing_ndups - 1; i >= 0; i--)
+	{
+	  unsigned int unspec
+	    = FLOAT_MODE_P (mode) ? UNSPEC_VFSLIDE1UP : UNSPEC_VSLIDE1UP;
+	  insn_code icode = code_for_pred_slide (unspec, mode);
+	  rtx tmp = gen_reg_rtx (mode);
+	  rtx ops[] = {tmp, dup, builder.elt (i)};
+	  emit_vlmax_insn (icode, BINARY_OP, ops);
+	  /* slide1up need source and dest to be different REG. */
+	  dup = tmp;
+	}
+
+      emit_move_insn (target, dup);
+      return true;
+    }
+
+  return false;
+}
+
 static void
 expand_const_vector (rtx target, rtx src)
 {
@@ -2338,31 +2404,6 @@ preferred_simd_mode (scalar_mode mode)
   return word_mode;
 }
 
-/* Subroutine of riscv_vector_expand_vector_init.
-   Works as follows:
-   (a) Initialize TARGET by broadcasting element NELTS_REQD - 1 of BUILDER.
-   (b) Skip leading elements from BUILDER, which are the same as
-       element NELTS_REQD - 1.
-   (c) Insert earlier elements in reverse order in TARGET using vslide1down. */
-
-static void
-expand_vector_init_insert_elems (rtx target, const rvv_builder &builder,
-				 int nelts_reqd)
-{
-  machine_mode mode = GET_MODE (target);
-  rtx dup = expand_vector_broadcast (mode, builder.elt (0));
-  emit_move_insn (target, dup);
-  int ndups = builder.count_dups (0, nelts_reqd - 1, 1);
-  for (int i = ndups; i < nelts_reqd; i++)
-    {
-      unsigned int unspec
-	= FLOAT_MODE_P (mode) ? UNSPEC_VFSLIDE1DOWN : UNSPEC_VSLIDE1DOWN;
-      insn_code icode = code_for_pred_slide (unspec, mode);
-      rtx ops[] = {target, target, builder.elt (i)};
-      emit_vlmax_insn (icode, BINARY_OP, ops);
-    }
-}
-
 /* Use merge approach to initialize the vector with repeating sequence.
    v = {a, b, a, b, a, b, a, b}.
 
@@ -2487,47 +2528,6 @@ expand_vector_init_merge_combine_sequence (rtx target,
   emit_vlmax_insn (icode, MERGE_OP, merge_ops);
 }
 
-/* Subroutine of expand_vec_init to handle case
-   when all trailing elements of builder are same.
-   This works as follows:
-   (a) Use expand_insn interface to broadcast last vector element in TARGET.
-   (b) Insert remaining elements in TARGET using insr.
-
-   ??? The heuristic used is to do above if number of same trailing elements
-   is greater than leading_ndups, loosely based on
-   heuristic from mostly_zeros_p. May need fine-tuning. */
-
-static bool
-expand_vector_init_trailing_same_elem (rtx target,
-				       const rtx_vector_builder &builder,
-				       int nelts_reqd)
-{
-  int leading_ndups = builder.count_dups (0, nelts_reqd - 1, 1);
-  int trailing_ndups = builder.count_dups (nelts_reqd - 1, -1, -1);
-  machine_mode mode = GET_MODE (target);
-
-  if (trailing_ndups > leading_ndups)
-    {
-      rtx dup = expand_vector_broadcast (mode, builder.elt (nelts_reqd - 1));
-      for (int i = nelts_reqd - trailing_ndups - 1; i >= 0; i--)
-	{
-	  unsigned int unspec
-	    = FLOAT_MODE_P (mode) ? UNSPEC_VFSLIDE1UP : UNSPEC_VSLIDE1UP;
-	  insn_code icode = code_for_pred_slide (unspec, mode);
-	  rtx tmp = gen_reg_rtx (mode);
-	  rtx ops[] = {tmp, dup, builder.elt (i)};
-	  emit_vlmax_insn (icode, BINARY_OP, ops);
-	  /* slide1up need source and dest to be different REG. */
-	  dup = tmp;
-	}
-
-      emit_move_insn (target, dup);
-      return true;
-    }
-
-  return false;
-}
-
 /* Initialize register TARGET from the elements in PARALLEL rtx VALS. */
 
 void