From patchwork Tue Aug 27 00:36:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Patrick O'Neill
X-Patchwork-Id: 1977056
From: Patrick O'Neill
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com,
 jeffreyalaw@gmail.com, gnu-toolchain@rivosinc.com, Patrick O'Neill
Subject: [PATCH v2 1/9] RISC-V: Fix vid const vector expander for
 non-npatterns size steps
Date: Mon, 26 Aug 2024 17:36:55 -0700
Message-ID: <20240827003710.1513605-2-patrick@rivosinc.com>
X-Mailer: git-send-email 2.43.2
In-Reply-To: <20240827003710.1513605-1-patrick@rivosinc.com>
References: <20240827003710.1513605-1-patrick@rivosinc.com>

Prior to this patch the expander would emit vectors like:
{ 0, 0, 5, 5, 10, 10, ...}
as:
{ 0, 0, 2, 2, 4, 4, ...}

This patch sets the step size to the requested value.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector): Fix STEP size in
	expander.

Signed-off-by: Patrick O'Neill
---
Detected with the existing testsuite after patch 8/9 is applied:
FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 execution test
FAIL: gcc.dg/torture/vshuf-v8hi.c -O2 execution test
FAIL: gcc.dg/torture/vshuf-v8qi.c -O2 execution test
---
 gcc/config/riscv/riscv-v.cc | 48 ++++++++++++++++++++++++++++++++-----
 1 file changed, 42 insertions(+), 6 deletions(-)

-- 
2.34.1

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index c89603669e3..a3039a2cb19 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1312,25 +1312,61 @@ expand_const_vector (rtx target, rtx src)
 	  /* Generate the variable-length vector following this rule:
 	     { a, a, a + step, a + step, a + step * 2, a + step * 2, ...}
 	     E.g. { 0, 0, 8, 8, 16, 16, ... } */
-	  /* We want to create a pattern where value[ix] = floor (ix /
+
+	  /* We want to create a pattern where value[idx] = floor (idx /
 	     NPATTERNS). As NPATTERNS is always a power of two we can
-	     rewrite this as = ix & -NPATTERNS. */
+	     rewrite this as = idx & -NPATTERNS. */
 	  /* Step 2: VID AND -NPATTERNS:
 	     { 0&-4, 1&-4, 2&-4, 3 &-4, 4 &-4, 5 &-4, 6 &-4, 7 &-4, ... }
 	  */
 	  rtx imm
 	    = gen_int_mode (-builder.npatterns (), builder.inner_mode ());
-	  rtx tmp = gen_reg_rtx (builder.mode ());
-	  rtx and_ops[] = {tmp, vid, imm};
+	  rtx tmp1 = gen_reg_rtx (builder.mode ());
+	  rtx and_ops[] = {tmp1, vid, imm};
 	  icode = code_for_pred_scalar (AND, builder.mode ());
 	  emit_vlmax_insn (icode, BINARY_OP, and_ops);
+
+	  /* Step 3: Convert to step size 1. */
+	  rtx tmp2 = gen_reg_rtx (builder.mode ());
+	  /* log2 (npatterns) to get the shift amount to convert
+	     Eg. { 0, 0, 0, 0, 4, 4, ... }
+	     into { 0, 0, 0, 0, 1, 1, ... }. */
+	  HOST_WIDE_INT shift_amt = exact_log2 (builder.npatterns ()) ;
+	  rtx shift = gen_int_mode (shift_amt, builder.inner_mode ());
+	  rtx shift_ops[] = {tmp2, tmp1, shift};
+	  icode = code_for_pred_scalar (ASHIFTRT, builder.mode ());
+	  emit_vlmax_insn (icode, BINARY_OP, shift_ops);
+
+	  /* Step 4: Multiply to step size n. */
+	  HOST_WIDE_INT step_size =
+	    INTVAL (builder.elt (builder.npatterns ()))
+	    - INTVAL (builder.elt (0));
+	  rtx tmp3 = gen_reg_rtx (builder.mode ());
+	  if (pow2p_hwi (step_size))
+	    {
+	      /* Power of 2 can be handled with a left shift. */
+	      HOST_WIDE_INT shift = exact_log2 (step_size);
+	      rtx shift_amount = gen_int_mode (shift, Pmode);
+	      insn_code icode = code_for_pred_scalar (ASHIFT, mode);
+	      rtx ops[] = {tmp3, tmp2, shift_amount};
+	      emit_vlmax_insn (icode, BINARY_OP, ops);
+	    }
+	  else
+	    {
+	      rtx mult_amt = gen_int_mode (step_size, builder.inner_mode ());
+	      insn_code icode = code_for_pred_scalar (MULT, builder.mode ());
+	      rtx ops[] = {tmp3, tmp2, mult_amt};
+	      emit_vlmax_insn (icode, BINARY_OP, ops);
+	    }
+
+	  /* Step 5: Add starting value to all elements. */
 	  HOST_WIDE_INT init_val = INTVAL (builder.elt (0));
 	  if (init_val == 0)
-	    emit_move_insn (target, tmp);
+	    emit_move_insn (target, tmp3);
 	  else
 	    {
 	      rtx dup = gen_const_vector_dup (builder.mode (), init_val);
-	      rtx add_ops[] = {target, tmp, dup};
+	      rtx add_ops[] = {target, tmp3, dup};
 	      icode = code_for_pred (PLUS, builder.mode ());
 	      emit_vlmax_insn (icode, BINARY_OP, add_ops);
 	    }
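
For readers who want to check the arithmetic of Steps 2 to 5 without building GCC, here is a minimal scalar sketch of what each vector lane computes. It is not part of the patch and does not use any GCC API: the names model_stepped_vector, model_pre_patch and log2_pow2 are hypothetical, and the model assumes non-negative lane indices and a power-of-two npatterns. Note that idx & -NPATTERNS yields floor (idx / NPATTERNS) * NPATTERNS rather than floor (idx / NPATTERNS) itself, which is why the right shift in Step 3 is needed before scaling by the requested step.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for exact_log2 on a positive power of two.
static int log2_pow2 (std::int64_t x)
{
  int n = 0;
  while ((std::int64_t{1} << n) < x)
    ++n;
  return n;
}

// Scalar model of the patched sequence, one loop iteration per lane.
// This is an illustration only, not the expander itself.
std::vector<std::int64_t>
model_stepped_vector (std::int64_t npatterns, std::int64_t step,
		      std::int64_t init, int lanes)
{
  // The AND trick relies on npatterns being a power of two.
  assert (npatterns > 0 && (npatterns & (npatterns - 1)) == 0);
  std::vector<std::int64_t> out;
  for (std::int64_t idx = 0; idx < lanes; ++idx)
    {
      std::int64_t tmp1 = idx & -npatterns;		 // Step 2: clear low bits.
      std::int64_t tmp2 = tmp1 >> log2_pow2 (npatterns); // Step 3: step size 1.
      std::int64_t tmp3 = tmp2 * step;			 // Step 4: step size n.
      out.push_back (tmp3 + init);			 // Step 5: add start value.
    }
  return out;
}

// Scalar model of the pre-patch sequence: Steps 3 and 4 are missing, so
// the step is always npatterns regardless of the requested step.
std::vector<std::int64_t>
model_pre_patch (std::int64_t npatterns, std::int64_t init, int lanes)
{
  std::vector<std::int64_t> out;
  for (std::int64_t idx = 0; idx < lanes; ++idx)
    out.push_back ((idx & -npatterns) + init);
  return out;
}
```

With these definitions, model_stepped_vector (2, 5, 0, 6) gives { 0, 0, 5, 5, 10, 10 }, while model_pre_patch (2, 0, 6) gives { 0, 0, 2, 2, 4, 4 }, matching the two vectors in the commit message. The sketch always multiplies in Step 4; the patch instead uses a left shift when the step is a power of two, which produces the same values.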