From patchwork Tue Aug 27 00:36:57 2024
X-Patchwork-Submitter: Patrick O'Neill
X-Patchwork-Id: 1977063
From: Patrick O'Neill
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, gnu-toolchain@rivosinc.com, Patrick O'Neill
Subject: [PATCH v2 3/9] RISC-V: Handle case when constant vector construction target rtx is not a register
Date: Mon, 26 Aug 2024 17:36:57 -0700
Message-ID: <20240827003710.1513605-4-patrick@rivosinc.com>
X-Mailer: git-send-email 2.43.2
In-Reply-To: <20240827003710.1513605-1-patrick@rivosinc.com>
References: <20240827003710.1513605-1-patrick@rivosinc.com>
MIME-Version: 1.0

When the constant vector construction target rtx is not a register, the
expansion manifests as RTL that is optimized away, which causes runtime
failures in the testsuite.

Update all patterns in expand_const_vector to use a temporary result
register when required and move it into the target afterwards. (A
simplified sketch of the idiom follows the diff below.)

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector): Use tmp register
	if needed.

Signed-off-by: Patrick O'Neill
---
 gcc/config/riscv/riscv-v.cc | 73 +++++++++++++++++++++----------------
 1 file changed, 41 insertions(+), 32 deletions(-)

--
2.34.1

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index a3039a2cb19..aea4b9b872b 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1150,26 +1150,29 @@ static void
 expand_const_vector (rtx target, rtx src)
 {
   machine_mode mode = GET_MODE (target);
+  rtx result = register_operand (target, mode) ? target : gen_reg_rtx (mode);
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
     {
       rtx elt;
       gcc_assert (
         const_vec_duplicate_p (src, &elt)
         && (rtx_equal_p (elt, const0_rtx) || rtx_equal_p (elt, const1_rtx)));
-      rtx ops[] = {target, src};
+      rtx ops[] = {result, src};
       emit_vlmax_insn (code_for_pred_mov (mode), UNARY_MASK_OP, ops);
+
+      if (result != target)
+        emit_move_insn (target, result);
       return;
     }
 
   rtx elt;
   if (const_vec_duplicate_p (src, &elt))
     {
-      rtx tmp = register_operand (target, mode) ? target : gen_reg_rtx (mode);
       /* Element in range -16 ~ 15 integer or 0.0 floating-point,
          we use vmv.v.i instruction.  */
       if (satisfies_constraint_vi (src) || satisfies_constraint_Wc0 (src))
         {
-          rtx ops[] = {tmp, src};
+          rtx ops[] = {result, src};
           emit_vlmax_insn (code_for_pred_mov (mode), UNARY_OP, ops);
         }
       else
@@ -1186,7 +1189,7 @@ expand_const_vector (rtx target, rtx src)
              instruction (vsetvl a5, zero).  */
           if (lra_in_progress)
             {
-              rtx ops[] = {tmp, elt};
+              rtx ops[] = {result, elt};
               emit_vlmax_insn (code_for_pred_broadcast (mode), UNARY_OP, ops);
             }
           else
@@ -1194,15 +1197,15 @@ expand_const_vector (rtx target, rtx src)
               struct expand_operand ops[2];
               enum insn_code icode = optab_handler (vec_duplicate_optab, mode);
               gcc_assert (icode != CODE_FOR_nothing);
-              create_output_operand (&ops[0], tmp, mode);
+              create_output_operand (&ops[0], result, mode);
               create_input_operand (&ops[1], elt, GET_MODE_INNER (mode));
               expand_insn (icode, 2, ops);
-              tmp = ops[0].value;
+              result = ops[0].value;
             }
         }
 
-      if (tmp != target)
-        emit_move_insn (target, tmp);
+      if (result != target)
+        emit_move_insn (target, result);
       return;
     }
 
@@ -1210,7 +1213,10 @@ expand_const_vector (rtx target, rtx src)
   rtx base, step;
   if (const_vec_series_p (src, &base, &step))
     {
-      expand_vec_series (target, base, step);
+      expand_vec_series (result, base, step);
+
+      if (result != target)
+        emit_move_insn (target, result);
       return;
     }
 
@@ -1243,7 +1249,7 @@ expand_const_vector (rtx target, rtx src)
              all element equal to 0x0706050403020100.  */
           rtx ele = builder.get_merged_repeating_sequence ();
           rtx dup = expand_vector_broadcast (builder.new_mode (), ele);
-          emit_move_insn (target, gen_lowpart (mode, dup));
+          emit_move_insn (result, gen_lowpart (mode, dup));
         }
       else
         {
@@ -1272,8 +1278,8 @@ expand_const_vector (rtx target, rtx src)
           emit_vlmax_insn (code_for_pred_scalar (AND, builder.int_mode ()),
                            BINARY_OP, and_ops);
 
-          rtx tmp = gen_reg_rtx (builder.mode ());
-          rtx dup_ops[] = {tmp, builder.elt (0)};
+          rtx tmp1 = gen_reg_rtx (builder.mode ());
+          rtx dup_ops[] = {tmp1, builder.elt (0)};
           emit_vlmax_insn (code_for_pred_broadcast (builder.mode ()), UNARY_OP,
                            dup_ops);
           for (unsigned int i = 1; i < builder.npatterns (); i++)
@@ -1285,12 +1291,12 @@ expand_const_vector (rtx target, rtx src)
               /* Merge scalar to each i.  */
               rtx tmp2 = gen_reg_rtx (builder.mode ());
-              rtx merge_ops[] = {tmp2, tmp, builder.elt (i), mask};
+              rtx merge_ops[] = {tmp2, tmp1, builder.elt (i), mask};
               insn_code icode = code_for_pred_merge_scalar (builder.mode ());
               emit_vlmax_insn (icode, MERGE_OP, merge_ops);
-              tmp = tmp2;
+              tmp1 = tmp2;
             }
-          emit_move_insn (target, tmp);
+          emit_move_insn (result, tmp1);
         }
     }
   else if (CONST_VECTOR_STEPPED_P (src))
     {
@@ -1362,11 +1368,11 @@ expand_const_vector (rtx target, rtx src)
           /* Step 5: Add starting value to all elements.  */
           HOST_WIDE_INT init_val = INTVAL (builder.elt (0));
           if (init_val == 0)
-            emit_move_insn (target, tmp3);
+            emit_move_insn (result, tmp3);
           else
             {
               rtx dup = gen_const_vector_dup (builder.mode (), init_val);
-              rtx add_ops[] = {target, tmp3, dup};
+              rtx add_ops[] = {result, tmp3, dup};
               icode = code_for_pred (PLUS, builder.mode ());
               emit_vlmax_insn (icode, BINARY_OP, add_ops);
             }
@@ -1396,7 +1402,7 @@ expand_const_vector (rtx target, rtx src)
 
           /* Step 2: Generate result = VID + diff.  */
           rtx vec = v.build ();
-          rtx add_ops[] = {target, vid, vec};
+          rtx add_ops[] = {result, vid, vec};
           emit_vlmax_insn (code_for_pred (PLUS, builder.mode ()),
                            BINARY_OP, add_ops);
         }
@@ -1412,24 +1418,24 @@ expand_const_vector (rtx target, rtx src)
             v.quick_push (builder.elt (i));
           rtx new_base = v.build ();
 
-          /* Step 2: Generate tmp = VID >> LOG2 (NPATTERNS).  */
+          /* Step 2: Generate tmp1 = VID >> LOG2 (NPATTERNS).  */
           rtx shift_count = gen_int_mode (exact_log2 (builder.npatterns ()),
                                           builder.inner_mode ());
-          rtx tmp = expand_simple_binop (builder.mode (), LSHIFTRT,
+          rtx tmp1 = expand_simple_binop (builder.mode (), LSHIFTRT,
                                          vid, shift_count, NULL_RTX, false,
                                          OPTAB_DIRECT);
 
-          /* Step 3: Generate tmp2 = tmp * step.  */
+          /* Step 3: Generate tmp2 = tmp1 * step.  */
           rtx tmp2 = gen_reg_rtx (builder.mode ());
           rtx step = simplify_binary_operation (MINUS, builder.inner_mode (),
                                                 builder.elt (v.npatterns()),
                                                 builder.elt (0));
-          expand_vec_series (tmp2, const0_rtx, step, tmp);
+          expand_vec_series (tmp2, const0_rtx, step, tmp1);
 
-          /* Step 4: Generate target = tmp2 + new_base.  */
-          rtx add_ops[] = {target, tmp2, new_base};
+          /* Step 4: Generate result = tmp2 + new_base.  */
+          rtx add_ops[] = {result, tmp2, new_base};
           emit_vlmax_insn (code_for_pred (PLUS, builder.mode ()),
                            BINARY_OP, add_ops);
         }
@@ -1462,13 +1468,13 @@ expand_const_vector (rtx target, rtx src)
       if (int_mode_for_size (new_smode_bitsize, 0).exists (&new_smode)
           && get_vector_mode (new_smode, new_nunits).exists (&new_mode))
         {
-          rtx tmp = gen_reg_rtx (new_mode);
+          rtx tmp1 = gen_reg_rtx (new_mode);
           base1 = gen_int_mode (rtx_to_poly_int64 (base1), new_smode);
-          expand_vec_series (tmp, base1, gen_int_mode (step1, new_smode));
+          expand_vec_series (tmp1, base1, gen_int_mode (step1, new_smode));
 
           if (rtx_equal_p (base2, const0_rtx) && known_eq (step2, 0))
             /* { 1, 0, 2, 0, ... }.  */
-            emit_move_insn (target, gen_lowpart (mode, tmp));
+            emit_move_insn (result, gen_lowpart (mode, tmp1));
           else if (known_eq (step2, 0))
             {
               /* { 1, 1, 2, 1, ... }.  */
@@ -1478,10 +1484,10 @@ expand_const_vector (rtx target, rtx src)
                 gen_int_mode (builder.inner_bits_size (), new_smode),
                 NULL_RTX, false, OPTAB_DIRECT);
               rtx tmp2 = gen_reg_rtx (new_mode);
-              rtx and_ops[] = {tmp2, tmp, scalar};
+              rtx and_ops[] = {tmp2, tmp1, scalar};
               emit_vlmax_insn (code_for_pred_scalar (AND, new_mode),
                                BINARY_OP, and_ops);
-              emit_move_insn (target, gen_lowpart (mode, tmp2));
+              emit_move_insn (result, gen_lowpart (mode, tmp2));
             }
           else
             {
@@ -1495,10 +1501,10 @@ expand_const_vector (rtx target, rtx src)
                 gen_int_mode (builder.inner_bits_size (), Pmode),
                 NULL_RTX, false, OPTAB_DIRECT);
               rtx tmp3 = gen_reg_rtx (new_mode);
-              rtx ior_ops[] = {tmp3, tmp, shifted_tmp2};
+              rtx ior_ops[] = {tmp3, tmp1, shifted_tmp2};
               emit_vlmax_insn (code_for_pred (IOR, new_mode), BINARY_OP,
                                ior_ops);
-              emit_move_insn (target, gen_lowpart (mode, tmp3));
+              emit_move_insn (result, gen_lowpart (mode, tmp3));
             }
         }
       else
@@ -1526,7 +1532,7 @@ expand_const_vector (rtx target, rtx src)
           rtx mask = gen_reg_rtx (builder.mask_mode ());
           expand_vec_cmp (mask, EQ, and_vid, CONST1_RTX (mode));
 
-          rtx ops[] = {target, tmp1, tmp2, mask};
+          rtx ops[] = {result, tmp1, tmp2, mask};
           emit_vlmax_insn (code_for_pred_merge (mode), MERGE_OP, ops);
         }
     }
@@ -1536,6 +1542,9 @@ expand_const_vector (rtx target, rtx src)
     }
   else
     gcc_unreachable ();
+
+  if (result != target)
+    emit_move_insn (target, result);
 }
 
 /* Get the frm mode with given CONST_INT rtx, the default mode is
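
For readers following along, below is a simplified, standalone sketch of the
idiom the patch applies throughout expand_const_vector: expand into a scratch
pseudo whenever the caller-supplied target is not a register operand, then
copy the scratch into the real target once at the end. It is an illustration
only, not the committed code; the function name expand_const_vector_sketch is
made up for this example, it reuses helpers that already appear in the diff
above (register_operand, gen_reg_rtx, emit_vlmax_insn, code_for_pred_mov,
emit_move_insn), and it shows a single expansion path rather than all of them.

/* Sketch only (not the committed code): expand a constant-vector move
   into RESULT, which is TARGET itself only when TARGET can be used
   directly as an instruction operand.  */
static void
expand_const_vector_sketch (rtx target, rtx src)
{
  machine_mode mode = GET_MODE (target);

  /* If TARGET is not a register operand (for example a memory
     destination), expand into a fresh pseudo instead.  */
  rtx result = register_operand (target, mode) ? target : gen_reg_rtx (mode);

  /* Every expansion path writes into RESULT ...  */
  rtx ops[] = {result, src};
  emit_vlmax_insn (code_for_pred_mov (mode), UNARY_OP, ops);

  /* ... and one guarded move at the end transfers the value into the
     original TARGET when a temporary was used.  */
  if (result != target)
    emit_move_insn (target, result);
}

The same shape is visible in every hunk above: each emitted instruction now
writes to result instead of target (or a local tmp), and every exit from the
function is followed by the guarded emit_move_insn back into target.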