From patchwork Thu Aug 8 17:10:09 2024
X-Patchwork-Submitter: Raphael Moreira Zinsly
X-Patchwork-Id: 1970662
From: Raphael Moreira Zinsly
To: gcc-patches@gcc.gnu.org
Cc: jlaw@ventanamicro.com, Raphael Zinsly
Subject: [PATCH 1/2] RISC-V: Constant synthesis with same upper and lower halves
Date: Thu, 8 Aug 2024 14:10:09 -0300
Message-ID: <20240808171010.16216-1-rzinsly@ventanamicro.com>

From: Raphael Zinsly

Improve handling of constants whose upper and lower 32-bit halves are
the same and Zbkb is not available in riscv_move_integer.
riscv_split_integer already handles this, but the changes in
riscv_build_integer make it possible to improve code generation for
negative values.

e.g. for:

unsigned long f (void) { return 0xf857f2def857f2deUL; }

Without the patch:

  li      a0,-128454656
  addi    a0,a0,734
  li      a5,-128454656
  addi    a5,a5,735
  slli    a5,a5,32
  add     a0,a5,a0

With the patch:

  li      a0,128454656
  addi    a0,a0,-735
  slli    a5,a0,32
  add     a0,a0,a5
  xori    a0,a0,-1

gcc/ChangeLog:
	* config/riscv/riscv.cc (riscv_build_integer): Detect constants
	with the same 32-bit halves and without Zbkb.
	(riscv_move_integer): Add synthesis of these constants.

gcc/testsuite/ChangeLog:
	* gcc.target/riscv/synthesis-11.c: New test.

Co-authored-by: Jeff Law
---
 gcc/config/riscv/riscv.cc                     | 59 +++++++++++++++++--
 gcc/testsuite/gcc.target/riscv/synthesis-11.c | 40 +++++++++++++
 2 files changed, 93 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/synthesis-11.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8ece7859945..454220d8ba4 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1223,6 +1223,43 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value,
         }
 
     }
+  else if (cost > 3 && TARGET_64BIT && can_create_pseudo_p ())
+    {
+      struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS];
+      int alt_cost;
+
+      unsigned HOST_WIDE_INT loval = value & 0xffffffff;
+      unsigned HOST_WIDE_INT hival = (value & ~loval) >> 32;
+      bool bit31 = (hival & 0x80000000) != 0;
+      /* Without pack we can generate it with a shift 32 followed by an or.  */
+      if (hival == loval && !bit31)
+        {
+          alt_cost = 2 + riscv_build_integer_1 (alt_codes,
+                                                sext_hwi (loval, 32), mode);
+          if (alt_cost < cost)
+            {
+              /* We need to save the first constant we build.  */
+              alt_codes[alt_cost - 3].save_temporary = true;
+
+              /* Now we want to shift the previously generated constant into the
+                 high half.  */
+              alt_codes[alt_cost - 2].code = ASHIFT;
+              alt_codes[alt_cost - 2].value = 32;
+              alt_codes[alt_cost - 2].use_uw = false;
+              alt_codes[alt_cost - 2].save_temporary = false;
+
+              /* And the final step, IOR the two halves together.  Since this uses
+                 the saved temporary, use CONCAT similar to what we do for Zbkb.  */
+              alt_codes[alt_cost - 1].code = CONCAT;
+              alt_codes[alt_cost - 1].value = 0;
+              alt_codes[alt_cost - 1].use_uw = false;
+              alt_codes[alt_cost - 1].save_temporary = false;
+
+              memcpy (codes, alt_codes, sizeof (alt_codes));
+              cost = alt_cost;
+            }
+        }
+    }
 
   return cost;
 }
@@ -2786,12 +2823,22 @@ riscv_move_integer (rtx temp, rtx dest, HOST_WIDE_INT value,
             }
           else if (codes[i].code == CONCAT || codes[i].code == VEC_MERGE)
             {
-              rtx t = can_create_pseudo_p () ? gen_reg_rtx (mode) : temp;
-              rtx t2 = codes[i].code == VEC_MERGE ? old_value : x;
-              gcc_assert (t2);
-              t2 = gen_lowpart (SImode, t2);
-              emit_insn (gen_riscv_xpack_di_si_2 (t, x, GEN_INT (32), t2));
-              x = t;
+              if (codes[i].code == CONCAT && !TARGET_ZBKB)
+                {
+                  /* The two values should have no bits in common, so we can
+                     use PLUS instead of IOR which has a higher chance of
+                     using a compressed instruction.  */
+                  x = gen_rtx_PLUS (mode, x, old_value);
+                }
+              else
+                {
+                  rtx t = can_create_pseudo_p () ? gen_reg_rtx (mode) : temp;
+                  rtx t2 = codes[i].code == VEC_MERGE ? old_value : x;
+                  gcc_assert (t2);
+                  t2 = gen_lowpart (SImode, t2);
+                  emit_insn (gen_riscv_xpack_di_si_2 (t, x, GEN_INT (32), t2));
+                  x = t;
+                }
             }
           else
             x = gen_rtx_fmt_ee (codes[i].code, mode,
diff --git a/gcc/testsuite/gcc.target/riscv/synthesis-11.c b/gcc/testsuite/gcc.target/riscv/synthesis-11.c
new file mode 100644
index 00000000000..98401d5ca32
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/synthesis-11.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* We aggressively skip as we really just need to test the basic synthesis
+   which shouldn't vary based on the optimization level.  -O1 seems to work
+   and eliminates the usual sources of extraneous dead code that would throw
+   off the counts.  */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O2" "-O3" "-Os" "-Oz" "-flto" } } */
+/* { dg-options "-march=rv64gc" } */
+
+/* Rather than test for a specific synthesis of all these constants or
+   having thousands of tests each testing one variant, we just test the
+   total number of instructions.
+
+   This isn't expected to change much and any change is worthy of a look.  */
+/* { dg-final { scan-assembler-times "\\t(add|addi|bseti|li|pack|ret|sh1add|sh2add|sh3add|slli|srli|xori|or)" 114 } } */
+
+
+
+unsigned long foo_0x7857f2de7857f2de(void) { return 0x7857f2de7857f2deUL; }
+unsigned long foo_0x19660e6319660e63(void) { return 0x19660e6319660e63UL; }
+unsigned long foo_0x137f1b75137f1b75(void) { return 0x137f1b75137f1b75UL; }
+unsigned long foo_0x35019fa035019fa0(void) { return 0x35019fa035019fa0UL; }
+unsigned long foo_0x3828e6c13828e6c1(void) { return 0x3828e6c13828e6c1UL; }
+unsigned long foo_0x039d87e9039d87e9(void) { return 0x039d87e9039d87e9UL; }
+unsigned long foo_0x429617c1429617c1(void) { return 0x429617c1429617c1UL; }
+unsigned long foo_0x2411811924118119(void) { return 0x2411811924118119UL; }
+unsigned long foo_0x0c01df7d0c01df7d(void) { return 0x0c01df7d0c01df7dUL; }
+unsigned long foo_0x70e23d6b70e23d6b(void) { return 0x70e23d6b70e23d6bUL; }
+unsigned long foo_0xf857f2def857f2de(void) { return 0xf857f2def857f2deUL; }
+unsigned long foo_0x99660e6399660e63(void) { return 0x99660e6399660e63UL; }
+unsigned long foo_0x937f1b75937f1b75(void) { return 0x937f1b75937f1b75UL; }
+unsigned long foo_0xb5019fa0b5019fa0(void) { return 0xb5019fa0b5019fa0UL; }
+unsigned long foo_0xb828e6c1b828e6c1(void) { return 0xb828e6c1b828e6c1UL; }
+unsigned long foo_0x839d87e9839d87e9(void) { return 0x839d87e9839d87e9UL; }
+unsigned long foo_0xc29617c1c29617c1(void) { return 0xc29617c1c29617c1UL; }
+unsigned long foo_0xa4118119a4118119(void) { return 0xa4118119a4118119UL; }
+unsigned long foo_0x8c01df7d8c01df7d(void) { return 0x8c01df7d8c01df7dUL; }
+unsigned long foo_0xf0e23d6bf0e23d6b(void) { return 0xf0e23d6bf0e23d6bUL; }
+unsigned long foo_0x7fff00007fff0000(void) { return 0x7fff00007fff0000UL; }
+
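
Not part of the patch: a minimal host-side C sketch of the arithmetic the new sequence relies on, in case it helps review. It assumes only what the commit message and the comments above state: the low half is built first, a copy is shifted left by 32, and the two are combined with an add (equivalent to an OR because bit 31 is clear, so the halves share no bits). The helper names synth_same_halves, synth_inverted and check_examples are hypothetical and do not exist in GCC; the inverted variant simply replays the "With the patch" sequence from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild a 64-bit constant with equal halves from
   its low 32 bits.  Mirrors "build low half; slli 32; add".  */
static uint64_t
synth_same_halves (uint32_t lo)
{
  /* li/addi on RV64 produce a sign-extended 32-bit value.  */
  uint64_t x = (uint64_t) (int64_t) (int32_t) lo;
  uint64_t hi = x << 32;        /* slli a5,a0,32 */
  return hi + x;                /* add  a0,a0,a5 */
}

/* Hypothetical helper: replay the commit message's sequence for a
   constant whose halves have bit 31 set.  Build the complemented half
   (bit 31 clear), duplicate it, then invert all 64 bits.  */
static uint64_t
synth_inverted (uint32_t lo)
{
  return ~synth_same_halves (~lo);   /* xori a0,a0,-1 */
}

/* Checks two constants taken from synthesis-11.c.  */
static void
check_examples (void)
{
  uint64_t x = 0x7857f2deu;
  /* With bit 31 clear the halves share no bits, so PLUS equals IOR.  */
  assert (((x << 32) + x) == ((x << 32) | x));
  assert (synth_same_halves (0x7857f2deu) == 0x7857f2de7857f2deULL);
  assert (synth_inverted (0xf857f2deu) == 0xf857f2def857f2deULL);
}
```

Calling check_examples () from a small test driver is enough to exercise both cases. The sketch says nothing about instruction cost, which is what riscv_build_integer actually compares.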