From patchwork Sat Dec 9 17:03:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xi Ruoyao X-Patchwork-Id: 1874055 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=xry111.site header.i=@xry111.site header.a=rsa-sha256 header.s=default header.b=ECJMeFMU; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SnZ9z2lCmz23n0 for ; Sun, 10 Dec 2023 04:04:39 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 658BF3858C78 for ; Sat, 9 Dec 2023 17:04:37 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from xry111.site (xry111.site [89.208.246.23]) by sourceware.org (Postfix) with ESMTPS id 3CCDA3858C60 for ; Sat, 9 Dec 2023 17:04:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3CCDA3858C60 Authentication-Results: sourceware.org; dmarc=pass (p=reject dis=none) header.from=xry111.site Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=xry111.site ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 3CCDA3858C60 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=89.208.246.23 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702141464; cv=none; b=OkHufCQ8UKye0fnt2SYYwT9IJtS8EfG/MP/8ydezL94FO6rm0+J66rxb5n3emop6DfNHr0S6iVvQbZTc5tl95XBDRybMpDZQEN95mbcKQlMV0GvVYLmiiwJsH3d39iEQbc4Uj3kcWdMIdbnud5MSbKdVy21TyyTbVlBMYiW7aMc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702141464; c=relaxed/simple; bh=CPsS9dFnQkdRt05B5dvbtKfto+zoSlP5xqtM2Ls0oB4=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=O3UrDB5QmwA4SwRMfopPICH7xQgWbmaGivobh+CsM5PkNC71F28j/EX5+V1y+WhNdabysI9RqI7yuIFzl7qGpnyaLrmu+wOQ0exed/s/mbjTsXNIYbPhnX1lNxnhUJqmu/L3Dead1AiOR+dfI1znLRlQWF7+cT1Dabijbit7iWc= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=xry111.site; s=default; t=1702141462; bh=CPsS9dFnQkdRt05B5dvbtKfto+zoSlP5xqtM2Ls0oB4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ECJMeFMUHsesivJXr9E+hefTr0v8drMD5z4CKlEElgftdWZW4AgBqXX9JurTr0r0l Nz8S2zfSyYqzdfKiDugGLcz/oz26rCri0jDUz9dwb1mQEhdOP0aifOchumACM3NF8H 2B8pRDW9p/Z5843qpz41ExYkHT92Bt0OqbJnt8RU= Received: from stargazer.. (unknown [IPv6:240e:358:1144:3a00:dc73:854d:832e:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (Client did not present a certificate) (Authenticated sender: xry111@xry111.site) by xry111.site (Postfix) with ESMTPSA id 7B1286693F; Sat, 9 Dec 2023 12:04:19 -0500 (EST) From: Xi Ruoyao To: gcc-patches@gcc.gnu.org Cc: chenglulu , i@xen0n.name, xuchenghua@loongson.cn, c@jia.je, Xi Ruoyao Subject: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936] Date: Sun, 10 Dec 2023 01:03:47 +0800 Message-ID: <20231209170347.12601-4-xry111@xry111.site> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231209170347.12601-2-xry111@xry111.site> References: <20231209170347.12601-2-xry111@xry111.site> MIME-Version: 1.0 X-Spam-Status: No, score=-9.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, LIKELY_SPAM_FROM, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl and slli. gcc/ChangeLog: PR target/112936 * config/loongarch/loongarch-def.cc (loongarch_rtx_cost_data::loongarch_rtx_cost_data): Update instruction costs per micro-benchmark results. (loongarch_rtx_cost_optimize_size): Set all instruction costs to (COSTS_N_INSNS (1) + 1). * config/loongarch/loongarch.cc (loongarch_rtx_costs): Remove special case for multiplication when optimizing for size. Adjust division cost when TARGET_64BIT && !TARGET_DIV32. Account the extra cost when TARGET_CHECK_ZERO_DIV and optimizing for speed. gcc/testsuite/ChangeLog PR target/112936 * gcc.target/loongarch/mul-const-reduction.c: New test. --- gcc/config/loongarch/loongarch-def.cc | 39 ++++++++++--------- gcc/config/loongarch/loongarch.cc | 22 +++++------ .../loongarch/mul-const-reduction.c | 11 ++++++ 3 files changed, 43 insertions(+), 29 deletions(-) create mode 100644 gcc/testsuite/gcc.target/loongarch/mul-const-reduction.c diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc index 6217b19268c..4a8885e8343 100644 --- a/gcc/config/loongarch/loongarch-def.cc +++ b/gcc/config/loongarch/loongarch-def.cc @@ -92,15 +92,15 @@ array_tune loongarch_cpu_align = /* Default RTX cost initializer. */ loongarch_rtx_cost_data::loongarch_rtx_cost_data () - : fp_add (COSTS_N_INSNS (1)), - fp_mult_sf (COSTS_N_INSNS (2)), - fp_mult_df (COSTS_N_INSNS (4)), - fp_div_sf (COSTS_N_INSNS (6)), + : fp_add (COSTS_N_INSNS (5)), + fp_mult_sf (COSTS_N_INSNS (5)), + fp_mult_df (COSTS_N_INSNS (5)), + fp_div_sf (COSTS_N_INSNS (8)), fp_div_df (COSTS_N_INSNS (8)), - int_mult_si (COSTS_N_INSNS (1)), - int_mult_di (COSTS_N_INSNS (1)), - int_div_si (COSTS_N_INSNS (4)), - int_div_di (COSTS_N_INSNS (6)), + int_mult_si (COSTS_N_INSNS (4)), + int_mult_di (COSTS_N_INSNS (4)), + int_div_si (COSTS_N_INSNS (5)), + int_div_di (COSTS_N_INSNS (5)), branch_cost (6), memory_latency (4) {} @@ -111,18 +111,21 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data () array_tune loongarch_cpu_rtx_cost_data = array_tune (); -/* RTX costs to use when optimizing for size. */ +/* RTX costs to use when optimizing for size. + We use a value slightly larger than COSTS_N_INSNS (1) for all of them + because they are slower than simple instructions. */ +#define COST_COMPLEX_INSN (COSTS_N_INSNS (1) + 1) const loongarch_rtx_cost_data loongarch_rtx_cost_optimize_size = loongarch_rtx_cost_data () - .fp_add_ (4) - .fp_mult_sf_ (4) - .fp_mult_df_ (4) - .fp_div_sf_ (4) - .fp_div_df_ (4) - .int_mult_si_ (4) - .int_mult_di_ (4) - .int_div_si_ (4) - .int_div_di_ (4); + .fp_add_ (COST_COMPLEX_INSN) + .fp_mult_sf_ (COST_COMPLEX_INSN) + .fp_mult_df_ (COST_COMPLEX_INSN) + .fp_div_sf_ (COST_COMPLEX_INSN) + .fp_div_df_ (COST_COMPLEX_INSN) + .int_mult_si_ (COST_COMPLEX_INSN) + .int_mult_di_ (COST_COMPLEX_INSN) + .int_div_si_ (COST_COMPLEX_INSN) + .int_div_di_ (COST_COMPLEX_INSN); array_tune loongarch_cpu_issue_rate = array_tune () .set (CPU_NATIVE, 4) diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 754aeb8bfb7..f04b5798f39 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -3787,8 +3787,6 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code, *total = (speed ? loongarch_cost->int_mult_si * 3 + 6 : COSTS_N_INSNS (7)); - else if (!speed) - *total = COSTS_N_INSNS (1) + 1; else if (mode == DImode) *total = loongarch_cost->int_mult_di; else @@ -3823,14 +3821,18 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code, case UDIV: case UMOD: - if (!speed) - { - *total = COSTS_N_INSNS (loongarch_idiv_insns (mode)); - } - else if (mode == DImode) + if (mode == DImode) *total = loongarch_cost->int_div_di; else - *total = loongarch_cost->int_div_si; + { + *total = loongarch_cost->int_div_si; + if (TARGET_64BIT && !TARGET_DIV32) + *total += COSTS_N_INSNS (2); + } + + if (TARGET_CHECK_ZERO_DIV) + *total += COSTS_N_INSNS (2); + return false; case SIGN_EXTEND: @@ -3862,9 +3864,7 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code, && (GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 1)) == ZERO_EXTEND)))) { - if (!speed) - *total = COSTS_N_INSNS (1) + 1; - else if (mode == DImode) + if (mode == DImode) *total = loongarch_cost->int_mult_di; else *total = loongarch_cost->int_mult_si; diff --git a/gcc/testsuite/gcc.target/loongarch/mul-const-reduction.c b/gcc/testsuite/gcc.target/loongarch/mul-const-reduction.c new file mode 100644 index 00000000000..02d9a4876d8 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/mul-const-reduction.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mtune=la464" } */ +/* { dg-final { scan-assembler "alsl\.w" } } */ +/* { dg-final { scan-assembler "slli\.w" } } */ +/* { dg-final { scan-assembler-not "mul\.w" } } */ + +int +test (int a) +{ + return a * 68; +}