From patchwork Wed Dec 6 05:24:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 1872431
From: Jiufu Guo
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
 bergner@linux.ibm.com, guojiufu@linux.ibm.com
Subject: [PATCH V3 1/3]rs6000: update num_insns_constant for 2 insns
Date: Wed, 6 Dec 2023 13:24:25 +0800
Message-Id: <20231206052427.143889-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.25.1
List-Id: Gcc-patches mailing list

Hi,

Trunk GCC can now build more constants with two instructions, e.g.
"li/lis; xori/xoris/rldicl/rldicr/rldic", so num_insns_constant
should be updated to match.

The function "rs6000_emit_set_long_const" builds complicated constants,
and "num_insns_constant_gpr" computes how many instructions are needed
to build a constant.  So these two functions should be kept aligned.

The idea of this patch is to reuse "rs6000_emit_set_long_const" to
compute/record the number of instructions; when computing the insn
count, no instructions are emitted.

Compared with the previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636565.html
this version updates "rs6000_emit_set_long_const" to use a condition
that selects either "computing the insn number" or "emitting the insn",
and keeps the two together to avoid misalignment in the future.

Bootstrap & regtest pass on ppc64{,le}.
Is this OK for trunk?

BR,
Jeff (Jiufu Guo)

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add new
	parameter to record number of instructions to build the constant.
	(num_insns_constant_gpr): Call rs6000_emit_set_long_const to compute
	num_insn.
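For illustration only (not part of the patch): a minimal, self-contained
sketch of the count-or-emit idea described above.  It assumes a toy
subset of the real logic (sign-extended 32-bit constants built with
li / lis / ori) and uses strings in place of RTL.  The names Emitter,
build_const and count_insns are hypothetical and do not exist in GCC.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the RTL emitter: collects mnemonics as strings.
struct Emitter {
  std::vector<std::string> insns;
};

// Build a sign-extended 32-bit constant C with li / lis / ori (toy subset).
// If NUM_INSNS is non-null, only count instructions and never touch OUT;
// otherwise append the instructions to OUT.
void build_const(Emitter *out, int64_t c, int *num_insns = nullptr)
{
  // Single decision point: count or emit.  Both modes walk the same
  // branches below, so the count cannot drift from the emitted sequence.
  auto count_or_emit = [&](const std::string &insn) {
    if (num_insns)
      ++*num_insns;
    else
      out->insns.push_back(insn);
  };

  assert(c >= INT32_MIN && c <= INT32_MAX);
  uint64_t ud1 = c & 0xffff;          // low 16 bits
  uint64_t ud2 = (c >> 16) & 0xffff;  // next 16 bits

  // Fits a signed 16-bit immediate: a single li.
  if (c >= -0x8000 && c <= 0x7fff) {
    count_or_emit("li " + std::to_string(c));
    return;
  }

  // Otherwise lis for the high half, then ori only if the low half is nonzero.
  count_or_emit("lis " + std::to_string(ud2));
  if (ud1 != 0)
    count_or_emit("ori " + std::to_string(ud1));
}

// Counting wrapper, analogous in spirit to num_insns_constant_gpr.
int count_insns(int64_t c)
{
  int n = 0;
  build_const(nullptr, c, &n);
  return n;
}
```

Calling count_insns (0x12345678) walks the same branches as
build_const (&e, 0x12345678) would, which is the property the patch
is after: the cost query and the emitter share one code path.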
---
 gcc/config/rs6000/rs6000.cc | 272 ++++++++++++++++++------------------
 1 file changed, 137 insertions(+), 135 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 3dfd79c4c43..dbdc72dce5d 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1115,7 +1115,7 @@ static tree rs6000_handle_longcall_attribute (tree *, tree, tree, int, bool *);
 static tree rs6000_handle_altivec_attribute (tree *, tree, tree, int, bool *);
 static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *);
 static tree rs6000_builtin_vectorized_libmass (combined_fn, tree, tree);
-static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT);
+static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT, int * = nullptr);
 static int rs6000_memory_move_cost (machine_mode, reg_class_t, bool);
 static bool rs6000_debug_rtx_costs (rtx, machine_mode, int, int, int *, bool);
 static int rs6000_debug_address_cost (rtx, machine_mode, addr_space_t,
@@ -6054,21 +6054,9 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
 
   else if (TARGET_POWERPC64)
     {
-      HOST_WIDE_INT low = sext_hwi (value, 32);
-      HOST_WIDE_INT high = value >> 31;
-
-      if (high == 0 || high == -1)
-        return 2;
-
-      high >>= 1;
-
-      if (low == 0 || low == high)
-        return num_insns_constant_gpr (high) + 1;
-      else if (high == 0)
-        return num_insns_constant_gpr (low) + 1;
-      else
-        return (num_insns_constant_gpr (high)
-                + num_insns_constant_gpr (low) + 1);
+      int num_insns = 0;
+      rs6000_emit_set_long_const (NULL, value, &num_insns);
+      return num_insns;
     }
 
   else
@@ -10494,14 +10482,13 @@ can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int *shift, HOST_WIDE_INT *mask)
 
 /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
    Output insns to set DEST equal to the constant C as a series of
-   lis, ori and shl instructions.  */
+   lis, ori and shl instructions.  If NUM_INSNS is not NULL, then
+   only increase *NUM_INSNS as the number of insns, and do not output
+   real insns.  */
 
 static void
-rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
+rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c, int *num_insns)
 {
-  rtx temp;
-  int shift;
-  HOST_WIDE_INT mask;
   HOST_WIDE_INT ud1, ud2, ud3, ud4;
 
   ud1 = c & 0xffff;
@@ -10509,168 +10496,183 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
   ud3 = (c >> 32) & 0xffff;
   ud4 = (c >> 48) & 0xffff;
 
-  if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000))
-      || (ud4 == 0 && ud3 == 0 && ud2 == 0 && ! (ud1 & 0x8000)))
-    emit_move_insn (dest, GEN_INT (sext_hwi (ud1, 16)));
+  /* This lambda is used to emit one insn or just increase the insn count.
+     When counting the insn number, no need to emit the insn.  Here, two
+     kinds of insns are needed: move and rldimi. */
+  auto count_or_emit_insn = [&num_insns] (rtx dest, rtx op1, rtx op2 = NULL) {
+    if (num_insns)
+      (*num_insns)++;
+    else if (!op2)
+      emit_move_insn (dest, op1);
+    else
+      emit_insn (gen_rotldi3_insert_3 (dest, op1, GEN_INT (32), op2,
+                                       GEN_INT (0xffffffff)));
+  };
 
-  else if ((ud4 == 0xffff && ud3 == 0xffff && (ud2 & 0x8000))
-           || (ud4 == 0 && ud3 == 0 && ! (ud2 & 0x8000)))
+  if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000))
+      || (ud4 == 0 && ud3 == 0 && ud2 == 0 && !(ud1 & 0x8000)))
     {
-      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
+      /* li */
+      count_or_emit_insn (dest, GEN_INT (sext_hwi (ud1, 16)));
+      return;
+    }
+
+  rtx temp = num_insns ? nullptr
+             : can_create_pseudo_p () ? gen_reg_rtx (DImode) : dest;
 
-      emit_move_insn (ud1 != 0 ? temp : dest,
-                      GEN_INT (sext_hwi (ud2 << 16, 32)));
+  if ((ud4 == 0xffff && ud3 == 0xffff && (ud2 & 0x8000))
+      || (ud4 == 0 && ud3 == 0 && !(ud2 & 0x8000)))
+    {
+      /* lis[; ori] */
+      count_or_emit_insn (ud1 != 0 ? temp : dest,
+                          GEN_INT (sext_hwi (ud2 << 16, 32)));
       if (ud1 != 0)
-        emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
+        count_or_emit_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
+      return;
     }
-  else if (ud4 == 0xffff && ud3 == 0xffff && !(ud2 & 0x8000) && ud1 == 0)
+
+  if (ud4 == 0xffff && ud3 == 0xffff && !(ud2 & 0x8000) && ud1 == 0)
     {
       /* lis; xoris */
-      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
-      emit_move_insn (temp, GEN_INT (sext_hwi ((ud2 | 0x8000) << 16, 32)));
-      emit_move_insn (dest, gen_rtx_XOR (DImode, temp, GEN_INT (0x80000000)));
+      count_or_emit_insn (temp, GEN_INT (sext_hwi ((ud2 | 0x8000) << 16, 32)));
+      count_or_emit_insn (dest,
+                          gen_rtx_XOR (DImode, temp, GEN_INT (0x80000000)));
+      return;
     }
-  else if (ud4 == 0xffff && ud3 == 0xffff && (ud1 & 0x8000))
+
+  if (ud4 == 0xffff && ud3 == 0xffff && (ud1 & 0x8000))
     {
       /* li; xoris */
-      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
-      emit_move_insn (temp, GEN_INT (sext_hwi (ud1, 16)));
-      emit_move_insn (dest, gen_rtx_XOR (DImode, temp,
-                                         GEN_INT ((ud2 ^ 0xffff) << 16)));
+      count_or_emit_insn (temp, GEN_INT (sext_hwi (ud1, 16)));
+      count_or_emit_insn (dest, gen_rtx_XOR (DImode, temp,
+                                             GEN_INT ((ud2 ^ 0xffff) << 16)));
+      return;
     }
-  else if (can_be_built_by_li_lis_and_rotldi (c, &shift, &mask)
-           || can_be_built_by_li_lis_and_rldicl (c, &shift, &mask)
-           || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask)
-           || can_be_built_by_li_and_rldic (c, &shift, &mask))
+
+  int shift;
+  HOST_WIDE_INT mask;
+  if (can_be_built_by_li_lis_and_rotldi (c, &shift, &mask)
+      || can_be_built_by_li_lis_and_rldicl (c, &shift, &mask)
+      || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask)
+      || can_be_built_by_li_and_rldic (c, &shift, &mask))
     {
-      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
+      /* li/lis; rldicX */
       unsigned HOST_WIDE_INT imm = (c | ~mask);
       imm = (imm >> shift) | (imm << (HOST_BITS_PER_WIDE_INT - shift));
 
-      emit_move_insn (temp, GEN_INT (imm));
+      count_or_emit_insn (temp, GEN_INT (imm));
       if (shift != 0)
         temp = gen_rtx_ROTATE (DImode, temp, GEN_INT (shift));
       if (mask != HOST_WIDE_INT_M1)
         temp = gen_rtx_AND (DImode, temp, GEN_INT (mask));
-      emit_move_insn (dest, temp);
-    }
-  else if (ud3 == 0 && ud4 == 0)
-    {
-      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
+      count_or_emit_insn (dest, temp);
 
-      gcc_assert (ud2 & 0x8000);
+      return;
+    }
 
-      if (ud1 == 0)
-        {
-          /* lis; rldicl */
-          emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
-          emit_move_insn (dest,
-                          gen_rtx_AND (DImode, temp, GEN_INT (0xffffffff)));
-        }
-      else if (!(ud1 & 0x8000))
+  if (ud3 == 0 && ud4 == 0)
+    {
+      gcc_assert ((ud2 & 0x8000) && ud1 != 0);
+      if (!(ud1 & 0x8000))
         {
           /* li; oris */
-          emit_move_insn (temp, GEN_INT (ud1));
-          emit_move_insn (dest,
-                          gen_rtx_IOR (DImode, temp, GEN_INT (ud2 << 16)));
+          count_or_emit_insn (temp, GEN_INT (ud1));
+          count_or_emit_insn (dest,
+                              gen_rtx_IOR (DImode, temp, GEN_INT (ud2 << 16)));
+          return;
         }
-      else
-        {
-          /* lis; ori; rldicl */
-          emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
-          emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
-          emit_move_insn (dest,
+
+      /* lis; ori; rldicl */
+      count_or_emit_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
+      count_or_emit_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
+      count_or_emit_insn (dest,
                           gen_rtx_AND (DImode, temp, GEN_INT (0xffffffff)));
-        }
+      return;
     }
-  else if (ud1 == ud3 && ud2 == ud4)
+
+  if (ud1 == ud3 && ud2 == ud4)
     {
-      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
       HOST_WIDE_INT num = (ud2 << 16) | ud1;
-      rs6000_emit_set_long_const (temp, sext_hwi (num, 32));
+      rs6000_emit_set_long_const (temp, sext_hwi (num, 32), num_insns);
+
       rtx one = gen_rtx_AND (DImode, temp, GEN_INT (0xffffffff));
       rtx two = gen_rtx_ASHIFT (DImode, temp, GEN_INT (32));
-      emit_move_insn (dest, gen_rtx_IOR (DImode, one, two));
+      count_or_emit_insn (dest, gen_rtx_IOR (DImode, one, two));
+      return;
     }
-  else if ((ud4 == 0xffff && (ud3 & 0x8000))
-           || (ud4 == 0 && ! (ud3 & 0x8000)))
-    {
-      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
-      emit_move_insn (temp, GEN_INT (sext_hwi (ud3 << 16, 32)));
+  if ((ud4 == 0xffff && (ud3 & 0x8000)) || (ud4 == 0 && !(ud3 & 0x8000)))
+    {
+      count_or_emit_insn (temp, GEN_INT (sext_hwi (ud3 << 16, 32)));
       if (ud2 != 0)
-        emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud2)));
-      emit_move_insn (ud1 != 0 ? temp : dest,
-                      gen_rtx_ASHIFT (DImode, temp, GEN_INT (16)));
+        count_or_emit_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud2)));
+      count_or_emit_insn (ud1 != 0 ? temp : dest,
+                          gen_rtx_ASHIFT (DImode, temp, GEN_INT (16)));
       if (ud1 != 0)
-        emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
+        count_or_emit_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
+      return;
     }
-  else if (TARGET_PREFIXED)
+
+  if (TARGET_PREFIXED)
     {
       if (can_create_pseudo_p ())
         {
-          /* pli A,L + pli B,H + rldimi A,B,32,0.  */
-          temp = gen_reg_rtx (DImode);
-          rtx temp1 = gen_reg_rtx (DImode);
-          emit_move_insn (temp, GEN_INT ((ud4 << 16) | ud3));
-          emit_move_insn (temp1, GEN_INT ((ud2 << 16) | ud1));
-
-          emit_insn (gen_rotldi3_insert_3 (dest, temp, GEN_INT (32), temp1,
-                                           GEN_INT (0xffffffff)));
+          /* pli A,L; pli B,H; rldimi A,B,32,0.  */
+          rtx temp1 = num_insns ? nullptr : gen_reg_rtx (DImode);
+          count_or_emit_insn (temp, GEN_INT ((ud4 << 16) | ud3));
+          count_or_emit_insn (temp1, GEN_INT ((ud2 << 16) | ud1));
+          count_or_emit_insn (dest, temp, temp1);
+          return;
         }
-      else
-        {
-          /* pli A,H + sldi A,32 + paddi A,A,L.  */
-          emit_move_insn (dest, GEN_INT ((ud4 << 16) | ud3));
 
-          emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
+      /* There may be 1 insn inaccurate because of no info about dest.  */
+      bool can_use_paddi = dest ? REGNO (dest) != FIRST_GPR_REGNO : false;
 
-          bool can_use_paddi = REGNO (dest) != FIRST_GPR_REGNO;
+      /* pli A,H; sldi A,32; paddi A,A,L.  */
+      count_or_emit_insn (dest, GEN_INT ((ud4 << 16) | ud3));
+      count_or_emit_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
 
-          /* Use paddi for the low 32 bits.  */
-          if (ud2 != 0 && ud1 != 0 && can_use_paddi)
-            emit_move_insn (dest, gen_rtx_PLUS (DImode, dest,
+      /* Use paddi for the low 32 bits.  */
+      if (ud2 != 0 && ud1 != 0 && can_use_paddi)
+        count_or_emit_insn (dest, gen_rtx_PLUS (DImode, dest,
                                                 GEN_INT ((ud2 << 16) | ud1)));
-
-          /* Use oris, ori for low 32 bits.  */
-          if (ud2 != 0 && (ud1 == 0 || !can_use_paddi))
-            emit_move_insn (dest,
+      /* Use oris, ori for low 32 bits.  */
+      if (ud2 != 0 && (ud1 == 0 || !can_use_paddi))
+        count_or_emit_insn (dest,
                             gen_rtx_IOR (DImode, dest, GEN_INT (ud2 << 16)));
-          if (ud1 != 0 && (ud2 == 0 || !can_use_paddi))
-            emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
-        }
+      if (ud1 != 0 && (ud2 == 0 || !can_use_paddi))
+        count_or_emit_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
+      return;
     }
-  else
+
+  if (can_create_pseudo_p ())
     {
-      if (can_create_pseudo_p ())
-        {
-          /* lis HIGH,UD4 ; ori HIGH,UD3 ;
-             lis LOW,UD2 ; ori LOW,UD1 ; rldimi LOW,HIGH,32,0.  */
-          rtx high = gen_reg_rtx (DImode);
-          rtx low = gen_reg_rtx (DImode);
-          HOST_WIDE_INT num = (ud2 << 16) | ud1;
-          rs6000_emit_set_long_const (low, sext_hwi (num, 32));
-          num = (ud4 << 16) | ud3;
-          rs6000_emit_set_long_const (high, sext_hwi (num, 32));
-          emit_insn (gen_rotldi3_insert_3 (dest, high, GEN_INT (32), low,
-                                           GEN_INT (0xffffffff)));
-        }
-      else
-        {
-          /* lis DEST,UD4 ; ori DEST,UD3 ; rotl DEST,32 ;
-             oris DEST,UD2 ; ori DEST,UD1.  */
-          emit_move_insn (dest, GEN_INT (sext_hwi (ud4 << 16, 32)));
-          if (ud3 != 0)
-            emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud3)));
+      /* lis HIGH,UD4 ; ori HIGH,UD3 ;
+         lis LOW,UD2 ; ori LOW,UD1 ; rldimi LOW,HIGH,32,0.  */
+      rtx high = num_insns ? nullptr : gen_reg_rtx (DImode);
+      rtx low = num_insns ? nullptr : gen_reg_rtx (DImode);
+      HOST_WIDE_INT num = (ud2 << 16) | ud1;
+      rs6000_emit_set_long_const (low, sext_hwi (num, 32), num_insns);
+      num = (ud4 << 16) | ud3;
+      rs6000_emit_set_long_const (high, sext_hwi (num, 32), num_insns);
 
-          emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
-          if (ud2 != 0)
-            emit_move_insn (dest,
-                            gen_rtx_IOR (DImode, dest, GEN_INT (ud2 << 16)));
-          if (ud1 != 0)
-            emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
-        }
+      count_or_emit_insn (dest, high, low);
+      return;
     }
+
+  /* lis DEST,UD4 ; ori DEST,UD3 ; rotl DEST,32 ;
+     oris DEST,UD2 ; ori DEST,UD1.  */
+  count_or_emit_insn (dest, GEN_INT (sext_hwi (ud4 << 16, 32)));
+  if (ud3 != 0)
+    count_or_emit_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud3)));
+
+  count_or_emit_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
+  if (ud2 != 0)
+    count_or_emit_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud2 << 16)));
+  if (ud1 != 0)
+    count_or_emit_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
+
+  return;
 }
 
 /* Helper for the following.  Get rid of [r+r] memory refs

From patchwork Wed Dec 6 05:24:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 1872432
From: Jiufu Guo
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
 bergner@linux.ibm.com, guojiufu@linux.ibm.com
Subject: [PATCH V3 2/3] Using pli for constant splitting
Date: Wed, 6 Dec 2023 13:24:26 +0800
Message-Id: <20231206052427.143889-2-guojiufu@linux.ibm.com>
In-Reply-To: <20231206052427.143889-1-guojiufu@linux.ibm.com>
References: <20231206052427.143889-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.25.1
List-Id: Gcc-patches mailing list

Hi,

For constant building, e.g. r120=0x66666666, which does not fit 'li'
or 'lis', 'pli' is used to build this constant via 'emit_move_insn'.

However, for a complicated constant, e.g. 0x6666666666666666ULL, when
'rs6000_emit_set_long_const' splits the constant recursively, it fails
to use 'pli' to build the half-part constant 0x66666666.

'rs6000_emit_set_long_const' can be updated to use 'pli' to build half
of the constant when necessary.  For example, for 0x6666666666666666ULL,
"pli 3,1717986918; rldimi 3,3,32,0" can be used.

Compared with the previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636567.html
this version is refreshed and adds a new testcase.

Bootstrap & regtest pass on ppc64{,le}.
Is this OK for trunk?

BR,
Jeff (Jiufu Guo)

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add code to use
	pli for 34bit constant.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/const_split_pli.c: New test.
---
 gcc/config/rs6000/rs6000.cc                        | 7 +++++++
 gcc/testsuite/gcc.target/powerpc/const_split_pli.c | 9 +++++++++
 2 files changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/const_split_pli.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index dbdc72dce5d..2e074a21a05 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10509,6 +10509,13 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c, int *num_insns)
                                        GEN_INT (0xffffffff)));
   };
 
+  if (TARGET_PREFIXED && SIGNED_INTEGER_34BIT_P (c))
+    {
+      /* li/lis/pli */
+      count_or_emit_insn (dest, GEN_INT (c));
+      return;
+    }
+
   if ((ud4 == 0xffff && ud3 == 0xffff && ud2 == 0xffff && (ud1 & 0x8000))
       || (ud4 == 0 && ud3 == 0 && ud2 == 0 && !(ud1 & 0x8000)))
     {
diff --git a/gcc/testsuite/gcc.target/powerpc/const_split_pli.c b/gcc/testsuite/gcc.target/powerpc/const_split_pli.c
new file mode 100644
index 00000000000..626c93084aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/const_split_pli.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target power10_ok } */
+
+unsigned long long msk66() { return 0x6666666666666666ULL; }
+
+/* { dg-final { scan-assembler-times {\mpli\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mli\M} } } */
+/* { dg-final { scan-assembler-not {\mlis\M} } } */

From patchwork Wed Dec 6 05:24:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 1872433
From: Jiufu Guo
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
 bergner@linux.ibm.com, guojiufu@linux.ibm.com
Subject: [PATCH V3 3/3] split complicate constant to memory
Date: Wed, 6 Dec 2023 13:24:27 +0800
Message-Id: <20231206052427.143889-3-guojiufu@linux.ibm.com>
In-Reply-To: <20231206052427.143889-1-guojiufu@linux.ibm.com>
References: <20231206052427.143889-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.25.1
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 lowpriorityscore=0 mlxlogscore=999 clxscore=1015 priorityscore=1501 bulkscore=0 phishscore=0 spamscore=0 suspectscore=0 malwarescore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2311290000
definitions=main-2312060043
X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, KAM_STOCKGEN, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi,

Sometimes a complicated constant needs 3 (or more) instructions to
build.  Generally speaking, that is not as fast as loading it from the
constant pool (see the discussions in PR63281):

* "ld" is one instruction.  If we take the "address/toc" adjustment
  into account, we may count it as 2 instructions (the high part of the
  address computation could be further optimized to a nop by the
  linker).  And "pld" may need fewer cycles.
* In testing (SPEC2017), the runtime is better and more stable with the
  threshold set to "> 2" (compared with "> 3").

As tested on SPEC2017, the visible performance change is a runtime
improvement on 500.perlbench_r of about 1.8% (-O2, P10) with the patch.
As for the performance downgrades on other benchmarks, investigation
shows that those regressions are not caused by this patch.

Compared with the previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636566.html
this version is refreshed based on the latest code.

Bootstrap & regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu Guo)

	PR target/63281

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split
	complicated constants to memory.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/const_anchors.c: Update to test final-rtl.
	* gcc.target/powerpc/parall_5insn_const.c: Update to keep original
	test point.
	* gcc.target/powerpc/pr106550.c: Likewise.
	* gcc.target/powerpc/pr106550_1.c: Likewise.
	* gcc.target/powerpc/pr87870.c: Update according to latest behavior.
	* gcc.target/powerpc/pr93012.c: Likewise.
---
 gcc/config/rs6000/rs6000.cc                      | 16 ++++++++++++++++
 .../gcc.target/powerpc/const_anchors.c           |  5 ++---
 .../gcc.target/powerpc/parall_5insn_const.c      | 14 ++++++++++++--
 gcc/testsuite/gcc.target/powerpc/pr106550.c      | 17 +++++++++++++++--
 gcc/testsuite/gcc.target/powerpc/pr106550_1.c    | 15 +++++++++++++--
 gcc/testsuite/gcc.target/powerpc/pr87870.c       |  5 ++++-
 gcc/testsuite/gcc.target/powerpc/pr93012.c       |  5 ++++-
 7 files changed, 66 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 2e074a21a05..e44a6da91ae 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10271,6 +10271,22 @@ rs6000_emit_set_const (rtx dest, rtx source)
 	  c = sext_hwi (c, 32);
 	  emit_move_insn (lo, GEN_INT (c));
 	}
+
+      /* If it can be stored to the constant pool and profitable.  */
+      else if (base_reg_operand (dest, mode)
+	       && num_insns_constant (source, mode) > 2)
+	{
+	  rtx sym = force_const_mem (mode, source);
+	  if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
+	      && use_toc_relative_ref (XEXP (sym, 0), mode))
+	    {
+	      rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest));
+	      sym = gen_const_mem (mode, toc);
+	      set_mem_alias_set (sym, get_TOC_alias_set ());
+	    }
+
+	  emit_insn (gen_rtx_SET (dest, sym));
+	}
       else
 	rs6000_emit_set_long_const (dest, c);
       break;
diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
index 542e2674b12..188744165f2 100644
--- a/gcc/testsuite/gcc.target/powerpc/const_anchors.c
+++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target has_arch_ppc64 } } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fdump-rtl-final" } */
 
 #define C1 0x2351847027482577ULL
 #define C2 0x2351847027482578ULL
@@ -16,5 +16,4 @@ void __attribute__ ((noinline)) foo1 (long long *a, long long b)
   if (b)
     *a++ = C2;
 }
-
-/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
+/* { dg-final { scan-rtl-dump-times {\madddi3\M} 2 "final" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
index e3a9a7264cf..df0690b90be 100644
--- a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
+++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
@@ -9,8 +9,18 @@ void __attribute__ ((noinline))
 foo (unsigned long long *a)
 {
   /* 2 lis + 2 ori + 1 rldimi for each constant.  */
-  *a++ = 0x800aabcdc167fa16ULL;
-  *a++ = 0x7543a876867f616ULL;
+  {
+    register long long d asm("r0") = 0x800aabcdc167fa16ULL;
+    long long n;
+    asm("mr %0, %1" : "=r"(n) : "r"(d));
+    *a++ = n;
+  }
+  {
+    register long long d asm("r0") = 0x7543a876867f616ULL;
+    long long n;
+    asm("mr %0, %1" : "=r"(n) : "r"(d));
+    *a++ = n;
+  }
 }
 
 long long A[] = {0x800aabcdc167fa16ULL, 0x7543a876867f616ULL};
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c b/gcc/testsuite/gcc.target/powerpc/pr106550.c
index 74e395331ab..5eca2b2f701 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106550.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c
@@ -1,12 +1,25 @@
 /* PR target/106550 */
 /* { dg-options "-O2 -mdejagnu-cpu=power10" } */
 /* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target has_arch_ppc64 } */
 
 void
 foo (unsigned long long *a)
 {
-  *a++ = 0x020805006106003; /* pli+pli+rldimi */
-  *a++ = 0x2351847027482577;/* pli+pli+rldimi */
+  {
+    /* pli+pli+rldimi */
+    register long long d asm("r0") = 0x020805006106003ULL;
+    long long n;
+    asm("mr %0, %1" : "=r"(n) : "r"(d));
+    *a++ = n;
+  }
+  {
+    /* pli+pli+rldimi */
+    register long long d asm("r0") = 0x2351847027482577ULL;
+    long long n;
+    asm("mr %0, %1" : "=r"(n) : "r"(d));
+    *a++ = n;
+  }
 }
 
 /* { dg-final { scan-assembler-times {\mpli\M} 4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
index 5ab40d71a56..80e6b817dff 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
@@ -13,8 +13,19 @@ foo (unsigned long long *a)
   asm("cntlzd %0, %1" : "=r"(n) : "r"(d));
   *a++ = n;
 
-  *a++ = 0x235a8470a7480000ULL; /* pli+sldi+oris */
-  *a++ = 0x23a184700000b677ULL; /* pli+sldi+ori */
+  {
+    register long long d asm("r0") = 0x235a8470a7480000ULL; /* pli+sldi+oris */
+    long long n;
+    asm("cntlzd %0, %1" : "=r"(n) : "r"(d));
+    *a++ = n;
+  }
+
+  {
+    register long long d asm("r0") = 0x23a184700000b677ULL; /* pli+sldi+ori */
+    long long n;
+    asm("cntlzd %0, %1" : "=r"(n) : "r"(d));
+    *a++ = n;
+  }
 }
 
 /* { dg-final { scan-assembler-times {\mpli\M} 3 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr87870.c b/gcc/testsuite/gcc.target/powerpc/pr87870.c
index d2108ac3386..5fee06744ae 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr87870.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr87870.c
@@ -25,4 +25,7 @@ test3 (void)
   return ((__int128)0xdeadbeefcafebabe << 64) | 0xfacefeedbaaaaaad;
 }
 
-/* { dg-final { scan-assembler-not {\mld\M} } } */
+/* test3 using "ld" to load the value for r3 and r4.
+   test0, test1 and test2 are using "li".  */
+/* { dg-final { scan-assembler-times {\mp?ld\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mli\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93012.c b/gcc/testsuite/gcc.target/powerpc/pr93012.c
index 4f764d0576f..b9e869e4285 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr93012.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr93012.c
@@ -10,4 +10,7 @@ unsigned long long mskh1() { return 0xffff9234ffff9234ULL; }
 unsigned long long mskl1() { return 0x2bcdffff2bcdffffULL; }
 unsigned long long mskse() { return 0xffff1234ffff1234ULL; }
 
-/* { dg-final { scan-assembler-times {\mrldimi\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mpli\M} 4 { target has_arch_pwr10 }} } */
+/* { dg-final { scan-assembler-times {\mrldimi\M} 7 { target has_arch_pwr10 } } } */
+/* { dg-final { scan-assembler-times {\mrldimi\M} 3 { target { ! has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mld\M} 4 { target { ! has_arch_pwr10 } } } } */
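
The two decisions added by patches 2/3 and 3/3 can be illustrated with a
small standalone C sketch.  It is not GCC code and makes assumptions:
fits_signed_34bit is a hypothetical stand-in for what
SIGNED_INTEGER_34BIT_P is assumed to test (a value in [-2^33, 2^33),
i.e. loadable by a single li/lis/pli), and prefer_constant_pool is a
hypothetical helper that only mirrors the "> 2" instruction threshold
from the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for SIGNED_INTEGER_34BIT_P: assumed to accept
   exactly the values representable as a signed 34-bit immediate,
   i.e. the range [-2^33, 2^33).  */
static bool
fits_signed_34bit (int64_t c)
{
  const int64_t limit = (int64_t) 1 << 33;
  return c >= -limit && c < limit;
}

/* Hypothetical helper mirroring the threshold in the patch: a constant
   that needs more than 2 instructions to build is considered cheaper
   to load from the constant pool.  */
static bool
prefer_constant_pool (int num_insns)
{
  return num_insns > 2;
}
```

Under these assumptions, 0x6666666666666666 does not fit in 34 bits, so
it is not covered by the single-instruction path, while a constant that
takes 3 instructions to build would be a candidate for the pool.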