From patchwork Thu Sep 1 03:24:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiufu Guo X-Patchwork-Id: 1672616 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=n6CbFTGd; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4MJ5xy42Ffz1yhP for ; Thu, 1 Sep 2022 13:24:37 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 164BB3834F3A for ; Thu, 1 Sep 2022 03:24:31 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 164BB3834F3A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1662002671; bh=/R/EPxNjOZSVKzuELq+ONanIFwQSoA7140WyrF6eYPc=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=n6CbFTGdJFBra7OFrtGHEAPGtILL6vhp1kZSE/nVx+Xt30xbj0TVKT/QEzBW9FnSr rX9Af/Dl/0jHohaaC55IZ9fdpaevcioMbhMuu2ziwRJlyNm60d0FNuUISInV+SLZlZ uv3O0meQNlY+CuVXDdWeMeszuey8sYZV2pSME7t4= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id F20163834F16; Thu, 1 Sep 2022 03:24:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org F20163834F16 Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2813IH0C003956; Thu, 1 Sep 2022 03:24:08 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3jamqc04ye-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 01 Sep 2022 03:24:08 +0000 Received: from m0098409.ppops.net (m0098409.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 2813KF3P011702; Thu, 1 Sep 2022 03:24:08 GMT Received: from ppma01fra.de.ibm.com (46.49.7a9f.ip4.static.sl-reverse.com [159.122.73.70]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3jamqc04x2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 01 Sep 2022 03:24:07 +0000 Received: from pps.filterd (ppma01fra.de.ibm.com [127.0.0.1]) by ppma01fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2813M9Qc006897; Thu, 1 Sep 2022 03:24:05 GMT Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by ppma01fra.de.ibm.com with ESMTP id 3j8hkab8tv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 01 Sep 2022 03:24:05 +0000 Received: from d06av24.portsmouth.uk.ibm.com (d06av24.portsmouth.uk.ibm.com [9.149.105.60]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2813O2hQ32702794 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 1 Sep 2022 03:24:02 GMT Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4305F4203F; Thu, 1 Sep 2022 03:24:02 +0000 (GMT) Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5F35942045; Thu, 1 Sep 2022 03:24:01 +0000 (GMT) Received: from pike.rch.stglabs.ibm.com (unknown [9.5.12.127]) by d06av24.portsmouth.uk.ibm.com (Postfix) with ESMTP; Thu, 1 Sep 2022 03:24:01 +0000 (GMT) To: gcc-patches@gcc.gnu.org Subject: [PATCH 1/2] Using pli(paddi) and rotate to build 64bit constants Date: Thu, 1 Sep 2022 11:24:00 +0800 Message-Id: <20220901032400.23692-1-guojiufu@linux.ibm.com> X-Mailer: git-send-email 2.17.1 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: TG_QEFwUV8BMra6PPr1ILEbrG3mhr2QQ X-Proofpoint-ORIG-GUID: kbRTokjymdBt0zqitpqet8FYUuF_qn1U X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-09-01_01,2022-08-31_03,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 mlxscore=0 malwarescore=0 suspectscore=0 clxscore=1015 phishscore=0 priorityscore=1501 adultscore=0 bulkscore=0 mlxlogscore=999 spamscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2207270000 definitions=main-2209010011 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jiufu Guo via Gcc-patches From: Jiufu Guo Reply-To: Jiufu Guo Cc: meissner@linux.ibm.com, dje.gcc@gmail.com, segher@kernel.crashing.org, linkw@gcc.gnu.org Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Hi, As mentioned in PR106550, since pli could support 34bits immediate, we could use less instructions(3insn would be ok) to build 64bits constant with pli. For example, for constant 0x020805006106003, we could generate it with: asm code1: pli 9,101736451 (0x6106003) sldi 9,9,32 paddi 9,9, 2130000 (0x0208050) or asm code2: pli 10, 2130000 pli 9, 101736451 rldimi 9, 10, 32, 0 Testing with simple cases as below, run them a lot of times: f1.c long __attribute__ ((noinline)) foo (long *arg,long *,long*) { *arg = 0x2351847027482577; } 5insns: base pli+sldi+paddi: similar -0.08% pli+pli+rldimi: faster +0.66% f2.c long __attribute__ ((noinline)) foo (long *arg, long *arg2, long *arg3) { *arg = 0x2351847027482577; *arg2 = 0x3257845024384680; *arg3 = 0x1245abcef9240dec; } 5nisns: base pli+sldi+paddi: faster +1.35% pli+pli+rldimi: faster +5.49% f2.c would be more meaningful. Because 'sched passes' are effective for f2.c, but 'scheds' do less thing for f1.c. Compare with previous patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599525.html This one updates code slightly and extracts changes on rs6000.md to a seperate patch. This patch pass boostrap and regtest on ppc64le(includes p10). Is it ok for trunk? BR, Jeff(Jiufu) PR target/106550 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add 'pli' for constant building. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr106550.c: New test. --- gcc/config/rs6000/rs6000.cc | 39 +++++++++++++++++++++ gcc/testsuite/gcc.target/powerpc/pr106550.c | 14 ++++++++ 2 files changed, 53 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106550.c diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index df491bee2ea..1ccb2ff30a1 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -10181,6 +10181,45 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c) gen_rtx_IOR (DImode, copy_rtx (temp), GEN_INT (ud1))); } + else if (TARGET_PREFIXED) + { + /* pli 9,high32 + pli 10,low32 + rldimi 9,10,32,0. */ + if (can_create_pseudo_p ()) + { + temp = gen_reg_rtx (DImode); + rtx temp1 = gen_reg_rtx (DImode); + emit_move_insn (copy_rtx (temp), GEN_INT ((ud4 << 16) | ud3)); + emit_move_insn (copy_rtx (temp1), GEN_INT ((ud2 << 16) | ud1)); + + emit_insn (gen_rotldi3_insert_3 (dest, temp, GEN_INT (32), temp1, + GEN_INT (0xffffffff))); + } + + /* pli 9,high32 + sldi 9,32 + paddi 9,9,low32. */ + else + { + emit_move_insn (copy_rtx (dest), GEN_INT ((ud4 << 16) | ud3)); + + emit_move_insn (copy_rtx (dest), + gen_rtx_ASHIFT (DImode, copy_rtx (dest), + GEN_INT (32))); + + bool can_use_paddi = REGNO (dest) != FIRST_GPR_REGNO; + + /* Use paddi for the low32 bits. */ + if (ud2 != 0 && ud1 != 0 && can_use_paddi) + emit_move_insn (dest, gen_rtx_PLUS (DImode, copy_rtx (dest), + GEN_INT ((ud2 << 16) | ud1))); + /* Use oris, ori for low32 bits. */ + if (ud2 != 0 && (ud1 == 0 || !can_use_paddi)) + emit_move_insn (ud1 != 0 ? copy_rtx (dest) : dest, + gen_rtx_IOR (DImode, copy_rtx (dest), + GEN_INT (ud2 << 16))); + if (ud1 != 0 && (ud2 == 0 || !can_use_paddi)) + emit_move_insn (dest, gen_rtx_IOR (DImode, copy_rtx (dest), + GEN_INT (ud1))); + } + } else { temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode); diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c b/gcc/testsuite/gcc.target/powerpc/pr106550.c new file mode 100644 index 00000000000..c6f4116bb9a --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c @@ -0,0 +1,14 @@ +/* PR target/106550 */ +/* { dg-options "-O2 -std=c99 -mdejagnu-cpu=power10" } */ + +void +foo (unsigned long long *a) +{ + *a++ = 0x020805006106003; + *a++ = 0x2351847027482577; +} + +/* 3 insns for each constant: pli+sldi+paddi or pli+pli+rldimi. + And 3 additional insns: std+std+blr: 9 insns totally. */ +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9 } } */ + From patchwork Thu Sep 1 03:24:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiufu Guo X-Patchwork-Id: 1672615 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=q5nVbX8c; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4MJ5xz12tgz1yn9 for ; Thu, 1 Sep 2022 13:24:39 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E6FDC3830089 for ; Thu, 1 Sep 2022 03:24:36 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E6FDC3830089 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1662002676; bh=SgMTTKeetj27fYbITQRxzeDVjnWSdOmgFj2K1IWa7CQ=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=q5nVbX8c6Q81ZUpyxjbvLrNZ0O4fM/mMzU1njShBylAy2DpWaWwoeq52KuahN8Sbl Rw1at7bOwU97KoR3Nfemd1ysj4YhlOsZZzPJlboWaQ+FaGg3d1wdmtHKB539qhRp2X cjfhMBSU0Fl19sAMIMoYpSogQ7IrFunJnNs3hDsQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 0FD973834F27; Thu, 1 Sep 2022 03:24:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0FD973834F27 Received: from pps.filterd (m0098410.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2812uKfG030701; Thu, 1 Sep 2022 03:24:15 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3jamd3gyq3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 01 Sep 2022 03:24:15 +0000 Received: from m0098410.ppops.net (m0098410.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 28130pm1018542; Thu, 1 Sep 2022 03:24:14 GMT Received: from ppma02fra.de.ibm.com (47.49.7a9f.ip4.static.sl-reverse.com [159.122.73.71]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3jamd3gynq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 01 Sep 2022 03:24:14 +0000 Received: from pps.filterd (ppma02fra.de.ibm.com [127.0.0.1]) by ppma02fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2813NZ3M029877; Thu, 1 Sep 2022 03:24:11 GMT Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by ppma02fra.de.ibm.com with ESMTP id 3j7aw8ve0w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 01 Sep 2022 03:24:11 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2813O8ne37749078 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 1 Sep 2022 03:24:08 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 72575A404D; Thu, 1 Sep 2022 03:24:08 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8F087A4040; Thu, 1 Sep 2022 03:24:07 +0000 (GMT) Received: from pike.rch.stglabs.ibm.com (unknown [9.5.12.127]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Thu, 1 Sep 2022 03:24:07 +0000 (GMT) To: gcc-patches@gcc.gnu.org Subject: [PATCH 2/2] allow constant splitter run in split1 pass Date: Thu, 1 Sep 2022 11:24:07 +0800 Message-Id: <20220901032407.23751-1-guojiufu@linux.ibm.com> X-Mailer: git-send-email 2.17.1 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: Jsu61FPJC4gwI3p31OczN3awlEVFxOPL X-Proofpoint-ORIG-GUID: X1mdRUsJ1Sx895EtAtiKTmjy-7STUHzp X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-09-01_01,2022-08-31_03,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 phishscore=0 clxscore=1015 adultscore=0 spamscore=0 bulkscore=0 lowpriorityscore=0 mlxscore=0 suspectscore=0 priorityscore=1501 mlxlogscore=999 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2207270000 definitions=main-2209010011 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jiufu Guo via Gcc-patches From: Jiufu Guo Reply-To: Jiufu Guo Cc: meissner@linux.ibm.com, dje.gcc@gmail.com, segher@kernel.crashing.org, linkw@gcc.gnu.org Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Hi, Currently, these two splitters (touched in this patch) are using predicate `int_reg_operand_not_pseudo`, then they work in split2 pass after RA in most times, and can not run before RA. It would not be a bad idea to allow these splitters before RA. Then more passes (between split1 and split2) could optimize the emitted instructions. And if splitting before RA, for these constant splitters, we may have more freedom to create pseduo to generate more parallel instructions. For the example in the leading patch [PATCH 1/2]: pli+plit+rldimi would be better than pli+sldi+paddi. Test this patch with spec, we could see performance gain some times; while the improvement is not stable and woud caused by the patch indirectly. This patch pass boostrap and regtest on ppc64le(includes p10). Is it ok for trunk? Thanks for comments! BR, Jeff(Jiufu) gcc/ChangeLog: * config/rs6000/rs6000.md (const splitter): Update predicate. --- gcc/config/rs6000/rs6000.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 1367a2cb779..ec39676f628 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -9641,7 +9641,7 @@ (define_insn "*movdi_internal64" ; Some DImode loads are best done as a load of -1 followed by a mask ; instruction. (define_split - [(set (match_operand:DI 0 "int_reg_operand_not_pseudo") + [(set (match_operand:DI 0 "int_reg_operand") (match_operand:DI 1 "const_int_operand"))] "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1 @@ -9659,7 +9659,7 @@ (define_split ;; When non-easy constants can go in the TOC, this should use ;; easy_fp_constant predicate. (define_split - [(set (match_operand:DI 0 "int_reg_operand_not_pseudo") + [(set (match_operand:DI 0 "int_reg_operand") (match_operand:DI 1 "const_int_operand"))] "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1" [(set (match_dup 0) (match_dup 2))