From patchwork Thu Jun 13 02:19:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 1947227
From: Jiufu Guo
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, linkw@gcc.gnu.org, dje.gcc@gmail.com,
 bergner@linux.ibm.com, guojiufu@linux.ibm.com
Subject: [PATCH V5 1/2] split complicate 64bit constant to memory
Date: Thu, 13 Jun 2024 10:19:38 +0800
Message-ID: <20240613021940.4000707-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.45.1
List-Id: Gcc-patches mailing list

Hi,

Sometimes a complicated constant is built with 3 (or more) instructions.
Generally speaking, that is not as fast as loading it from the constant
pool (see the discussion in PR63281): "ld" is one instruction. If we also
consider the address/TOC adjustment, we may count it as 2 instructions.
And "pld" may need fewer cycles.

In testing (SPEC2017), setting the threshold to "> 2" gave better and more
stable runtime than "> 3".

Because this patch loads the constant from memory, it may affect cache
misses. Still, IMHO, this patch does the right thing.

Compared with the previous version, this version:
1. allows assigning a complicated constant to r0 before RA,
2. allows more conditions besides TARGET_ELF,
3. updates the test cases, and removes 2 test cases whose original test
   point is no longer used.

Bootstrap & regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu Guo)

	PR target/63281

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_emit_set_const): Split constant to
	memory under -m64.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/const_anchors.c: Test final-rtl.
	* gcc.target/powerpc/pr106550_1.c (FORCE_CONST_INTO_REG): New macro.
	* gcc.target/powerpc/pr106550_1.c: Use macro FORCE_CONST_INTO_REG.
	* gcc.target/powerpc/pr87870.c: Update asm insn checking.
	* gcc.target/powerpc/pr93012.c: Likewise.
	* gcc.target/powerpc/parall_5insn_const.c: Removed.
	* gcc.target/powerpc/pr106550.c: Removed.
	* gcc.target/powerpc/pr63281.c: New test.
---
 gcc/config/rs6000/rs6000.cc                   | 15 +++++++++++
 .../gcc.target/powerpc/const_anchors.c        |  5 ++--
 .../gcc.target/powerpc/parall_5insn_const.c   | 27 -------------------
 gcc/testsuite/gcc.target/powerpc/pr106550.c   | 14 ----------
 gcc/testsuite/gcc.target/powerpc/pr106550_1.c | 16 ++++++-----
 gcc/testsuite/gcc.target/powerpc/pr63281.c    | 11 ++++++++
 gcc/testsuite/gcc.target/powerpc/pr87870.c    |  5 +++-
 gcc/testsuite/gcc.target/powerpc/pr93012.c    |  6 ++++-
 8 files changed, 47 insertions(+), 52 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr106550.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr63281.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index e4dc629ddcc..bc9d6f5c34f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10240,6 +10240,21 @@ rs6000_emit_set_const (rtx dest, rtx source)
           c = sext_hwi (c, 32);
           emit_move_insn (lo, GEN_INT (c));
         }
+
+      else if ((can_create_pseudo_p () || base_reg_operand (dest, mode))
+               && TARGET_64BIT && num_insns_constant (source, mode) > 2)
+        {
+          rtx sym = force_const_mem (mode, source);
+          if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
+              && use_toc_relative_ref (XEXP (sym, 0), mode))
+            {
+              rtx toc = create_TOC_reference (XEXP (sym, 0), dest);
+              sym = gen_const_mem (mode, toc);
+              set_mem_alias_set (sym, get_TOC_alias_set ());
+            }
+
+          emit_move_insn (dest, sym);
+        }
       else
         rs6000_emit_set_long_const (dest, c);
       break;
diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
index 542e2674b12..682e773d506 100644
--- a/gcc/testsuite/gcc.target/powerpc/const_anchors.c
+++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target has_arch_ppc64 } } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fdump-rtl-final" } */
 
 #define C1 0x2351847027482577ULL
 #define C2 0x2351847027482578ULL
@@ -17,4 +17,5 @@ void __attribute__ ((noinline)) foo1 (long long *a, long long b)
   *a++ = C2;
 }
 
-/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
+/* { dg-final { scan-rtl-dump-times {\madddi3\M} 2 "final" } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
deleted file mode 100644
index e3a9a7264cf..00000000000
--- a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
+++ /dev/null
@@ -1,27 +0,0 @@
-/* { dg-do run } */
-/* { dg-options "-O2 -mno-prefixed -save-temps" } */
-/* { dg-require-effective-target has_arch_ppc64 } */
-
-/* { dg-final { scan-assembler-times {\mlis\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mori\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
-
-void __attribute__ ((noinline)) foo (unsigned long long *a)
-{
-  /* 2 lis + 2 ori + 1 rldimi for each constant. */
-  *a++ = 0x800aabcdc167fa16ULL;
-  *a++ = 0x7543a876867f616ULL;
-}
-
-long long A[] = {0x800aabcdc167fa16ULL, 0x7543a876867f616ULL};
-int
-main ()
-{
-  long long res[2];
-
-  foo (res);
-  if (__builtin_memcmp (res, A, sizeof (res)) != 0)
-    __builtin_abort ();
-
-  return 0;
-}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c b/gcc/testsuite/gcc.target/powerpc/pr106550.c
deleted file mode 100644
index 92b76ac8811..00000000000
--- a/gcc/testsuite/gcc.target/powerpc/pr106550.c
+++ /dev/null
@@ -1,14 +0,0 @@
-/* PR target/106550 */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
-/* { dg-require-effective-target has_arch_ppc64 } */
-
-void
-foo (unsigned long long *a)
-{
-  *a++ = 0x020805006106003; /* pli+pli+rldimi */
-  *a++ = 0x2351847027482577;/* pli+pli+rldimi */
-}
-
-/* { dg-final { scan-assembler-times {\mpli\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
-
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
index 5ab40d71a56..aa98f31865e 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
@@ -4,17 +4,19 @@
 /* { dg-options "-O2 -mdejagnu-cpu=power10 -fdisable-rtl-split1" } */
 /* force the constant splitter run after RA: -fdisable-rtl-split1. */
 
+#define FORCE_CONST_INTO_REG(DEST, CST) \
+  { \
+    register long long d asm ("r0") = CST; \
+    asm volatile ("std %1, %0" : : "m"(DEST), "r"(d)); \
+  }
+
 void
 foo (unsigned long long *a)
 {
   /* Test oris/ori is used where paddi does not work with 'r0'. */
-  register long long d asm("r0") = 0x1245abcef9240dec; /* pli+sldi+oris+ori */
-  long long n;
-  asm("cntlzd %0, %1" : "=r"(n) : "r"(d));
-  *a++ = n;
-
-  *a++ = 0x235a8470a7480000ULL; /* pli+sldi+oris */
-  *a++ = 0x23a184700000b677ULL; /* pli+sldi+ori */
+  FORCE_CONST_INTO_REG (*a++, 0x1245abcef9240dec); /* pli+sldi+oris+ori */
+  FORCE_CONST_INTO_REG (*a++, 0x235a8470a7480000ULL); /* pli+sldi+oris */
+  FORCE_CONST_INTO_REG (*a++, 0x23a184700000b677ULL); /* pli+sldi+ori */
 }
 
 /* { dg-final { scan-assembler-times {\mpli\M} 3 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr63281.c b/gcc/testsuite/gcc.target/powerpc/pr63281.c
new file mode 100644
index 00000000000..9763a7181fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr63281.c
@@ -0,0 +1,11 @@
+/* Check loading constant from memory pool. */
+/* { dg-options "-O2 -mpowerpc64" } */
+
+void
+foo (unsigned long long *a)
+{
+  *a++ = 0x2351847027482577ULL;
+}
+
+/* { dg-final { scan-assembler-times {\mp?ld\M} 1 { target { lp64 } } } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr87870.c b/gcc/testsuite/gcc.target/powerpc/pr87870.c
index d2108ac3386..09b2e8de901 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr87870.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr87870.c
@@ -25,4 +25,7 @@ test3 (void)
   return ((__int128)0xdeadbeefcafebabe << 64) | 0xfacefeedbaaaaaad;
 }
 
-/* { dg-final { scan-assembler-not {\mld\M} } } */
+/* test3 is using "ld" to load the value to r3 and r4. So there are 2 'ld's
+   test0, test1 and test2 are using "li", then check 6 'li's. */
+/* { dg-final { scan-assembler-times {\mp?ld\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mli\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93012.c b/gcc/testsuite/gcc.target/powerpc/pr93012.c
index 4f764d0576f..660fb0dddfa 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr93012.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr93012.c
@@ -10,4 +10,8 @@ unsigned long long mskh1() { return 0xffff9234ffff9234ULL; }
 unsigned long long mskl1() { return 0x2bcdffff2bcdffffULL; }
 unsigned long long mskse() { return 0xffff1234ffff1234ULL; }
 
-/* { dg-final { scan-assembler-times {\mrldimi\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mrldimi\M} 7 { target has_arch_pwr10 } } } */
+
+/* 4 complicated constants can be loaded from pool. */
+/* { dg-final { scan-assembler-times {\mrldimi\M} 3 { target { ! has_arch_pwr10 } } } } */
+/* { dg-final { scan-assembler-times {\mld\M} 4 { target { ! has_arch_pwr10 } } } } */
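
Note on the "> 2" threshold: the decision this patch adds to rs6000_emit_set_const can be illustrated with a small standalone C sketch. This is only a rough model under stated assumptions. The names naive_insns_for_s32, naive_insns_for_const and prefer_const_pool are hypothetical, and the cost model (li/lis/ori for 32-bit values, then sldi/oris/ori) is a deliberate simplification. It is not GCC's num_insns_constant, which accounts for more instruction forms (the test comments in this patch mention pli and rldimi sequences, for example).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified cost model; NOT GCC's num_insns_constant.
   Cost of materializing a sign-extended 32-bit value.  */
int
naive_insns_for_s32 (int32_t v)
{
  if (v >= -0x8000 && v <= 0x7fff)
    return 1;                   /* li */
  if ((v & 0xffff) == 0)
    return 1;                   /* lis */
  return 2;                     /* lis + ori */
}

/* Cost of a naive build sequence for a 64-bit constant: build the high
   32 bits, shift them up, then OR in the two low halfwords.  */
int
naive_insns_for_const (uint64_t c)
{
  int64_t s = (int64_t) c;
  if (s >= INT32_MIN && s <= INT32_MAX)
    return naive_insns_for_s32 ((int32_t) s);

  int n = naive_insns_for_s32 ((int32_t) (uint32_t) (c >> 32)) + 1; /* + sldi */
  if ((c >> 16) & 0xffff)
    n++;                        /* oris */
  if (c & 0xffff)
    n++;                        /* ori */
  return n;
}

/* Mirrors the shape of the patch's condition (cost > 2): prefer a
   constant-pool load once the inline sequence exceeds two instructions.  */
bool
prefer_const_pool (uint64_t c)
{
  return naive_insns_for_const (c) > 2;
}
```

In this model 0x2351847027482577ULL, the constant used in pr63281.c, costs 5 instructions (lis+ori, sldi, oris, ori), so prefer_const_pool returns true. 0x12345678 costs 2 and would still be built inline.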