From patchwork Thu Oct 12 12:32:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ajit Agarwal X-Patchwork-Id: 1847437 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256 header.s=pp1 header.b=fHvXZ/JQ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4S5pvg2hPnz23jm for ; Thu, 12 Oct 2023 23:33:17 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2634C38555A0 for ; Thu, 12 Oct 2023 12:33:15 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 870A4385B512 for ; Thu, 12 Oct 2023 12:33:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 870A4385B512 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0356517.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CBvFln021004; Thu, 12 Oct 2023 12:32:59 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : to : cc : from : subject : content-type : content-transfer-encoding; s=pp1; bh=GY2OQu9JZvZprL9PM5VIN7guRGy9u6RC5fMoNKFgdbk=; b=fHvXZ/JQkCzhpo3K9y+opRKRjU06x17ZKZzTCShrjsC1RPD+/3deCADfeZItNxQU2JLI wjOG/iqppYgnYOoVkMjPQZckf/m4Il4G2G/i9kOfSkUjbsXJjEJQsOajGKL7/zDnyXiL MxCgZPuaKccwFwcvM1442FktflaaiBQKcGImhvTtkn6YaZqV7YMMZQrQXIq/5Dee8Yar oq7nEoYjniByCw5cW1pTBIBUwd/987bx50cWRZ4C89Rd7BJfePUgi8uXR/ABKnlG4j5G e2RjjJwwoSJ9WXN6fayD23FBPdEqmjktMjqB8H824/I5+sK1BgYhZxY7PEzppeOI6YuX 8w== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3tpgcks57m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 12 Oct 2023 12:32:59 +0000 Received: from m0356517.ppops.net (m0356517.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 39CBw4BO024312; Thu, 12 Oct 2023 12:32:59 GMT Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com [50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3tpgcks56d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 12 Oct 2023 12:32:59 +0000 Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 39CBk0p7001149; Thu, 12 Oct 2023 12:32:57 GMT Received: from smtprelay02.dal12v.mail.ibm.com ([172.16.1.4]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 3tkkvk7132-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 12 Oct 2023 12:32:57 +0000 Received: from smtpav04.wdc07v.mail.ibm.com (smtpav04.wdc07v.mail.ibm.com [10.39.53.231]) by smtprelay02.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 39CCWvIA50069762 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 12 Oct 2023 12:32:57 GMT Received: from smtpav04.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2DF2E5805E; Thu, 12 Oct 2023 12:32:57 +0000 (GMT) Received: from smtpav04.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E479D58054; Thu, 12 Oct 2023 12:32:54 +0000 (GMT) Received: from [9.43.115.172] (unknown [9.43.115.172]) by smtpav04.wdc07v.mail.ibm.com (Postfix) with ESMTP; Thu, 12 Oct 2023 12:32:54 +0000 (GMT) Message-ID: <902905ff-6cfe-c20a-57a3-b734b7be13f9@linux.ibm.com> Date: Thu, 12 Oct 2023 18:02:53 +0530 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Content-Language: en-US To: gcc-patches Cc: Richard Biener , Jeff Law , Segher Boessenkool , Peter Bergner From: Ajit Agarwal Subject: [PATCH v9] Improve code sinking pass X-TM-AS-GCONF: 00 X-Proofpoint-GUID: C95gPkEw3DpG7Xen0io6k-DTjQEM02xd X-Proofpoint-ORIG-GUID: 2ATBaWJ8i1Aii2tXsxgj0qbJrlDDPe7V X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_05,2023-10-12_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 suspectscore=0 phishscore=0 adultscore=0 priorityscore=1501 mlxscore=0 mlxlogscore=803 spamscore=0 lowpriorityscore=0 impostorscore=0 bulkscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2309180000 definitions=main-2310120103 X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org This patch improves code sinking pass to sink statements before call to reduce register pressure. Review comments are incorporated. Synced with latest sources and modify the code changes accordingly. For example : void bar(); int j; void foo(int a, int b, int c, int d, int e, int f) { int l; l = a + b + c + d +e + f; if (a != 5) { bar(); j = l; } } Code Sinking does the following: void bar(); int j; void foo(int a, int b, int c, int d, int e, int f) { int l; if (a != 5) { l = a + b + c + d +e + f; bar(); j = l; } } Bootstrapped regtested on powerpc64-linux-gnu. Thanks & Regards Ajit tree-ssa-sink: Improve code sinking pass Currently, code sinking will sink code after function calls. This increases register pressure for callee-saved registers. The following patch improves code sinking by placing the sunk code before calls in the use block or in the immediate dominator of the use blocks. 2023-10-12 Ajit Kumar Agarwal gcc/ChangeLog: PR tree-optimization/81953 * tree-ssa-sink.cc (statement_sink_location): Move statements before calls. (select_best_block): Add heuristics to select the best blocks in the immediate post dominator. gcc/testsuite/ChangeLog: PR tree-optimization/81953 * gcc.dg/tree-ssa/ssa-sink-20.c: New test. * gcc.dg/tree-ssa/ssa-sink-21.c: New test. --- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c | 15 ++++++++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c | 19 ++++++++++ gcc/tree-ssa-sink.cc | 39 ++++++++++++--------- 3 files changed, 56 insertions(+), 17 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c new file mode 100644 index 00000000000..d3b79ca5803 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-sink-stats" } */ +void bar(); +int j; +void foo(int a, int b, int c, int d, int e, int f) +{ + int l; + l = a + b + c + d +e + f; + if (a != 5) + { + bar(); + j = l; + } +} +/* { dg-final { scan-tree-dump {l_12\s+=\s+_4\s+\+\s+f_11\(D\);\n\s+bar\s+\(\)} sink1 } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c new file mode 100644 index 00000000000..84e7938c54f --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-sink-stats" } */ +void bar(); +int j, x; +void foo(int a, int b, int c, int d, int e, int f) +{ + int l; + l = a + b + c + d +e + f; + if (a != 5) + { + bar(); + if (b != 3) + x = 3; + else + x = 5; + j = l; + } +} +/* { dg-final { scan-tree-dump {l_13\s+=\s+_4\s+\+\s+f_12\(D\);\n\s+bar\s+\(\)} sink1 } } */ diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc index a360c5cdd6e..95298bc8402 100644 --- a/gcc/tree-ssa-sink.cc +++ b/gcc/tree-ssa-sink.cc @@ -174,7 +174,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts) /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator tree, return the best basic block between them (inclusive) to place - statements. + statements. The best basic block should be an immediate dominator of + best basic block if the use stmt is after the call. We want the most control dependent block in the shallowest loop nest. @@ -196,6 +197,16 @@ select_best_block (basic_block early_bb, basic_block best_bb = late_bb; basic_block temp_bb = late_bb; int threshold; + /* Get the sinking threshold. If the statement to be moved has memory + operands, then increase the threshold by 7% as those are even more + profitable to avoid, clamping at 100%. */ + threshold = param_sink_frequency_threshold; + if (gimple_vuse (stmt) || gimple_vdef (stmt)) + { + threshold += 7; + if (threshold > 100) + threshold = 100; + } while (temp_bb != early_bb) { @@ -204,6 +215,14 @@ select_best_block (basic_block early_bb, if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb)) best_bb = temp_bb; + /* if we have temp_bb post dominated by use block block then immediate + * dominator would be our best block. */ + if (!gimple_vuse (stmt) + && bb_loop_depth (temp_bb) == bb_loop_depth (early_bb) + && !(temp_bb->count * 100 >= early_bb->count * threshold) + && dominated_by_p (CDI_DOMINATORS, late_bb, temp_bb)) + best_bb = temp_bb; + /* Walk up the dominator tree, hopefully we'll find a shallower loop nest. */ temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb); @@ -233,17 +252,6 @@ select_best_block (basic_block early_bb, && !dominated_by_p (CDI_DOMINATORS, best_bb->loop_father->latch, best_bb)) return early_bb; - /* Get the sinking threshold. If the statement to be moved has memory - operands, then increase the threshold by 7% as those are even more - profitable to avoid, clamping at 100%. */ - threshold = param_sink_frequency_threshold; - if (gimple_vuse (stmt) || gimple_vdef (stmt)) - { - threshold += 7; - if (threshold > 100) - threshold = 100; - } - /* If BEST_BB is at the same nesting level, then require it to have significantly lower execution frequency to avoid gratuitous movement. */ if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb) @@ -430,6 +438,7 @@ statement_sink_location (gimple *stmt, basic_block frombb, continue; break; } + use = USE_STMT (one_use); if (gimple_code (use) != GIMPLE_PHI) @@ -439,10 +448,7 @@ statement_sink_location (gimple *stmt, basic_block frombb, if (sinkbb == frombb) return false; - if (sinkbb == gimple_bb (use)) - *togsi = gsi_for_stmt (use); - else - *togsi = gsi_after_labels (sinkbb); + *togsi = gsi_after_labels (sinkbb); return true; } @@ -825,7 +831,6 @@ pass_sink_code::execute (function *fun) mark_dfs_back_edges (fun); memset (&sink_stats, 0, sizeof (sink_stats)); calculate_dominance_info (CDI_DOMINATORS); - virtual_operand_live vop_live; int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun));