From patchwork Fri Jul 30 05:20:07 2021
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 1511515
Subject: [PATCH v2] Make loops_list support an optional loop_p root
From: "Kewen.Lin"
To: Richard Biener
Cc: Bill Schmidt, GCC Patches, Segher Boessenkool
Date: Fri, 30 Jul 2021 13:20:07 +0800
Message-ID: <221d8a67-264a-b6a9-e705-bfb4a45f14bb@linux.ibm.com>
References: <0a8b77ba-1d54-1eff-b54d-d2cb1e769e09@linux.ibm.com> <61ac669c-7293-f53a-20c7-158b5a813cee@linux.ibm.com>

on 2021/7/29 4:01 PM, Richard Biener wrote:
> On Fri, Jul 23, 2021
> at 10:41 AM Kewen.Lin wrote:
>>
>> on 2021/7/22 8:56 PM, Richard Biener wrote:
>>> On Tue, Jul 20, 2021 at 4:37 PM Kewen.Lin wrote:
>>>>
>>>> Hi,
>>>>
>>>> This v2 has addressed some review comments/suggestions:
>>>>
>>>>   - Use "!=" instead of "<" in function operator!= (const Iter &rhs)
>>>>   - Add new CTOR loops_list (struct loops *loops, unsigned flags)
>>>>     to support loop hierarchy tree rather than just a function,
>>>>     and adjust to use loops* accordingly.
>>>
>>> I actually meant struct loop *, not struct loops * ;)  At the point
>>> we pondered to make loop invariant motion work on single
>>> loop nests we gave up not only but also because it iterates
>>> over the loop nest but all the iterators only ever can process
>>> all loops, not say, all loops inside a specific 'loop' (and
>>> including that 'loop' if LI_INCLUDE_ROOT).  So the
>>> CTOR would take the 'root' of the loop tree as argument.
>>>
>>> I see that doesn't trivially fit how loops_list works, at least
>>> not for LI_ONLY_INNERMOST.  But I guess FROM_INNERMOST
>>> could be adjusted to do ONLY_INNERMOST as well?
>>>
>>
>>
>> Thanks for the clarification!  I just realized that the previous
>> version with struct loops* is problematic, all traversal is
>> still bounded with outer_loop == NULL.  I think what you expect
>> is to respect the given loop_p root boundary.  Since we just
>> record the loops' nums, I think we still need the function* fn?
>
> Would it simplify things if we recorded the actual loop *?
>

I'm afraid it's unsafe to record the loop*.  I had the same question,
why the loop iterator uses an index rather than a loop*, when I first
read this code.  I guess the design of loop processing allows its user
to update or even delete the following loops to be visited.
For example, suppose the user transforms one loop by duplicating it and
its children somewhere else and then removing the originals.  When the
iteration later reaches those children, the "index" way checks their
validity through get_loop at that point, whereas the "loop *" way would
leave some recorded pointers dangling.  A dangling pointer cannot be
validated by itself, so it seems we would need a side linear search to
ensure validity.

> There's still the to_visit reserve which needs a bound on
> the number of loops for efficiency reasons.
>

Yes, I still keep the fn in the updated version.

>> So I add one optional argument loop_p root and update the
>> visiting codes accordingly.  Before this change, the previous
>> visiting uses the outer_loop == NULL as the termination condition,
>> it perfectly includes the root itself, but with this given root,
>> we have to use it as the termination condition to avoid to iterate
>> onto its possible existing next.
>>
>> For LI_ONLY_INNERMOST, I was thinking whether we can use the
>> code like:
>>
>>     struct loops *fn_loops = loops_for_fn (fn)->larray;
>>     for (i = 0; vec_safe_iterate (fn_loops, i, &aloop); i++)
>>         if (aloop != NULL
>>             && aloop->inner == NULL
>>             && flow_loop_nested_p (tree_root, aloop))
>>           this->to_visit.quick_push (aloop->num);
>>
>> it has the stable bound, but if the given root only has several
>> child loops, it can be much worse if there are many loops in fn.
>> It seems impossible to predict the given root loop hierarchy size,
>> maybe we can still use the original linear searching for the case
>> loops_for_fn (fn) == root?  But since this visiting seems not so
>> performance critical, I chose to share the code originally used
>> for FROM_INNERMOST, hope it can have better readability and
>> maintainability.
>
> I was indeed looking for something that has execution/storage
> bound on the subtree we're interested in.
> If we pull the CTOR out-of-line we can probably keep the linear
> search for LI_ONLY_INNERMOST when looking at the whole loop tree.
>

OK, I've moved the suggested single loop tree walker out-of-line to
cfgloop.c, and brought the linear search back for LI_ONLY_INNERMOST
when looking at the whole loop tree.

> It just seemed to me that we can eventually re-use a
> single loop tree walker for all orders, just adjusting the
> places we push.
>

Wow, good point!  Indeed, I have further unified the handling of all
orders into a single function, walk_loop_tree.

>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9,
>> x86_64-redhat-linux and aarch64-linux-gnu, also
>> bootstrapped on ppc64le P9 with bootstrap-O3 config.
>>
>> Does the attached patch meet what you expect?
>
> So yeah, it's probably close to what is sensible.  Not sure
> whether optimizing the loops for the !only_push_innermost_p
> case is important - if we manage to produce a single
> walker with conditionals based on 'flags' then IPA-CP should
> produce optimal clones as well I guess.
>

Thanks for the comments, the updated v2 is attached.
Compared with v1, it:

  - Unifies all orders into one single loop tree walker.
  - Moves walk_loop_tree out-of-line to cfgloop.c.
  - Keeps the linear search for LI_ONLY_INNERMOST with
    tree_root of fn loops.
  - Uses class loop * instead of loop_p.

Bootstrapped & regtested on powerpc64le-linux-gnu Power9, both with
and without the hunk for the LI_ONLY_INNERMOST linear search (without
it, the testing exercises LI_ONLY_INNERMOST in walk_loop_tree).

Is it ok for trunk?

BR,
Kewen
-----
gcc/ChangeLog:

	* cfgloop.h (loops_list::loops_list): Add one optional argument
	root and adjust accordingly, update loop tree walking and factor
	out to ...
	* cfgloop.c (loops_list::walk_loop_tree): ...this.  New function.
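
The point made above about recording loop numbers rather than loop
pointers can be illustrated outside GCC with a small standalone sketch.
It does not use GCC's data structures: Loop, LoopTable and visit_live
are hypothetical names invented for this illustration, and
LoopTable::get only plays a role similar in spirit to get_loop, namely
returning null for a loop that has been removed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a loop; this is not GCC's class loop.
struct Loop {
  int num;
};

// Hypothetical stand-in for a per-function loop array: slot i owns
// loop number i, or holds null once that loop has been removed.
struct LoopTable {
  std::vector<std::unique_ptr<Loop>> slots;

  // Create a new loop and return its number.
  int add() {
    int num = static_cast<int>(slots.size());
    slots.push_back(std::unique_ptr<Loop>(new Loop{num}));
    return num;
  }

  // Delete a loop; its number stays reserved but now maps to null.
  void remove(int num) {
    assert(num >= 0 && num < static_cast<int>(slots.size()));
    slots[num].reset();
  }

  // Look a loop up by number; null means it no longer exists.
  Loop *get(int num) const {
    bool in_range = num >= 0 && num < static_cast<int>(slots.size());
    return in_range ? slots[num].get() : nullptr;
  }
};

// Visit previously recorded loop numbers, skipping any loop that was
// removed after the list was built.  The lookup is the validity check.
std::vector<int> visit_live(const LoopTable &table,
                            const std::vector<int> &to_visit) {
  std::vector<int> seen;
  for (int num : to_visit)
    if (Loop *loop = table.get(num))
      seen.push_back(loop->num);
  return seen;
}
```

If three loops are added, their numbers recorded, and the middle one is
removed before visiting, visit_live simply skips it.  A list of raw
Loop * recorded at the same time would instead hold a dangling pointer
with no comparably cheap way to detect that.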
---
 gcc/cfgloop.c | 64 +++++++++++++++++++++++++++++++++++
 gcc/cfgloop.h | 92 ++++++++++++++++++---------------------------------
 2 files changed, 97 insertions(+), 59 deletions(-)

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 6284ae292b6..acdb4ed14c8 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -2104,3 +2104,67 @@ mark_loop_for_removal (loop_p loop)
   loop->latch = NULL;
   loops_state_set (LOOPS_NEED_FIXUP);
 }
+
+/* Starting from loop tree ROOT, walk loop tree as the visiting
+   order specified by FLAGS, skipping the loop with number MN.
+   The supported visiting orders are:
+     - LI_ONLY_INNERMOST
+     - LI_FROM_INNERMOST
+     - Preorder (if neither of above is specified)  */
+
+void
+loops_list::walk_loop_tree (class loop *root, unsigned flags, int mn)
+{
+  bool only_innermost_p = flags & LI_ONLY_INNERMOST;
+  bool from_innermost_p = flags & LI_FROM_INNERMOST;
+  bool preorder_p = !(only_innermost_p || from_innermost_p);
+
+  /* Early handle root without any inner loops, make later
+     processing simpler, that is all loops processed in the
+     following while loop are impossible to be root.  */
+  if (!root->inner)
+    {
+      if (root->num != mn)
+        this->to_visit.quick_push (root->num);
+      return;
+    }
+
+  class loop *aloop;
+  for (aloop = root;
+       aloop->inner != NULL;
+       aloop = aloop->inner)
+    {
+      if (preorder_p && aloop->num != mn)
+        this->to_visit.quick_push (aloop->num);
+      continue;
+    }
+
+  while (1)
+    {
+      gcc_assert (aloop != root);
+      if (from_innermost_p || aloop->inner == NULL)
+        this->to_visit.quick_push (aloop->num);
+
+      if (aloop->next)
+        {
+          for (aloop = aloop->next;
+               aloop->inner != NULL;
+               aloop = aloop->inner)
+            {
+              if (preorder_p)
+                this->to_visit.quick_push (aloop->num);
+              continue;
+            }
+        }
+      else if (loop_outer (aloop) == root)
+        break;
+      else
+        aloop = loop_outer (aloop);
+    }
+
+  /* When visiting from innermost, we need to consider root here
+     since the previous loop doesn't handle it.  */
+  if (from_innermost_p && root->num != mn)
+    this->to_visit.quick_push (root->num);
+}
+
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index d5eee6b4840..3046bf713bb 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -669,13 +669,15 @@ as_const (T &t)
 }
 
 /* A list for visiting loops, which contains the loop numbers instead of
-   the loop pointers.  The scope is restricted in function FN and the
-   visiting order is specified by FLAGS.  */
+   the loop pointers.  If the loop ROOT is offered (non-null), the visiting
+   will start from it, otherwise it would start from the tree_root of
+   loops_for_fn (FN) instead.  The scope is restricted in function FN and
+   the visiting order is specified by FLAGS.  */
 
 class loops_list
 {
 public:
-  loops_list (function *fn, unsigned flags);
+  loops_list (function *fn, unsigned flags, class loop *root = nullptr);
 
   template <typename T> class Iter
   {
@@ -750,6 +752,10 @@ public:
   }
 
 private:
+  /* Walk loop tree starting from ROOT as the visiting order specified
+     by FLAGS, skipping the loop with number MN.  */
+  void walk_loop_tree (class loop *root, unsigned flags, int mn);
+
   /* The function we are visiting.  */
   function *fn;
 
@@ -782,76 +788,44 @@ loops_list::Iter<T>::fill_curr_loop ()
 }
 
 /* Set up the loops list to visit according to the specified
-   function scope FN and iterating order FLAGS.  */
+   function scope FN and iterating order FLAGS.  If ROOT is
+   not null, the visiting would start from it, otherwise it
+   will start from tree_root of loops_for_fn (FN).  */
 
-inline loops_list::loops_list (function *fn, unsigned flags)
+inline loops_list::loops_list (function *fn, unsigned flags, class loop *root)
 {
-  class loop *aloop;
-  unsigned i;
-  int mn;
+  struct loops *loops = loops_for_fn (fn);
+  gcc_assert (!root || loops);
+
+  /* Check mutually exclusive flags should not co-exist.  */
+  unsigned checked_flags = LI_ONLY_INNERMOST | LI_FROM_INNERMOST;
+  gcc_assert ((flags & checked_flags) != checked_flags);
 
   this->fn = fn;
-  if (!loops_for_fn (fn))
+  if (!loops)
     return;
 
+  class loop *tree_root = root ? root : loops->tree_root;
+
   this->to_visit.reserve_exact (number_of_loops (fn));
-  mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
+  int mn = (flags & LI_INCLUDE_ROOT) ? -1 : tree_root->num;
 
-  if (flags & LI_ONLY_INNERMOST)
+  /* When root is tree_root of loops_for_fn (fn) and the visiting
+     order is LI_ONLY_INNERMOST, we would like to use linear
+     search here since it has a more stable bound than the
+     walk_loop_tree.  */
+  if (flags & LI_ONLY_INNERMOST && tree_root == loops->tree_root)
     {
-      for (i = 0; vec_safe_iterate (loops_for_fn (fn)->larray, i, &aloop); i++)
+      class loop *aloop;
+      unsigned int i;
+      for (i = 0; vec_safe_iterate (loops->larray, i, &aloop); i++)
         if (aloop != NULL
             && aloop->inner == NULL
-            && aloop->num >= mn)
+            && aloop->num != mn)
           this->to_visit.quick_push (aloop->num);
     }
-  else if (flags & LI_FROM_INNERMOST)
-    {
-      /* Push the loops to LI->TO_VISIT in postorder.  */
-      for (aloop = loops_for_fn (fn)->tree_root;
-           aloop->inner != NULL;
-           aloop = aloop->inner)
-        continue;
-
-      while (1)
-        {
-          if (aloop->num >= mn)
-            this->to_visit.quick_push (aloop->num);
-
-          if (aloop->next)
-            {
-              for (aloop = aloop->next;
-                   aloop->inner != NULL;
-                   aloop = aloop->inner)
-                continue;
-            }
-          else if (!loop_outer (aloop))
-            break;
-          else
-            aloop = loop_outer (aloop);
-        }
-    }
   else
-    {
-      /* Push the loops to LI->TO_VISIT in preorder.  */
-      aloop = loops_for_fn (fn)->tree_root;
-      while (1)
-        {
-          if (aloop->num >= mn)
-            this->to_visit.quick_push (aloop->num);
-
-          if (aloop->inner != NULL)
-            aloop = aloop->inner;
-          else
-            {
-              while (aloop != NULL && aloop->next == NULL)
-                aloop = loop_outer (aloop);
-              if (aloop == NULL)
-                break;
-              aloop = aloop->next;
-            }
-        }
-    }
+    walk_loop_tree (tree_root, flags, mn);
 }
 
 /* The properties of the target.  */
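
For readers who want to experiment with the unified walker outside of
GCC, the following is a minimal standalone sketch modelled on
walk_loop_tree from the patch.  It is an illustration under stated
assumptions, not the patch itself: Node, add_child, walk, and the two
flag constants are hypothetical names, a bool include_root replaces the
MN loop-number argument, and children are appended in insertion order,
which may differ from how GCC links sibling loops.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flags mirroring LI_FROM_INNERMOST and LI_ONLY_INNERMOST.
const unsigned FROM_INNERMOST = 1u;
const unsigned ONLY_INNERMOST = 2u;

// Hypothetical tree node with the same three links the walker relies on.
struct Node {
  int num;
  Node *outer;  // parent
  Node *inner;  // first child
  Node *next;   // next sibling
  explicit Node(int n) : num(n), outer(nullptr), inner(nullptr), next(nullptr) {}
};

// Append CHILD as the last child of PARENT (assumption for this sketch).
void add_child(Node &parent, Node &child) {
  child.outer = &parent;
  Node **slot = &parent.inner;
  while (*slot)
    slot = &(*slot)->next;
  *slot = &child;
}

// One walker for all three orders; only the push sites differ.
// The walk never leaves the subtree rooted at ROOT.
std::vector<int> walk(Node *root, unsigned flags, bool include_root) {
  const unsigned both = FROM_INNERMOST | ONLY_INNERMOST;
  assert(root != nullptr && (flags & both) != both);
  const bool only_innermost = (flags & ONLY_INNERMOST) != 0;
  const bool from_innermost = (flags & FROM_INNERMOST) != 0;
  const bool preorder = !(only_innermost || from_innermost);
  std::vector<int> out;

  // A root without children is handled up front, so the main loop
  // below never sees the root itself.
  if (!root->inner) {
    if (include_root)
      out.push_back(root->num);
    return out;
  }

  // Descend to the first innermost node; preorder pushes on the way down.
  Node *n = root;
  for (; n->inner; n = n->inner)
    if (preorder && (n != root || include_root))
      out.push_back(n->num);

  for (;;) {
    // Leaves are pushed in every order; non-leaves only in postorder.
    if (from_innermost || !n->inner)
      out.push_back(n->num);

    if (n->next) {
      for (n = n->next; n->inner; n = n->inner)
        if (preorder)
          out.push_back(n->num);
    } else if (n->outer == root) {
      break;  // stop at the root boundary, ignoring the root's siblings
    } else {
      n = n->outer;
    }
  }

  // Postorder visits the root last.
  if (from_innermost && include_root)
    out.push_back(root->num);
  return out;
}
```

Walking from an inner node instead of the top-level root stays within
that node's subtree, because the loop terminates when the parent of the
current node is the given root rather than when the parent is null.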