From patchwork Sun Sep 2 19:35:10 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Matz X-Patchwork-Id: 181228 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 69B652C008A for ; Mon, 3 Sep 2012 05:35:30 +1000 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1347219331; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Date: From:To:Subject:Message-ID:User-Agent:MIME-Version:Content-Type: Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:Sender:Delivered-To; bh=J5C8+8hEK0FBf+kj45JD GSXk8iQ=; b=h1csDB+8X23WWBpPLOysiqw7Mbtp6u7d+oufnDe6HH4ARO8zD7nF mO393nJdTNkdGMkI5ttPxO4MSzDe2krwcuUzpRC+8WUOpEckGBpzcCfBtI8DKrmo uith0HYqczcxLdXD1U8Nmj1/dX1R0A4rHKfoNWJK5bvLZjiquttClbg= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Date:From:To:Subject:Message-ID:User-Agent:MIME-Version:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=j0Euw2nSMV7oYRFAqsU92FO/WRO4RLABKEuVS20jvFPegWW1BBZA2znspWymiv dLz0rAp4+B0dvUHlwsnaYnDdNew4+kMDqGphJvV/zCcB09cb1G3Hryw5+L7E2HC+ jMx6CX+c7+FruDsAGsA9klsQFJqYCgOMoaD0dYSLeZbb4=; Received: (qmail 26089 invoked by alias); 2 Sep 2012 19:35:27 -0000 Received: (qmail 25595 invoked by uid 22791); 2 Sep 2012 19:35:26 -0000 X-SWARE-Spam-Status: No, hits=-5.3 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_UNTRUST, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 02 Sep 2012 19:35:12 +0000 Received: from relay2.suse.de (unknown [195.135.220.254]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id D1DB1A3D8B for ; Sun, 2 Sep 2012 21:35:10 +0200 (CEST) Date: Sun, 2 Sep 2012 21:35:10 +0200 (CEST) From: Michael Matz To: gcc-patches@gcc.gnu.org Subject: Speedup loop header copying [part of PR 46590] Message-ID: User-Agent: Alpine 2.00 (LNX 1167 2008-08-23) MIME-Version: 1.0 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi, as the bug report tells us one speed problem is loop header copying, in particular the update_ssa call that is done for each and every copied loop header but touches all blocks in a function. Now, one idea was to use an optimized update_ssa that works only on the relevant subset of blocks (it's dominance frontiers that are the problem). I've experimented with the original formulation of frontiers as per Cytron which allows to calculate the domfrontier of one basic block lazily. The end result was none, no speedup, no slowdown. I haven't investigated, but I guess the problem is that too often most of the blocks are relevant for most of the header copies, either because of virtual ops or because calculating the domfrontier of a block needs all domfrontiers of all dominated childs (i.e. the domfrontier of the entry needs domfrontiers of all blocks). So, no cake there. But actually there's no reason that we need to keep SSA form uptodate during the multiple header copyings. We use gimple_duplicate_sese_region to do the copying which updates the SSA web before returning (actually loop header copying is the only caller of it). The next thing done is just another call to gimple_duplicate_sese_region to copy some other BBs, then some split edges, repeat from start. We can just as well defer the whole SSA web updating to after we've duplicated everything we want. That's what this patch does. Time for various things on the testcase (with -O1): without with patch tree SSA rewrite 26.2s 4.8s tree SSA incremental 21.7s 4.6s dominance computation 15.0s 4.2s dominance frontiers 25.6s 6.7s TOTAL 135.6s 67.8s Regstrapped on x86_64-linux, no regressions. Okay for trunk? Ciao, Michael. ------------------ PR tree-optimization/46590 * tree-cfg.c (gimple_duplicate_sese_region): Don't update SSA web here ... * tree-ssa-loop-ch.c (copy_loop_headers): ... but here. Index: tree-cfg.c =================================================================== --- tree-cfg.c (revision 190803) +++ tree-cfg.c (working copy) @@ -5530,9 +5530,10 @@ add_phi_args_after_copy (basic_block *re important exit edge EXIT. By important we mean that no SSA name defined inside region is live over the other exit edges of the region. All entry edges to the region must go to ENTRY->dest. The edge ENTRY is redirected - to the duplicate of the region. SSA form, dominance and loop information - is updated. The new basic blocks are stored to REGION_COPY in the same - order as they had in REGION, provided that REGION_COPY is not NULL. + to the duplicate of the region. Dominance and loop information is + updated, but not the SSA web. The new basic blocks are stored to + REGION_COPY in the same order as they had in REGION, provided that + REGION_COPY is not NULL. The function returns false if it is unable to copy the region, true otherwise. */ @@ -5593,8 +5594,6 @@ gimple_duplicate_sese_region (edge entry free_region_copy = true; } - gcc_assert (!need_ssa_update_p (cfun)); - /* Record blocks outside the region that are dominated by something inside. */ doms = NULL; @@ -5663,9 +5662,6 @@ gimple_duplicate_sese_region (edge entry /* Add the other PHI node arguments. */ add_phi_args_after_copy (region_copy, n_region, NULL); - /* Update the SSA web. */ - update_ssa (TODO_update_ssa); - if (free_region_copy) free (region_copy); Index: tree-ssa-loop-ch.c =================================================================== --- tree-ssa-loop-ch.c (revision 190803) +++ tree-ssa-loop-ch.c (working copy) @@ -241,6 +241,7 @@ copy_loop_headers (void) split_edge (loop_latch_edge (loop)); } + update_ssa (TODO_update_ssa); free (bbs); free (copied_bbs);