From patchwork Tue Oct 15 14:59:07 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Teresa Johnson X-Patchwork-Id: 283668 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 961FF2C0196 for ; Wed, 16 Oct 2013 01:59:18 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; q=dns; s=default; b=tffNVIPqHeEMuukGTp KDGoOOjuThG6SRzXOUK+m5dNShh5tfjJeQeoTo4eG1cYO3mQF2m1Plwo2mrk1wqb zLR1UKOXkdQEpgnOK5L/x7fVzaoCRnh+JnI7sDtetR7gWmVvtFWRvOuzQHtHpwgv NOrVnb0a7SnoiaSIMVZUq6y/Q= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; s=default; bh=A2euPrSk1UdE0a/kkfs8iY93 mek=; b=FjQbI7ihR9HFSEtYfDtueyMpxmph04QWW0oKOBYu1pp6aF1z7IVmXE/8 TZoFRocm8CxjIyzi8htC3O3cPtLfdvZQU5GXhTlHPKfKXFOv8hIua7ByPo392h2h vs2a+1mSPWaG0T20SVmtO+SF3WZYi73eeZnUtOrhAAz5lbOt0tU= Received: (qmail 28595 invoked by alias); 15 Oct 2013 14:59:11 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 28586 invoked by uid 89); 15 Oct 2013 14:59:10 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.6 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-qe0-f54.google.com Received: from mail-qe0-f54.google.com (HELO mail-qe0-f54.google.com) (209.85.128.54) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Tue, 15 Oct 2013 14:59:09 +0000 Received: by mail-qe0-f54.google.com with SMTP id 1so6256188qec.41 for ; Tue, 15 Oct 2013 07:59:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=Dn3JjxJvCadWJqW0DPyWyl8ZQsDS0DJ9nb7j4jlia9Q=; b=dLHwEtl3nBxxs6RMykzDRBPee0uEIDMh46fJUFS0GsCg4WHAdhl/wn18R5ucDiZRFP 4MlvnjFQXy1ejnWhBapIzYoc6igDFE0nxT1RyM0kATmRhnGw8egBtQ3MuyoY1ZowQDuW pTvpQlSswEppaE+CAhMFAzyl+IZ4xkD4FN4hBvW8OwFA9lgBSBiuX6sLNmUvQge8tpIl mFbHWxa/2iJH3Och5onuIHjiYU3hCq9C/fYxYe3j0JrKB11YZ8DDtnXTSoKPYhTa+MDN 2HwHHMBjBJICm/V6PYILrXZFnoq4jo7nMoyRGHp2SObV4ZVyQD8H7iSIGzFUiqiTUy70 o/ZQ== X-Gm-Message-State: ALoCoQl+aWNlQADBbp6B/W8XCsWPnFLACJrM+Z54z45q2USC9psyAO31POrW1gwMExJlyJCKuEFUY1Cu1oQjTXsK5ekCp2Y4j/AVSdXPB+xfJP9Laa7o1MfG4X48ET8/Ll1HniAHdEoqKb/SomPOp+d5iDQXQYHHV1bnhvuLyKVlzMTzHnkz6/AzngT1AClz7y44Iw1BSaOfWIgR4+Pw0VlsWf5SUcCUew== MIME-Version: 1.0 X-Received: by 10.224.29.131 with SMTP id q3mr4464299qac.37.1381849147232; Tue, 15 Oct 2013 07:59:07 -0700 (PDT) Received: by 10.49.82.230 with HTTP; Tue, 15 Oct 2013 07:59:07 -0700 (PDT) In-Reply-To: References: <20131014194908.GA16717@kam.mff.cuni.cz> <20131014215027.GD17422@kam.mff.cuni.cz> Date: Tue, 15 Oct 2013 07:59:07 -0700 Message-ID: Subject: Re: [PATCH] decide edge's hotness when there is profile info From: Teresa Johnson To: Dehao Chen Cc: Jan Hubicka , Xinliang David Li , GCC Patches X-IsSubscribed: yes On Mon, Oct 14, 2013 at 3:26 PM, Dehao Chen wrote: > On Mon, Oct 14, 2013 at 2:50 PM, Jan Hubicka wrote: >>> For my test case, the entire inline instance is optimized away, so >>> there is no info about it in the profile. I can do some fixup in the >>> rebuild_cgraph_edge though. >> >> Yep, I understand that. In this case we should turn PROFILE_READ to PROFILE_GUESSED >> and guess the profile when we detect this (i.e. we have edges with non-0 counts into >> functions with 0 profile). That should prvent these from getting UNLIKELY_EXECUTED >> and they will be inlined normal way. > > Oh, actually in AutoFDO, only functions with samples will be marked as > PROFILE_READ. Others will all be marked as PROFILE_GUESSED. Here is Honza's patch that he was referring to: Which is discussed in this email: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01185.html For COMDATs I need to extend this to do it a little later to do it recursively to catch the case of COMDATs feeding other COMDATs and I need to do some other handling to compute counts from the frequencies when inlining. I have been meaning to work on this for awhile but finally am getting to it this week. (Here's the last message from a later thread that forked off the above one: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01907.html) In the meantime, perhaps Honza's patch will suffice? Teresa > > Dehao > >> >> Honza >>> >>> Dehao >>> >>> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li wrote: >>> > Is it possible to update the callee node summary after profile >>> > annotate (using information from inline instances which are not >>> > inlined in early inline)? >>> > >>> > David >>> > >>> > On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen wrote: >>> >> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka wrote: >>> >>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this >>> >>>> could be a potential risk because some callee is marked unlikely >>> >>>> executed simply because they are inlined and eliminated in the O2 >>> >>>> binary. But in ipa-inline it will not get inlined because the edge is >>> >>>> not hot from cgraph_maybe_hot_edge_p (because callee is >>> >>>> UNLIKELY_EXECUTED), while the edge->count is actually hot. >>> >>> >>> >>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead? >>> >>> It seems that having profile set incorrectly will lead to other problems later, too. >>> >>> We discussed similar problem with Teresa about the missing profiles for comdat, >>> >>> basically one should detect these cases as profile being lost and go with guessed >>> >>> profile. (I believe patch for that was posted, too, and so far it seems best approach >>> >>> to this issue) >>> >> >>> >> The current AutoFDO implementation will take all functions that do not >>> >> have have profile as normally executed, thus use guessed profile for >>> >> it. This is like using profile for truly hot functions, and using O2 >>> >> for other functions. This works fine. However, it leads to larger code >>> >> size (approximately 10%~20% larger than FDO). >>> >> >>> >> I'd like to introduce another mode for users who care about both >>> >> performance and code size, and can be sure that profile is >>> >> representative. In this mode, we will mark all functions without >>> >> sample as "unlikely executed". However, because AutoFDO use debug info >>> >> (of optimized code) to represent profile, it's possible that some hot >>> >> functions (say foo) are inlined and fully eliminated into another hot >>> >> function (say bar). So in the profile, bar is cold, and because the >>> >> profile for foo::bar is eliminated, bar will not be inlined into foo >>> >> before the profile annotation. However, after profile annotate, we can >>> >> infer from the bb count that foo->bar is hot, thus it should be >>> >> inlined in ipa-inline phase. However, because bar itself is marked >>> >> UNLIKELY_EXECUTED, it will not be inlined. >>> >> >>> >> One possible workaround would be that during rebuild_cgraph_edges, if >>> >> we find an edge's callee is unlikely executed, add the edge count to >>> >> the callee's count and recalculate callee's frequency. >>> >> >>> >> Dehao >>> >> >>> >>> >>> >>> Honza Index: tree-profile.c =================================================================== --- tree-profile.c (revision 201838) +++ tree-profile.c (working copy) @@ -604,6 +604,34 @@ pop_cfun (); } + /* See if 0 count function has non-0 count callers. In this case we + lost some profile. Drop its function profile to PROFILE_GUESSED. */ + FOR_EACH_DEFINED_FUNCTION (node) + { + struct cgraph_edge *e; + bool called = false; + if (node->count) + continue; + for (e = node->callers; e; e = e->next_caller) + { + if (e->count) + called = true; + if (cgraph_maybe_hot_edge_p (e)) + break; + } + if (e || called + && profile_status_for_function + (DECL_STRUCT_FUNCTION (node->symbol.decl)) == PROFILE_READ) + { + if (dump_file) + fprintf (stderr, "Dropping 0 profile for %s/%i.%s based on calls.\n", + cgraph_node_name (node), node->symbol.order, + e ? "function is hot" : "function is normal"); + profile_status_for_function (DECL_STRUCT_FUNCTION (node->symbol.decl)) + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); + node->frequency = e ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; + } + } del_node_map(); return 0; Index: predict.c =================================================================== --- predict.c (revision 201838) +++ predict.c (working copy) @@ -2715,6 +2715,9 @@ gcov_type count_max, true_count_max = 0; basic_block bb; + if (!ENTRY_BLOCK_PTR->count) + return 0; + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb) true_count_max = MAX (bb->count, true_count_max);