From patchwork Wed Jan 13 04:18:13 2016
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 566816
To: gcc-patches
From: Jeff Law
Subject: [PATCH][PR tree-optimization/pr67755] Fix profile insanity adjustments
Message-ID: <5695D005.7040508@redhat.com>
Date: Tue, 12 Jan 2016 21:18:13 -0700

tree-ssa-threadupdate.c has code to compensate for situations where
threading will produce "profile insanities".  Specifically, if we have
multiple jump thread paths through a common joiner to different final
targets, then we can end up increasing the counts on one path to
compensate for counts that will get lost when threading the later path.

I feel that code is papering over the real issue elsewhere in the
threader's profile update code, but I'm not sure we can fix that without
some significant revamping, which is probably riskier than we'd like at
this stage.  So this patch detects more accurately when those
adjustments may be needed, rather than just blindly applying them all
the time.  The net result is more accurate counts/probabilities in cases
where there is a single destination for one or more jump threads through
a common join block, as is the case in the BZ.
I also ran some older GCC .i files and saw it making similar fixes in a
half-dozen or so files via that test.  Additionally, I verified that
there are no new "Invalid sum of..." messages across a blob of .i files
I had lying around with the patch installed, for both the vrp and dom
passes.  And, of course, the usual bootstrap and regression testing on
x86_64-linux-gnu and verification of the testcase using a ppc64 cross
compiler.

Installed on the trunk.

Jeff

commit f1af94e0b32743de707f7f872975034ff1a56b64
Author: law
Date:   Wed Jan 13 04:17:36 2016 +0000

    [PATCH][PR tree-optimization/pr67755] Fix profile insanity adjustments

    	PR tree-optimization/pr67755
    	* tree-ssa-threadupdate.c (struct ssa_local_info_t): Add new field
    	"need_profile_correction".
    	(thread_block_1): Initialize new field to false by default.  If we
    	have multiple thread paths through a common joiner to different
    	final targets, then set new field to true.
    	(compute_path_counts): Only do count adjustment when it's really
    	needed.

    	PR tree-optimization/67755
    	* gcc.dg/tree-ssa/pr67755.c: New test.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@232313 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6423a37..f8f818b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2016-01-12  Jeff Law
+
+	PR tree-optimization/pr67755
+	* tree-ssa-threadupdate.c (struct ssa_local_info_t): Add new field
+	"need_profile_correction".
+	(thread_block_1): Initialize new field to false by default.  If we
+	have multiple thread paths through a common joiner to different
+	final targets, then set new field to true.
+	(compute_path_counts): Only do count adjustment when it's really
+	needed.
+
 2016-01-12  Sandra Loosemore
 
 	* doc/invoke.texi (Spec Files): Move section down in file, past

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 11f3b0c..240fae7 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,4 +1,9 @@
-2015-01-13  Thomas Preud'homme
+2016-01-13  Jeff Law
+
+	PR tree-optimization/67755
+	* gcc.dg/tree-ssa/pr67755.c: New test.
+
+2016-01-13  Thomas Preud'homme
 
 	* gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Replace
 	static pass number in output by a star.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr67755.c b/gcc/testsuite/gcc.dg/tree-ssa/pr67755.c
new file mode 100644
index 0000000..64ffd0b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr67755.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-dom2-details-blocks" } */
+/* We want to verify no outgoing edge from a conditional
+   has a probability of 100%.  */
+/* { dg-final { scan-tree-dump-not "succ:\[ \]+. .100.0%. .\(TRUE|FALSE\)_VALUE" "dom2"} } */
+
+
+void (*zend_block_interruptions) (void);
+
+int * _zend_mm_alloc_int (int * heap, long int size)
+{
+  int *best_fit;
+  long int true_size = (size < 15 ? 32 : size);
+
+  if (zend_block_interruptions)
+    zend_block_interruptions ();
+
+  if (__builtin_expect ((true_size < 543), 1))
+    best_fit = heap + 2;
+  else
+    best_fit = heap;
+
+  return best_fit;
+}
+

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 1bf9ae6..4783c4b 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -239,6 +239,11 @@ struct ssa_local_info_t
 
   /* Blocks duplicated for the thread.  */
   bitmap duplicate_blocks;
+
+  /* When we have multiple paths through a joiner which reach different
+     final destinations, then we may need to correct for potential
+     profile insanities.  */
+  bool need_profile_correction;
 };
 
 /* Passes which use the jump threading code register jump threading
@@ -826,7 +831,8 @@ compute_path_counts (struct redirection_data *rd,
      So ensure that this path's path_out_count is at least the
      difference between elast->count and nonpath_count.  Otherwise the edge
      counts after threading will not be sane.  */
-  if (has_joiner && path_out_count < elast->count - nonpath_count)
+  if (local_info->need_profile_correction
+      && has_joiner && path_out_count < elast->count - nonpath_count)
     {
       path_out_count = elast->count - nonpath_count;
       /* But neither can we go above the minimum count along the path
@@ -1492,6 +1498,7 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
   ssa_local_info_t local_info;
 
   local_info.duplicate_blocks = BITMAP_ALLOC (NULL);
+  local_info.need_profile_correction = false;
 
   /* To avoid scanning a linear array for the element we need
      we instead use a hash table.  For normal code there should be
     no noticeable
@@ -1502,6 +1509,7 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
 
   /* Record each unique threaded destination into a hash table for
      efficient lookups.  */
+  edge last = NULL;
   FOR_EACH_EDGE (e, ei, bb->preds)
     {
       if (e->aux == NULL)
@@ -1555,6 +1563,17 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
       /* Insert the outgoing edge into the hash table if it is not
 	 already in the hash table.  */
       lookup_redirection_data (e, INSERT);
+
+      /* When we have thread paths through a common joiner with different
+	 final destinations, then we may need corrections to deal with
+	 profile insanities.  See the big comment before
+	 compute_path_counts.  */
+      if ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK)
+	{
+	  if (!last)
+	    last = e2;
+	  else if (e2 != last)
+	    local_info.need_profile_correction = true;
+	}
     }
 
   /* We do not update dominance info.  */