From patchwork Mon Dec 21 13:13:26 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Lawrence X-Patchwork-Id: 559547 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 2E457140313 for ; Tue, 22 Dec 2015 00:14:41 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=ZiWtRG0/; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:in-reply-to:references; q=dns; s= default; b=IXYsWEd1tBShDI6cgC6nmjubbpKjp+biO1DRDVv7kwrbEGqsot/Zr Q7V4CVoIfGyDdxpLaNu5+ruGNtHwUb8HvF/16vGroWI5Y+sqWwOgM+N6FvRYD3my RVTBFpbQEh3cCqaA5j0AcuRDUDr0DZJ5aZLPdsH/l01v1bXimWEU9k= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:in-reply-to:references; s=default; bh=RU24cT0MxLOB4J4nmo1LZ5FYrLg=; b=ZiWtRG0/INxd9Qro276sxPrCX7NV GXRjoMJ0dmyTusasasl+bbgZ2rDisVZIn5SetH+tVpPoDijy1DrvMSjwelzoq5Xu k7B8f5bKL2Bun20yRzgOVL/XkWLnHg9vCFrcm0ZWlMNTp0UQvUoWt+vMORyMgBsK ctZPE2yOJlUTJ8Q= Received: (qmail 78336 invoked by alias); 21 Dec 2015 13:14:06 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 78261 invoked by uid 89); 21 Dec 2015 13:14:06 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=fa, evolution, max_size, concerns X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 21 Dec 2015 13:14:04 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id ACF67535 for ; Mon, 21 Dec 2015 05:13:36 -0800 (PST) Received: from arm.com (e105915-lin.emea.arm.com [10.2.206.30]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A88023F21A for ; Mon, 21 Dec 2015 05:14:02 -0800 (PST) From: Alan Lawrence To: gcc-patches@gcc.gnu.org Subject: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c Date: Mon, 21 Dec 2015 13:13:26 +0000 Message-Id: <1450703608-8617-3-git-send-email-alan.lawrence@arm.com> In-Reply-To: <1450703608-8617-1-git-send-email-alan.lawrence@arm.com> References: <1450703608-8617-1-git-send-email-alan.lawrence@arm.com> X-IsSubscribed: yes This is a respin of patches https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03266.html and https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03267.html, which were "too quickly" approved before concerns with efficiency were pointed out. I tried to change the hashing just in tree-ssa-dom.c using C++ subclassing, but couldn't cleanly separate this out from tree-ssa-scopedtables and tree-ssa-threadedge.c due to use of avail_exprs_stack. So I figured it was probably appropriate to use the equivalences in jump threading too. Also, using get_ref_base_and_extent unifies handling of MEM_REFs and ARRAY_REFs (hence only one patch rather than two). I've added a couple of testcases that show the improvement in DOM, but in all cases I had to disable FRE, even PRE, to get any improvement, apart from on ssa-dom-cse-2.c itself (where on the affected platforms FRE still does not do the optimization). This makes me wonder if this is the right approach or whether changing the references output by SRA (as per https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01490.html , judged as a hack to SRA to work around limitations in DOM - or is it?) would be better. Also, this causes gcc to miss vectorization of pr65947-2.c, because the additional equivalents spotted in dom1 lead to an extra jump-thread (partially peeling the loop's first iter, as this has become entirely predictable); loop-header-copying (ch_vect) then kicks in, restoring the loop structure but leading to: i_49 = 1; ivtmp_46 = 253; if (ivtmp_46 != 0) goto ; else goto ; : <--- OUTSIDE LOOP : <--- LOOP HEADER # i_53 = PHI and scalar evolution is unable to analyze i_49 (defined outside the loop) even though it's equal to a constant (analyze_scalar_evolution_1, test of flow_bb_inside_loop_p). This is fixed in the next patch... gcc/ChangeLog: * tree-ssa-scopedtables.c (avail_expr_hash): Hash MEM_REF and ARRAY_REF using get_ref_base_and_extent. (equal_mem_array_ref_p): New. (hashable_expr_equal_p): Add call to previous. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ssa-dom-cse-5.c: New. * gcc.dg/tree-ssa/ssa-dom-cse-6.c: New. * gcc.dg/tree-ssa/ssa-dom-cse-7.c: New. --- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c | 18 +++++++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c | 16 +++++++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c | 19 ++++++++ gcc/tree-ssa-scopedtables.c | 67 +++++++++++++++++++++++++-- 4 files changed, 117 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c new file mode 100644 index 0000000..cd38d3e --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-5.c @@ -0,0 +1,18 @@ +/* Test normalization of ARRAY_REF expressions to MEM_REFs in dom. */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-fre -fdump-tree-dom2" } */ + +#define N 8 + +int +main (int argc, char **argv) +{ + int a[N]; + for (int i = 0; i < N; i++) + a[i] = 2*i + 1; + int *p = &a[0]; + __builtin_printf ("%d\n", a[argc]); + return *(++p); +} + +/* { dg-final { scan-tree-dump-times "return 3;" 1 "dom2"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c new file mode 100644 index 0000000..b0c5138 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-6.c @@ -0,0 +1,16 @@ +/* Test normalization of ARRAY_REF expressions to MEM_REFs in dom. */ +/* { dg-do compile } */ +/* { dg-options "-O1 -fno-tree-fre -fdump-tree-dom1" } */ + +int +main (int argc, char **argv) +{ + union { + int a[8]; + int b[2]; + } u = { .a = { 1, 42, 3, 4, 5, 6, 7, 8 } }; + __builtin_printf ("%d\n", u.a[argc]); + return u.b[1]; +} + +/* { dg-final { scan-assembler-times "return 42;" 1 "dom1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c new file mode 100644 index 0000000..1fc4c5e --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-7.c @@ -0,0 +1,19 @@ +/* Test normalization of MEM_REF expressions in dom. */ +/* { dg-do compile } */ +/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */ + +typedef struct { + int a[8]; +} foo; + +foo f; + +int +test () +{ + foo g = { { 1, 2, 3, 4, 5, 6, 7, 8 } }; + f=g; + return f.a[2]; +} + +/* { dg-final { scan-tree-dump-times "return 3;" 1 "optimized" } } */ diff --git a/gcc/tree-ssa-scopedtables.c b/gcc/tree-ssa-scopedtables.c index 6034f79..8df7125 100644 --- a/gcc/tree-ssa-scopedtables.c +++ b/gcc/tree-ssa-scopedtables.c @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. If not see #include "fold-const.h" #include "tree-eh.h" #include "internal-fn.h" +#include "tree-dfa.h" static bool hashable_expr_equal_p (const struct hashable_expr *, const struct hashable_expr *); @@ -209,11 +210,70 @@ avail_expr_hash (class expr_hash_elt *p) const struct hashable_expr *expr = p->expr (); inchash::hash hstate; + if (expr->kind == EXPR_SINGLE) + { + /* T could potentially be a switch index or a goto dest. */ + tree t = expr->ops.single.rhs; + if (TREE_CODE (t) == MEM_REF || TREE_CODE (t) == ARRAY_REF) + { + /* Make equivalent statements of both these kinds hash together. + Dealing with both MEM_REF and ARRAY_REF allows us not to care + about equivalence with other statements not considered here. */ + bool reverse; + HOST_WIDE_INT offset, size, max_size; + tree base = get_ref_base_and_extent (t, &offset, &size, &max_size, + &reverse); + /* Strictly, we could try to normalize variable-sized accesses too, + but here we just deal with the common case. */ + if (size == max_size) + { + enum tree_code code = MEM_REF; + hstate.add_object (code); + inchash::add_expr (base, hstate); + hstate.add_object (offset); + hstate.add_object (size); + return hstate.end (); + } + } + } + inchash::add_hashable_expr (expr, hstate); return hstate.end (); } +/* Compares trees T0 and T1 to see if they are MEM_REF or ARRAY_REFs equivalent + to each other. (That is, they return the value of the same bit of memory.) + + Return TRUE if the two are so equivalent; FALSE if not (which could still + mean the two are equivalent by other means). */ + +static bool +equal_mem_array_ref_p (tree t0, tree t1) +{ + if (TREE_CODE (t0) != MEM_REF && TREE_CODE (t0) != ARRAY_REF) + return false; + if (TREE_CODE (t1) != MEM_REF && TREE_CODE (t1) != ARRAY_REF) + return false; + + if (!types_compatible_p (TREE_TYPE (t0), TREE_TYPE (t1))) + return false; + bool rev0; + HOST_WIDE_INT off0, sz0, max0; + tree base0 = get_ref_base_and_extent (t0, &off0, &sz0, &max0, &rev0); + + bool rev1; + HOST_WIDE_INT off1, sz1, max1; + tree base1 = get_ref_base_and_extent (t1, &off1, &sz1, &max1, &rev1); + + /* Types were compatible, so these are sanity checks. */ + gcc_assert (sz0 == sz1); + gcc_assert (max0 == max1); + gcc_assert (rev0 == rev1); + + return (off0 == off1) && operand_equal_p (base0, base1, 0); +} + /* Compare two hashable_expr structures for equivalence. They are considered equivalent when the expressions they denote must necessarily be equal. The logic is intended to follow that of @@ -246,9 +306,10 @@ hashable_expr_equal_p (const struct hashable_expr *expr0, switch (expr0->kind) { case EXPR_SINGLE: - return operand_equal_p (expr0->ops.single.rhs, - expr1->ops.single.rhs, 0); - + return equal_mem_array_ref_p (expr0->ops.single.rhs, + expr1->ops.single.rhs) + || operand_equal_p (expr0->ops.single.rhs, + expr1->ops.single.rhs, 0); case EXPR_UNARY: if (expr0->ops.unary.op != expr1->ops.unary.op) return false;