From patchwork Mon Dec 21 13:13:25 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Lawrence X-Patchwork-Id: 559546 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 96A9C140313 for ; Tue, 22 Dec 2015 00:14:27 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=OjKlpuH+; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; q=dns; s= default; b=kTI9l4QoD6Mu+7jxdcXvIhWSGlvl3nh2NPe7XFBbpgc6KeGpgf+mG ldgKumpBzPWtKkLrOF865OSo8Rbxiq6VQUF3t4sHe/ut3+zpodGvri+a63EiVzj9 kUJxnpPKQDkNBp+lhBuQ1quDb9GYvnGzFbyUHZgGfOH1SjIWj/S0lc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; s= default; bh=IRhKpm0FGrs2+QYLiYz271YNmqY=; b=OjKlpuH+H8vQLng3Niic PN1kk5oZoBR36gYTs0vbmt/3+HSjQqdlidGCYx+N+vm02bVy6ol3vZv2vlgp8Rfa O1HnuKXdjcxarSOStSwtFOpIQyNcEdz9PR6kIkf94rmbetccwEOkX43oQpZxcxjJ CgIqhUGLiaCzWuqaCa1bUZI= Received: (qmail 78296 invoked by alias); 21 Dec 2015 13:14:06 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 78210 invoked by uid 89); 21 Dec 2015 13:14:06 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.4 required=5.0 tests=AWL, BAYES_50, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=sk:access_, sk:array_r, sk:disqual, sk:generat X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 21 Dec 2015 13:14:03 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1B2083A8; Mon, 21 Dec 2015 05:13:36 -0800 (PST) Received: from arm.com (e105915-lin.emea.arm.com [10.2.206.30]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 5D5D03F21A; Mon, 21 Dec 2015 05:14:01 -0800 (PST) From: Alan Lawrence To: gcc-patches@gcc.gnu.org Cc: james.greenhalgh@arm.com, marcus.shawcroft@arm.com, richard.earnshaw@arm.com Subject: [PATCH 1/4] Make SRA scalarize constant-pool loads Date: Mon, 21 Dec 2015 13:13:25 +0000 Message-Id: <1450703608-8617-2-git-send-email-alan.lawrence@arm.com> In-Reply-To: <1450703608-8617-1-git-send-email-alan.lawrence@arm.com> References: <1450703608-8617-1-git-send-email-alan.lawrence@arm.com> X-IsSubscribed: yes This is the same as the version conditionally approved near the end of stage 1 https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01948.html, plus the requested comment and a couple of testcases for platforms where the constants are pushed into the constant pool. (I didn't commit this at the time because without the rest of the series, benchmarking results didn't really show any improvement, and the PR was not fixed.) I've left the disqualified_constants check in, although I could be persuaded to remove it if people felt strongly. Also on AArch64, this still causes FAILs in guality pr54970.c: FAIL: gcc.dg/guality/pr54970.c -Os line 15 a[0] == 1 FAIL: gcc.dg/guality/pr54970.c -Os line 20 a[0] == 1 FAIL: gcc.dg/guality/pr54970.c -Os line 25 a[0] == 1 and I think I must ask the AArch64 maintainers to take these for the time being. Bootstrapped + check-gcc,g++,ada on x86_64, ARM, AArch64. gcc/ChangeLog: PR target/63679 * tree-sra.c (disqualified_constants): New. (sra_initialize): Allocate disqualified_constants. (sra_deinitialize): Free disqualified_constants. (disqualify_candidate): Update disqualified_constants when appropriate. (create_access): Scan for constant-pool entries as we go along. (scalarizable_type_p): Add check against type_contains_placeholder_p. (maybe_add_sra_candidate): Allow constant-pool entries. (initialize_constant_pool_replacements): New. (sra_modify_expr): Avoid mangling assignments created by previous. (sra_modify_function_body): Call initialize_constant_pool_replacements. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/sra-17.c: New. * gcc.dg/tree-ssa/sra-18.c: New. --- gcc/testsuite/gcc.dg/tree-ssa/sra-17.c | 19 +++++++ gcc/testsuite/gcc.dg/tree-ssa/sra-18.c | 28 ++++++++++ gcc/tree-sra.c | 99 ++++++++++++++++++++++++++++++++-- 3 files changed, 143 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-17.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-18.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-17.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-17.c new file mode 100644 index 0000000..cff868a --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-17.c @@ -0,0 +1,19 @@ +/* { dg-do run { target { aarch64*-*-* alpha*-*-* arm*-*-* hppa*-*-* powerpc*-*-* s390*-*-* } } } */ +/* { dg-options "-O2 -fdump-tree-esra --param sra-max-scalarization-size-Ospeed=16" } */ + +extern void abort (void); + +int +main (int argc, char **argv) +{ + int a[4] = { 7, 19, 11, 255 }; + int tot = 0; + for (int i = 0; i < 4; i++) + tot = (tot*256) + a[i]; + if (tot == 0x07130bff) + return 0; + abort (); +} + +/* { dg-final { scan-tree-dump-times "Removing load: a = \\\*.LC0;" 1 "esra" } } */ +/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ = \\\*.LC0\\\[" 4 "esra" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-18.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-18.c new file mode 100644 index 0000000..6c0d52b --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-18.c @@ -0,0 +1,28 @@ +/* { dg-do run { target { aarch64*-*-* alpha*-*-* arm*-*-* hppa*-*-* powerpc*-*-* s390*-*-* } } } */ +/* { dg-options "-O2 -fdump-tree-esra --param sra-max-scalarization-size-Ospeed=16" } */ + +extern void abort (void); +struct foo { int x; }; + +struct bar { struct foo f[2]; }; + +struct baz { struct bar b[2]; }; + +int +main (int argc, char **argv) +{ + struct baz a = { { { { { 4 }, { 5 } } }, { { { 6 }, { 7 } } } } }; + int tot = 0; + for (int i = 0; i < 2; i++) + for (int j = 0; j < 2; j++) + tot = (tot*256) + a.b[i].f[j].x; + if (tot == 0x04050607) + return 0; + abort (); +} + +/* { dg-final { scan-tree-dump-times "Removing load: a = \\\*.LC0;" 1 "esra" } } */ +/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ = \\\*.LC0\\.b\\\[0\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */ +/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ = \\\*.LC0\\.b\\\[0\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */ +/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ = \\\*.LC0\\.b\\\[1\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */ +/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ = \\\*.LC0\\.b\\\[1\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */ diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index c4fea5b..4ea8550 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -328,6 +328,10 @@ candidate (unsigned uid) those which cannot be (because they are and need be used as a whole). */ static bitmap should_scalarize_away_bitmap, cannot_scalarize_away_bitmap; +/* Bitmap of candidates in the constant pool, which cannot be scalarized + because this would produce non-constant expressions (e.g. Ada). */ +static bitmap disqualified_constants; + /* Obstack for creation of fancy names. */ static struct obstack name_obstack; @@ -652,6 +656,7 @@ sra_initialize (void) (vec_safe_length (cfun->local_decls) / 2); should_scalarize_away_bitmap = BITMAP_ALLOC (NULL); cannot_scalarize_away_bitmap = BITMAP_ALLOC (NULL); + disqualified_constants = BITMAP_ALLOC (NULL); gcc_obstack_init (&name_obstack); base_access_vec = new hash_map >; memset (&sra_stats, 0, sizeof (sra_stats)); @@ -670,6 +675,7 @@ sra_deinitialize (void) candidates = NULL; BITMAP_FREE (should_scalarize_away_bitmap); BITMAP_FREE (cannot_scalarize_away_bitmap); + BITMAP_FREE (disqualified_constants); access_pool.release (); assign_link_pool.release (); obstack_free (&name_obstack, NULL); @@ -684,6 +690,8 @@ disqualify_candidate (tree decl, const char *reason) { if (bitmap_clear_bit (candidate_bitmap, DECL_UID (decl))) candidates->remove_elt_with_hash (decl, DECL_UID (decl)); + if (TREE_CODE (decl) == VAR_DECL && DECL_IN_CONSTANT_POOL (decl)) + bitmap_set_bit (disqualified_constants, DECL_UID (decl)); if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -835,8 +843,11 @@ create_access_1 (tree base, HOST_WIDE_INT offset, HOST_WIDE_INT size) return access; } +static bool maybe_add_sra_candidate (tree); + /* Create and insert access for EXPR. Return created access, or NULL if it is - not possible. */ + not possible. Also scan for uses of constant pool as we go along and add + to candidates. */ static struct access * create_access (tree expr, gimple *stmt, bool write) @@ -859,6 +870,25 @@ create_access (tree expr, gimple *stmt, bool write) else ptr = false; + /* For constant-pool entries, check we can substitute the constant value. */ + if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base) + && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA)) + { + gcc_assert (!bitmap_bit_p (disqualified_constants, DECL_UID (base))); + if (expr != base + && !is_gimple_reg_type (TREE_TYPE (expr)) + && dump_file && (dump_flags & TDF_DETAILS)) + { + /* This occurs in Ada with accesses to ARRAY_RANGE_REFs, + and elements of multidimensional arrays (which are + multi-element arrays in their own right). */ + fprintf (dump_file, "Allowing non-reg-type load of part" + " of constant-pool entry: "); + print_generic_expr (dump_file, expr, 0); + } + maybe_add_sra_candidate (base); + } + if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) return NULL; @@ -918,6 +948,8 @@ static bool scalarizable_type_p (tree type) { gcc_assert (!is_gimple_reg_type (type)); + if (type_contains_placeholder_p (type)) + return false; switch (TREE_CODE (type)) { @@ -1852,7 +1884,12 @@ maybe_add_sra_candidate (tree var) reject (var, "not aggregate"); return false; } - if (needs_to_live_in_memory (var)) + /* Allow constant-pool entries (that "need to live in memory") + unless we are doing IPA SRA. */ + if (needs_to_live_in_memory (var) + && (sra_mode == SRA_MODE_EARLY_IPA + || TREE_CODE (var) != VAR_DECL + || !DECL_IN_CONSTANT_POOL (var))) { reject (var, "needs to live in memory"); return false; @@ -3272,6 +3309,9 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) racc = get_access_for_expr (rhs); if (!lacc && !racc) return SRA_AM_NONE; + /* Avoid modifying initializations of constant-pool replacements. */ + if (racc && (racc->replacement_decl == lhs)) + return SRA_AM_NONE; loc = gimple_location (stmt); if (lacc && lacc->grp_to_be_replaced) @@ -3388,7 +3428,10 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) || contains_vce_or_bfcref_p (lhs) || stmt_ends_bb_p (stmt)) { - if (access_has_children_p (racc)) + /* No need to copy into a constant-pool, it comes pre-initialized. */ + if (access_has_children_p (racc) + && (TREE_CODE (racc->base) != VAR_DECL + || !DECL_IN_CONSTANT_POOL (racc->base))) generate_subtree_copies (racc->first_child, rhs, racc->offset, 0, 0, gsi, false, false, loc); if (access_has_children_p (lacc)) @@ -3491,6 +3534,54 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) } } +/* Set any scalar replacements of values in the constant pool to the initial + value of the constant. (Constant-pool decls like *.LC0 have effectively + been initialized before the program starts, we must do the same for their + replacements.) Thus, we output statements like 'SR.1 = *.LC0[0];' into + the function's entry block. */ + +static void +initialize_constant_pool_replacements (void) +{ + gimple_seq seq = NULL; + gimple_stmt_iterator gsi = gsi_start (seq); + bitmap_iterator bi; + unsigned i; + + EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) + if (bitmap_bit_p (should_scalarize_away_bitmap, i) + && !bitmap_bit_p (cannot_scalarize_away_bitmap, i)) + { + tree var = candidate (i); + if (TREE_CODE (var) != VAR_DECL || !DECL_IN_CONSTANT_POOL (var)) + continue; + vec *access_vec = get_base_access_vector (var); + if (!access_vec) + continue; + for (unsigned i = 0; i < access_vec->length (); i++) + { + struct access *access = (*access_vec)[i]; + if (!access->replacement_decl) + continue; + gassign *stmt = gimple_build_assign ( + get_access_replacement (access), unshare_expr (access->expr)); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Generating constant initializer: "); + print_gimple_stmt (dump_file, stmt, 0, 1); + fprintf (dump_file, "\n"); + } + gsi_insert_after (&gsi, stmt, GSI_NEW_STMT); + update_stmt (stmt); + } + } + + seq = gsi_seq (gsi); + if (seq) + gsi_insert_seq_on_edge_immediate ( + single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)), seq); +} + /* Traverse the function body and all modifications as decided in analyze_all_variable_accesses. Return true iff the CFG has been changed. */ @@ -3501,6 +3592,8 @@ sra_modify_function_body (void) bool cfg_changed = false; basic_block bb; + initialize_constant_pool_replacements (); + FOR_EACH_BB_FN (bb, cfun) { gimple_stmt_iterator gsi = gsi_start_bb (bb);