From patchwork Thu Nov 12 18:35:30 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Lawrence X-Patchwork-Id: 543598 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 182521402D5 for ; Fri, 13 Nov 2015 05:36:10 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=L5FeYFbc; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :content-type:content-transfer-encoding; q=dns; s=default; b=YCX dbLde5gmrSi4jcE/B/oJBo3vvBgP/e8nIcsgCO293sAEXzxqtBZSSwZlzLWf9RLs siSOnQgi9eq4c/gdfWKfSNa7hBnAsgziYSqv70+qyV4nngrUAaV8KHS7ldhIvXd7 g8FSIkzT+9S/e17tPlE40E9oVBuVa13xn2SK/vPk= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :content-type:content-transfer-encoding; s=default; bh=FzGtbe1mn lQO20HAuiIuPQ74Fi4=; b=L5FeYFbcfVkfATK+kW7KuyqVX1F1/oXVSHyDF0O9O Vn/hlPszyjpedOzyxzIeM3jnKosAzTryI4FkKOTWQy1+2wANhLVLXuGyBU4xJH3y 5XH7sZnl+uPJjbOiB4kvFwJ+m6hkHX7lqVSNI5jjp1NqRvYcIshYECJ4lr57/k2o Ac= Received: (qmail 78844 invoked by alias); 12 Nov 2015 18:36:02 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 78827 invoked by uid 89); 12 Nov 2015 18:36:01 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 12 Nov 2015 18:35:59 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-30-Ado8dlqbQCm6qJrjorbi7g-1; Thu, 12 Nov 2015 18:35:53 +0000 Received: from arm.com ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Thu, 12 Nov 2015 18:35:53 +0000 From: Alan Lawrence To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, law@redhat.com Subject: Re: [PATCH 6/6] Make SRA replace constant-pool loads Date: Thu, 12 Nov 2015 18:35:30 +0000 Message-Id: <1447353330-7776-1-git-send-email-alan.lawrence@arm.com> In-Reply-To: <= References: <= X-MC-Unique: Ado8dlqbQCm6qJrjorbi7g-1 X-IsSubscribed: yes On 06/11/15 16:29, Richard Biener wrote: >>> 2) You should be able to use fold_ctor_reference directly (in place >> of >>> all your code >>> in case offset and size are readily available - don't remember >> exactly how >>> complete scalarization "walks" elements). Alternatively use >>> fold_const_aggregate_ref. Well, yes I was able to remove subst_constant_pool_initial by using fold_ctor_reference, with some fairly strict checks about the access expressions found while scanning the input. I still needed to either deep-scan the constant value itself, or allow completely_scalarize to fail, due to Ada c460010 where a variable has DECL_INITIAL: {VIEW_CONVERT_EXPR({1, 2, 3})} (that is, a CONSTRUCTOR whose only element is a V.C.E. of another CONSTRUCTOR). However: > I tried to suggest creating piecewise loads from the constant pool as opposed to the aggregate one. But I suppose the difficulty is to determine 'pieces', not their values (that followup passes and folding can handle quite efficiently). I've now got this working, and it simplifies the patch massively :). We now replace e.g. a constant-pool load a = *.LC0, with a series of loads e.g. SR_a1 = SRlc0_1 SR_a2 = SRlc0_2 etc. etc. (well those wouldn't be quite the names, but you get the idea.) And then at the start of the function we initialise SRlc0_1 = *.LC0[0] SRlc0_2 = *.LC0[1] etc. etc. - I needed to use SRlc0_1 etc. variables because some of the constant-pool loads coming from Ada are rather more complicated than 'a = *.LC0' and substituting the access expression (e.g. '*.LC0[0].foo'), rather than the replacement_decl, into those lead to just too many verify_gimple failures. However, the initialization seems to work well enough, if I put a check into sra_modify_expr to avoid changing 'SRlc0_1 = *.LC0[0]' into 'SRlc0_1 = SRlc0_1' (!). Slightly hacky perhaps but harmless as there cannot be a statement writing to a scalar replacement prior to sra_modify_expr. Also I had to prevent writing scalar replacements back to the constant pool (gnat opt31.adb). Finally - I've left the disqualified_constants code in, but am now only using it for an assert...so I guess it'd be reasonable to take that out. In an ideal world we could do those checks only in checking builds but allocating bitmaps only for checking builds feels like it would make checking vs non-checking diverge too much. This is now passing bootstrap+check-{gcc,g++,fortran} on AArch64, and the same plus Ada on x64_64 and ARM, except for some additional guality failures in pr54970-1.c on AArch64 (tests which are already failing on ARM) - I haven't had any success in figuring those out yet, suggestions welcome. However, without the DOM patches, this does not fix the oiginal ssa-dom-cse-2.c testcase, even though the load is scalarized in esra and the individual element loads resolved in ccp2. I don't think I'll be able to respin those between now and stage 1 ending on Saturday, so hence I've had to drop the testsuite change, and can only / will merely describe the remaining problem in PR/63679. I'm now benchmarking on AArch64 to see what difference this patch makes on its own. Thoughts? gcc/ChangeLog: PR target/63679 * tree-sra.c (disqualified_constants): New. (sra_initialize): Allocate disqualified_constants. (sra_deinitialize): Free disqualified_constants. (disqualify_candidate): Update disqualified_constants when appropriate. (create_access): Scan for constant-pool entries as we go along. (scalarizable_type_p): Add check against type_contains_placeholder_p. (maybe_add_sra_candidate): Allow constant-pool entries. (initialize_constant_pool_replacements): New. (sra_modify_expr): Avoid mangling assignments created by previous. (sra_modify_function_body): Call initialize_constant_pool_replacements. --- gcc/tree-sra.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 90 insertions(+), 3 deletions(-) diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 1d4a632..aa60f06 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -325,6 +325,10 @@ candidate (unsigned uid) those which cannot be (because they are and need be used as a whole). */ static bitmap should_scalarize_away_bitmap, cannot_scalarize_away_bitmap; +/* Bitmap of candidates in the constant pool, which cannot be scalarized + because this would produce non-constant expressions (e.g. Ada). */ +static bitmap disqualified_constants; + /* Obstack for creation of fancy names. */ static struct obstack name_obstack; @@ -647,6 +651,7 @@ sra_initialize (void) (vec_safe_length (cfun->local_decls) / 2); should_scalarize_away_bitmap = BITMAP_ALLOC (NULL); cannot_scalarize_away_bitmap = BITMAP_ALLOC (NULL); + disqualified_constants = BITMAP_ALLOC (NULL); gcc_obstack_init (&name_obstack); base_access_vec = new hash_map >; memset (&sra_stats, 0, sizeof (sra_stats)); @@ -665,6 +670,7 @@ sra_deinitialize (void) candidates = NULL; BITMAP_FREE (should_scalarize_away_bitmap); BITMAP_FREE (cannot_scalarize_away_bitmap); + BITMAP_FREE (disqualified_constants); access_pool.release (); assign_link_pool.release (); obstack_free (&name_obstack, NULL); @@ -679,6 +685,8 @@ disqualify_candidate (tree decl, const char *reason) { if (bitmap_clear_bit (candidate_bitmap, DECL_UID (decl))) candidates->remove_elt_with_hash (decl, DECL_UID (decl)); + if (TREE_CODE (decl) == VAR_DECL && DECL_IN_CONSTANT_POOL (decl)) + bitmap_set_bit (disqualified_constants, DECL_UID (decl)); if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -830,8 +838,11 @@ create_access_1 (tree base, HOST_WIDE_INT offset, HOST_WIDE_INT size) return access; } +static bool maybe_add_sra_candidate (tree); + /* Create and insert access for EXPR. Return created access, or NULL if it is - not possible. */ + not possible. Also scan for uses of constant pool as we go along and add + to candidates. */ static struct access * create_access (tree expr, gimple *stmt, bool write) @@ -854,6 +865,25 @@ create_access (tree expr, gimple *stmt, bool write) else ptr = false; + /* For constant-pool entries, check we can substitute the constant value. */ + if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base) + && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA)) + { + gcc_assert (!bitmap_bit_p (disqualified_constants, DECL_UID (base))); + if (expr != base + && !is_gimple_reg_type (TREE_TYPE (expr)) + && dump_file && (dump_flags & TDF_DETAILS)) + { + /* This occurs in Ada with accesses to ARRAY_RANGE_REFs, + and elements of multidimensional arrays (which are + multi-element arrays in their own right). */ + fprintf (dump_file, "Allowing non-reg-type load of part" + " of constant-pool entry: "); + print_generic_expr (dump_file, expr, 0); + } + maybe_add_sra_candidate (base); + } + if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) return NULL; @@ -912,6 +942,8 @@ static bool scalarizable_type_p (tree type) { gcc_assert (!is_gimple_reg_type (type)); + if (type_contains_placeholder_p (type)) + return false; switch (TREE_CODE (type)) { @@ -1832,7 +1864,12 @@ maybe_add_sra_candidate (tree var) reject (var, "not aggregate"); return false; } - if (needs_to_live_in_memory (var)) + /* Allow constant-pool entries (that "need to live in memory") + unless we are doing IPA SRA. */ + if (needs_to_live_in_memory (var) + && (sra_mode == SRA_MODE_EARLY_IPA + || TREE_CODE (var) != VAR_DECL + || !DECL_IN_CONSTANT_POOL (var))) { reject (var, "needs to live in memory"); return false; @@ -3250,6 +3287,9 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) racc = get_access_for_expr (rhs); if (!lacc && !racc) return SRA_AM_NONE; + /* Avoid modifying initializations of constant-pool replacements. */ + if (racc && (racc->replacement_decl == lhs)) + return SRA_AM_NONE; loc = gimple_location (stmt); if (lacc && lacc->grp_to_be_replaced) @@ -3366,7 +3406,10 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) || contains_vce_or_bfcref_p (lhs) || stmt_ends_bb_p (stmt)) { - if (access_has_children_p (racc)) + /* No need to copy into a constant-pool, it comes pre-initialized. */ + if (access_has_children_p (racc) + && (TREE_CODE (racc->base) != VAR_DECL + || !DECL_IN_CONSTANT_POOL (racc->base))) generate_subtree_copies (racc->first_child, rhs, racc->offset, 0, 0, gsi, false, false, loc); if (access_has_children_p (lacc)) @@ -3469,6 +3512,48 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) } } +static void +initialize_constant_pool_replacements (void) +{ + gimple_seq seq = NULL; + gimple_stmt_iterator gsi = gsi_start (seq); + bitmap_iterator bi; + unsigned i; + + EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) + if (bitmap_bit_p (should_scalarize_away_bitmap, i) + && !bitmap_bit_p (cannot_scalarize_away_bitmap, i)) + { + tree var = candidate (i); + if (TREE_CODE (var) != VAR_DECL || !DECL_IN_CONSTANT_POOL (var)) + continue; + vec *access_vec = get_base_access_vector (var); + if (!access_vec) + continue; + for (unsigned i = 0; i < access_vec->length (); i++) + { + struct access *access = (*access_vec)[i]; + if (!access->replacement_decl) + continue; + gassign *stmt = gimple_build_assign ( + get_access_replacement (access), unshare_expr (access->expr)); + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Generating constant initializer: "); + print_gimple_stmt (dump_file, stmt, 0, 1); + fprintf (dump_file, "\n"); + } + gsi_insert_after (&gsi, stmt, GSI_NEW_STMT); + update_stmt (stmt); + } + } + + seq = gsi_seq (gsi); + if (seq) + gsi_insert_seq_on_edge_immediate ( + single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)), seq); +} + /* Traverse the function body and all modifications as decided in analyze_all_variable_accesses. Return true iff the CFG has been changed. */ @@ -3479,6 +3564,8 @@ sra_modify_function_body (void) bool cfg_changed = false; basic_block bb; + initialize_constant_pool_replacements (); + FOR_EACH_BB_FN (bb, cfun) { gimple_stmt_iterator gsi = gsi_start_bb (bb);