From patchwork Thu Nov 19 09:35:48 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 546368 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DCCC14149A for ; Thu, 19 Nov 2015 20:36:55 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=MXDctcED; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=S93wnB9tHzK4S2KK2 b7BaCbzr7T013M6l6X2NPC+Dd6MyjJCcPOOnV/vpbOgRqK8f5hqPMveSpdCly2T6 GHxXbhDuW9Hbv9Ei9PaYg0+OmhcQy3U9beeSjWLH2Knsp/4FZ54bFXzdFfVApLub dZUFQ37ij/Glyn0FIg6ykrS+HU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=rodPZe9nxd3oVbtJ3uf6Mtq xfko=; b=MXDctcEDboE/y8zLNEqS+6o9gnJ9RkfhA/WXjWQ6Y6cXqp6rW9AxH1A qfomYgTtZI4/f9RkmQObaOQVtIpZeyMevq7M7+2AWS34Aa7HTioo6s06F4qSDsYu OgfwENdG3c5ImA1ZCmW3YNzZ348edGg7QnqawWo4qtCJQ8ivyBRo= Received: (qmail 47334 invoked by alias); 19 Nov 2015 09:36:47 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 47316 invoked by uid 89); 19 Nov 2015 09:36:44 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 X-HELO: fencepost.gnu.org Received: from fencepost.gnu.org (HELO fencepost.gnu.org) (208.118.235.10) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 19 Nov 2015 09:36:41 +0000 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41650) by fencepost.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1ZzLdv-0001hM-4W for gcc-patches@gnu.org; Thu, 19 Nov 2015 04:36:39 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZzLdp-0007yC-MZ for gcc-patches@gnu.org; Thu, 19 Nov 2015 04:36:38 -0500 Received: from relay1.mentorg.com ([192.94.38.131]:42516) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZzLdp-0007y7-Dx for gcc-patches@gnu.org; Thu, 19 Nov 2015 04:36:33 -0500 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-01.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1ZzLdm-00010Q-Dv from Tom_deVries@mentor.com ; Thu, 19 Nov 2015 01:36:30 -0800 Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.3.224.2; Thu, 19 Nov 2015 09:36:28 +0000 Subject: Re: [PATCH, PR68373 ] Call scev_const_prop in pass_parallelize_loops::execute To: Richard Biener References: <5640BD31.2060602@mentor.com> <5640FB07.6010008@mentor.com> <5649C41A.40403@mentor.com> <564A64B3.7080305@mentor.com> <564B3F69.50600@mentor.com> <564B49F6.308@mentor.com> <564BA844.1010103@mentor.com> CC: Richard Biener , "gcc-patches@gnu.org" , Jakub Jelinek From: Tom de Vries Message-ID: <564D97F4.1020507@mentor.com> Date: Thu, 19 Nov 2015 10:35:48 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <564BA844.1010103@mentor.com> X-detected-operating-system: by eggs.gnu.org: Windows NT kernel [generic] [fuzzy] X-Received-From: 192.94.38.131 On 17/11/15 23:20, Tom de Vries wrote: > [ was: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def ] > > Hi, > > Consider test-case test.c, with a use of the final value of the > iteration variable (return i): > ... > unsigned int > foo (int *a, unsigned int n) > { > unsigned int i; > for (i = 0; i < n; ++i) > a[i] = 1; > > return i; > } > ... > > Compiled with: > ... > $ gcc -S -O2 test.c -ftree-parallelize-loops=2 -fdump-tree-all-details > ... > > Before parloops, we have: > ... > : > # i_12 = PHI <0(3), i_10(5)> > _5 = (long unsigned int) i_12; > _6 = _5 * 4; > _8 = a_7(D) + _6; > *_8 = 1; > i_10 = i_12 + 1; > if (n_4(D) > i_10) > goto ; > else > goto ; > > : > goto ; > > : > # i_14 = PHI > ... > > Parloops will fail because: > ... > phi is n_2 = PHI > arg of phi to exit: value n_4(D) used outside loop > checking if it a part of reduction pattern: > FAILED: it is not a part of reduction.... > ... > [ note that the phi looks slightly different. In > gather_scalar_reductions -> vect_analyze_loop_form -> > vect_analyze_loop_form_1 -> split_loop_exit_edge we split the edge from > bb4 to bb6. ] > > This patch uses scev_const_prop at the start of parloops. > scev_const_prop first also splits the exit edge, and then replaces the > phi with a assignment: > ... > final value replacement: > n_2 = PHI > with > n_2 = n_4(D); > ... > > This allows parloops to succeed. > > And there's a similar story when we compile with -fno-tree-scev-cprop in > addition. > > Bootstrapped and reg-tested on x86_64. > > OK for stage3/stage1? The patch has been updated to do the final value replacement only for the loop that parloops is processing, as suggested in review comment at https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02166.html . That means the patch is now also required for the kernels patch series. Bootstrapped and reg-tested on x86_64. OK for stage 3 trunk? Thanks, - Tom Do final value replacement in try_create_reduction_list 2015-11-18 Tom de Vries * tree-scalar-evolution.c (final_value_replacement_loop): Factor out of ... (scev_const_prop): ... here. * tree-scalar-evolution.h (final_value_replacement_loop): Declare. * tree-parloops.c (try_create_reduction_list): Call final_value_replacement_loop. * gcc.dg/autopar/pr68373.c: New test. --- gcc/testsuite/gcc.dg/autopar/pr68373.c | 14 ++ gcc/tree-parloops.c | 3 + gcc/tree-scalar-evolution.c | 248 +++++++++++++++++---------------- gcc/tree-scalar-evolution.h | 1 + 4 files changed, 145 insertions(+), 121 deletions(-) diff --git a/gcc/testsuite/gcc.dg/autopar/pr68373.c b/gcc/testsuite/gcc.dg/autopar/pr68373.c new file mode 100644 index 0000000..8e0f8a5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/autopar/pr68373.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-parallelize-loops=2 -fdump-tree-parloops-details" } */ + +unsigned int +foo (int *a, unsigned int n) +{ + unsigned int i; + for (i = 0; i < n; ++i) + a[i] = 1; + + return i; +} + +/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops" } } */ diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 17415a8..8d7912d 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2539,6 +2539,9 @@ try_create_reduction_list (loop_p loop, gcc_assert (exit); + /* Try to get rid of exit phis. */ + final_value_replacement_loop (loop); + gather_scalar_reductions (loop, reduction_list); diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c index 27630f0..9b33693 100644 --- a/gcc/tree-scalar-evolution.c +++ b/gcc/tree-scalar-evolution.c @@ -3417,6 +3417,131 @@ expression_expensive_p (tree expr) } } +/* Do final value replacement for LOOP. */ + +void +final_value_replacement_loop (struct loop *loop) +{ + /* If we do not know exact number of iterations of the loop, we cannot + replace the final value. */ + edge exit = single_exit (loop); + if (!exit) + return; + + tree niter = number_of_latch_executions (loop); + if (niter == chrec_dont_know) + return; + + /* Ensure that it is possible to insert new statements somewhere. */ + if (!single_pred_p (exit->dest)) + split_loop_exit_edge (exit); + + /* Set stmt insertion pointer. All stmts are inserted before this point. */ + gimple_stmt_iterator gsi = gsi_after_labels (exit->dest); + + struct loop *ex_loop + = superloop_at_depth (loop, + loop_depth (exit->dest->loop_father) + 1); + + gphi_iterator psi; + for (psi = gsi_start_phis (exit->dest); !gsi_end_p (psi); ) + { + gphi *phi = psi.phi (); + tree rslt = PHI_RESULT (phi); + tree def = PHI_ARG_DEF_FROM_EDGE (phi, exit); + if (virtual_operand_p (def)) + { + gsi_next (&psi); + continue; + } + + if (!POINTER_TYPE_P (TREE_TYPE (def)) + && !INTEGRAL_TYPE_P (TREE_TYPE (def))) + { + gsi_next (&psi); + continue; + } + + bool folded_casts; + def = analyze_scalar_evolution_in_loop (ex_loop, loop, def, + &folded_casts); + def = compute_overall_effect_of_inner_loop (ex_loop, def); + if (!tree_does_not_contain_chrecs (def) + || chrec_contains_symbols_defined_in_loop (def, ex_loop->num) + /* Moving the computation from the loop may prolong life range + of some ssa names, which may cause problems if they appear + on abnormal edges. */ + || contains_abnormal_ssa_name_p (def) + /* Do not emit expensive expressions. The rationale is that + when someone writes a code like + + while (n > 45) n -= 45; + + he probably knows that n is not large, and does not want it + to be turned into n %= 45. */ + || expression_expensive_p (def)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "not replacing:\n "); + print_gimple_stmt (dump_file, phi, 0, 0); + fprintf (dump_file, "\n"); + } + gsi_next (&psi); + continue; + } + + /* Eliminate the PHI node and replace it by a computation outside + the loop. */ + if (dump_file) + { + fprintf (dump_file, "\nfinal value replacement:\n "); + print_gimple_stmt (dump_file, phi, 0, 0); + fprintf (dump_file, " with\n "); + } + def = unshare_expr (def); + remove_phi_node (&psi, false); + + /* If def's type has undefined overflow and there were folded + casts, rewrite all stmts added for def into arithmetics + with defined overflow behavior. */ + if (folded_casts && ANY_INTEGRAL_TYPE_P (TREE_TYPE (def)) + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (def))) + { + gimple_seq stmts; + gimple_stmt_iterator gsi2; + def = force_gimple_operand (def, &stmts, true, NULL_TREE); + gsi2 = gsi_start (stmts); + while (!gsi_end_p (gsi2)) + { + gimple *stmt = gsi_stmt (gsi2); + gimple_stmt_iterator gsi3 = gsi2; + gsi_next (&gsi2); + gsi_remove (&gsi3, false); + if (is_gimple_assign (stmt) + && arith_code_with_undefined_signed_overflow + (gimple_assign_rhs_code (stmt))) + gsi_insert_seq_before (&gsi, + rewrite_to_defined_overflow (stmt), + GSI_SAME_STMT); + else + gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); + } + } + else + def = force_gimple_operand_gsi (&gsi, def, false, NULL_TREE, + true, GSI_SAME_STMT); + + gassign *ass = gimple_build_assign (rslt, def); + gsi_insert_before (&gsi, ass, GSI_SAME_STMT); + if (dump_file) + { + print_gimple_stmt (dump_file, ass, 0, 0); + fprintf (dump_file, "\n"); + } + } +} + /* Replace ssa names for that scev can prove they are constant by the appropriate constants. Also perform final value replacement in loops, in case the replacement expressions are cheap. @@ -3430,8 +3555,7 @@ scev_const_prop (void) basic_block bb; tree name, type, ev; gphi *phi; - gassign *ass; - struct loop *loop, *ex_loop; + struct loop *loop; bitmap ssa_names_to_remove = NULL; unsigned i; gphi_iterator psi; @@ -3507,126 +3631,8 @@ scev_const_prop (void) /* Now the regular final value replacement. */ FOR_EACH_LOOP (loop, LI_FROM_INNERMOST) - { - edge exit; - tree def, rslt, niter; - gimple_stmt_iterator gsi; - - /* If we do not know exact number of iterations of the loop, we cannot - replace the final value. */ - exit = single_exit (loop); - if (!exit) - continue; - - niter = number_of_latch_executions (loop); - if (niter == chrec_dont_know) - continue; - - /* Ensure that it is possible to insert new statements somewhere. */ - if (!single_pred_p (exit->dest)) - split_loop_exit_edge (exit); - gsi = gsi_after_labels (exit->dest); + final_value_replacement_loop (loop); - ex_loop = superloop_at_depth (loop, - loop_depth (exit->dest->loop_father) + 1); - - for (psi = gsi_start_phis (exit->dest); !gsi_end_p (psi); ) - { - phi = psi.phi (); - rslt = PHI_RESULT (phi); - def = PHI_ARG_DEF_FROM_EDGE (phi, exit); - if (virtual_operand_p (def)) - { - gsi_next (&psi); - continue; - } - - if (!POINTER_TYPE_P (TREE_TYPE (def)) - && !INTEGRAL_TYPE_P (TREE_TYPE (def))) - { - gsi_next (&psi); - continue; - } - - bool folded_casts; - def = analyze_scalar_evolution_in_loop (ex_loop, loop, def, - &folded_casts); - def = compute_overall_effect_of_inner_loop (ex_loop, def); - if (!tree_does_not_contain_chrecs (def) - || chrec_contains_symbols_defined_in_loop (def, ex_loop->num) - /* Moving the computation from the loop may prolong life range - of some ssa names, which may cause problems if they appear - on abnormal edges. */ - || contains_abnormal_ssa_name_p (def) - /* Do not emit expensive expressions. The rationale is that - when someone writes a code like - - while (n > 45) n -= 45; - - he probably knows that n is not large, and does not want it - to be turned into n %= 45. */ - || expression_expensive_p (def)) - { - if (dump_file && (dump_flags & TDF_DETAILS)) - { - fprintf (dump_file, "not replacing:\n "); - print_gimple_stmt (dump_file, phi, 0, 0); - fprintf (dump_file, "\n"); - } - gsi_next (&psi); - continue; - } - - /* Eliminate the PHI node and replace it by a computation outside - the loop. */ - if (dump_file) - { - fprintf (dump_file, "\nfinal value replacement:\n "); - print_gimple_stmt (dump_file, phi, 0, 0); - fprintf (dump_file, " with\n "); - } - def = unshare_expr (def); - remove_phi_node (&psi, false); - - /* If def's type has undefined overflow and there were folded - casts, rewrite all stmts added for def into arithmetics - with defined overflow behavior. */ - if (folded_casts && ANY_INTEGRAL_TYPE_P (TREE_TYPE (def)) - && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (def))) - { - gimple_seq stmts; - gimple_stmt_iterator gsi2; - def = force_gimple_operand (def, &stmts, true, NULL_TREE); - gsi2 = gsi_start (stmts); - while (!gsi_end_p (gsi2)) - { - gimple *stmt = gsi_stmt (gsi2); - gimple_stmt_iterator gsi3 = gsi2; - gsi_next (&gsi2); - gsi_remove (&gsi3, false); - if (is_gimple_assign (stmt) - && arith_code_with_undefined_signed_overflow - (gimple_assign_rhs_code (stmt))) - gsi_insert_seq_before (&gsi, - rewrite_to_defined_overflow (stmt), - GSI_SAME_STMT); - else - gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); - } - } - else - def = force_gimple_operand_gsi (&gsi, def, false, NULL_TREE, - true, GSI_SAME_STMT); - - ass = gimple_build_assign (rslt, def); - gsi_insert_before (&gsi, ass, GSI_SAME_STMT); - if (dump_file) - { - print_gimple_stmt (dump_file, ass, 0, 0); - fprintf (dump_file, "\n"); - } - } - } return 0; } diff --git a/gcc/tree-scalar-evolution.h b/gcc/tree-scalar-evolution.h index 6d31280..29c7cd4 100644 --- a/gcc/tree-scalar-evolution.h +++ b/gcc/tree-scalar-evolution.h @@ -33,6 +33,7 @@ extern tree analyze_scalar_evolution (struct loop *, tree); extern tree instantiate_scev (basic_block, struct loop *, tree); extern tree resolve_mixers (struct loop *, tree, bool *); extern void gather_stats_on_scev_database (void); +extern void final_value_replacement_loop (struct loop *); extern unsigned int scev_const_prop (void); extern bool expression_expensive_p (tree); extern bool simple_iv (struct loop *, struct loop *, tree, struct affine_iv *,