From patchwork Mon Nov 30 12:11:24 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 549972 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 6D1711401F6 for ; Mon, 30 Nov 2015 23:12:47 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=c7NdEdQ4; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=MX77kEgsnsXemFmgD OXz62xtyDIHPwJl2r9OGQE7FeDa6R9XP2pPSVCc6CFEWJJEoCyYwbN3eBWQ4W6yD OX0cKegDDlsMYg7PH+mHPoBPzJ2miF6rNlOcg9VOIR1COLZt32aJ9Zj9poGLt1M9 yFl85+zsZ66man/gEfeAGID+tc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=RfMcnjVnrlbDJLOYckZi7CK 5wbM=; b=c7NdEdQ4ygDkdAQBSPa2E03xSU+JQmVG56kaVaFuUi68y8hyeqhcuKm +nIbXy6NBN0Vb/KS7zQ5Rr9KcUr7zhu57FGDBGV6FxhSP+9b83DRcapqgpJfaIYk 6I9lm5A7m6LBqpGKvzGydgmKSu8wxOLSbK16jxxsrfyh6nJ3V4GQ= Received: (qmail 57644 invoked by alias); 30 Nov 2015 12:12:38 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 57627 invoked by uid 89); 30 Nov 2015 12:12:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL, BAYES_00, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: fencepost.gnu.org Received: from fencepost.gnu.org (HELO fencepost.gnu.org) (208.118.235.10) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Mon, 30 Nov 2015 12:12:31 +0000 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53168) by fencepost.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1a3NJl-0000cI-AR for gcc-patches@gnu.org; Mon, 30 Nov 2015 07:12:29 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a3NJi-0003N3-Dr for gcc-patches@gnu.org; Mon, 30 Nov 2015 07:12:29 -0500 Received: from relay1.mentorg.com ([192.94.38.131]:49315) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a3NJi-0003Mm-5x for gcc-patches@gnu.org; Mon, 30 Nov 2015 07:12:26 -0500 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-01.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1a3NJh-0002nM-6n from Tom_deVries@mentor.com ; Mon, 30 Nov 2015 04:12:25 -0800 Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.3.224.2; Mon, 30 Nov 2015 12:12:23 +0000 Subject: Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta To: Richard Biener References: <565C0F47.5020604@mentor.com> CC: Jakub Jelinek , "gcc-patches@gnu.org" From: Tom de Vries Message-ID: <565C3CEC.9040209@mentor.com> Date: Mon, 30 Nov 2015 13:11:24 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: X-detected-operating-system: by eggs.gnu.org: Windows NT kernel [generic] [fuzzy] X-Received-From: 192.94.38.131 On 30/11/15 10:16, Richard Biener wrote: > On Mon, 30 Nov 2015, Tom de Vries wrote: > >> Hi, >> >> this patch fixes PR46032. >> >> It handles a call: >> ... >> __builtin_GOMP_parallel (fn, data, num_threads, flags) >> ... >> as: >> ... >> fn (data) >> ... >> in ipa-pta. >> >> This improves ipa-pta alias analysis in the parallelized function fn, and >> allows vectorization in the testcase without a runtime alias test. >> >> Bootstrapped and reg-tested on x86_64. >> >> OK for stage3 trunk? > > + /* Assign the passed argument to the appropriate incoming > + parameter of the function. */ > + struct constraint_expr lhs ; > + lhs = get_function_part_constraint (fi, fi_parm_base + 0); > + auto_vec rhsc; > + struct constraint_expr *rhsp; > + get_constraint_for_rhs (arg, &rhsc); > + while (rhsc.length () != 0) > + { > + rhsp = &rhsc.last (); > + process_constraint (new_constraint (lhs, *rhsp)); > + rhsc.pop (); > + } > > please use style used elsewhere with > > FOR_EACH_VEC_ELT (rhsc, j, rhsp) > process_constraint (new_constraint (lhs, *rhsp)); > rhsc.truncate (0); > That code was copied from find_func_aliases_for_call. I've factored out the bit that I copied as find_func_aliases_for_call_arg, and fixed the style there (and dropped 'rhsc.truncate (0)' since AFAIU it's redundant at the end of a function). > + /* Parameter passed by value is used. */ > + lhs = get_function_part_constraint (fi, fi_uses); > + struct constraint_expr *rhsp; > + get_constraint_for_address_of (arg, &rhsc); > > This isn't correct - you want to use get_constraint_for (arg, &rhsc). > After all rhs is already an ADDR_EXPR. > Can we add an assert somewhere to detect this incorrect usage? > + FOR_EACH_VEC_ELT (rhsc, j, rhsp) > + process_constraint (new_constraint (lhs, *rhsp)); > + rhsc.truncate (0); > + > + /* The caller clobbers what the callee does. */ > + lhs = get_function_part_constraint (fi, fi_clobbers); > + rhs = get_function_part_constraint (cfi, fi_clobbers); > + process_constraint (new_constraint (lhs, rhs)); > + > + /* The caller uses what the callee does. */ > + lhs = get_function_part_constraint (fi, fi_uses); > + rhs = get_function_part_constraint (cfi, fi_uses); > + process_constraint (new_constraint (lhs, rhs)); > > I don't see why you need those. The solver should compute these > in even better precision (context sensitive on the call side). > > The same is true for the function parameter. That is, the only > needed part of the patch should be that making sure we see > the "direct" call and assign parameters correctly. > Dropped this bit. OK for stage3 trunk if bootstrap and reg-test succeeds? Thanks, - Tom Handle BUILT_IN_GOMP_PARALLEL in ipa-pta 2015-11-30 Tom de Vries PR tree-optimization/46032 * tree-ssa-structalias.c (find_func_aliases_for_call_arg): New function, factored out of ... (find_func_aliases_for_call): ... here. (find_func_aliases_for_builtin_call, find_func_clobbers): Handle BUILT_IN_GOMP_PARALLEL. (ipa_pta_execute): Same. Handle node->parallelized_function as a local function. * gcc.dg/pr46032.c: New test. * testsuite/libgomp.c/pr46032.c: New test. --- gcc/testsuite/gcc.dg/pr46032.c | 47 +++++++++++++++++++++++++++ gcc/tree-ssa-structalias.c | 60 +++++++++++++++++++++++++++-------- libgomp/testsuite/libgomp.c/pr46032.c | 44 +++++++++++++++++++++++++ 3 files changed, 138 insertions(+), 13 deletions(-) diff --git a/gcc/testsuite/gcc.dg/pr46032.c b/gcc/testsuite/gcc.dg/pr46032.c new file mode 100644 index 0000000..b91190e --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr46032.c @@ -0,0 +1,47 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fopenmp -ftree-vectorize -std=c99 -fipa-pta -fdump-tree-vect-all" } */ + +extern void abort (void); + +#define nEvents 1000 + +static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize"))) +init (unsigned *results, unsigned *pData) +{ + unsigned int i; + for (i = 0; i < nEvents; ++i) + pData[i] = i % 3; +} + +static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize"))) +check (unsigned *results) +{ + unsigned sum = 0; + for (int idx = 0; idx < (int)nEvents; idx++) + sum += results[idx]; + + if (sum != 1998) + abort (); +} + +int +main (void) +{ + unsigned results[nEvents]; + unsigned pData[nEvents]; + unsigned coeff = 2; + + init (&results[0], &pData[0]); + +#pragma omp parallel for + for (int idx = 0; idx < (int)nEvents; idx++) + results[idx] = coeff * pData[idx]; + + check (&results[0]); + + return 0; +} + +/* { dg-final { scan-tree-dump-times "note: vectorized 1 loop" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-not "versioning for alias required" "vect" } } */ + diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c index f24ebeb..23f79b4 100644 --- a/gcc/tree-ssa-structalias.c +++ b/gcc/tree-ssa-structalias.c @@ -4139,6 +4139,24 @@ get_fi_for_callee (gcall *call) return get_vi_for_tree (fn); } +/* Create constraints for assigning call argument ARG to the incoming parameter + INDEX of function FI. */ + +static void +find_func_aliases_for_call_arg (varinfo_t fi, unsigned index, tree arg) +{ + struct constraint_expr lhs; + lhs = get_function_part_constraint (fi, fi_parm_base + index); + + auto_vec rhsc; + get_constraint_for_rhs (arg, &rhsc); + + unsigned j; + struct constraint_expr *rhsp; + FOR_EACH_VEC_ELT (rhsc, j, rhsp) + process_constraint (new_constraint (lhs, *rhsp)); +} + /* Create constraints for the builtin call T. Return true if the call was handled, otherwise false. */ @@ -4488,6 +4506,25 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t) } return true; } + case BUILT_IN_GOMP_PARALLEL: + { + /* Handle __builtin_GOMP_parallel (fn, data, num_threads, flags) as + fn (data). */ + if (in_ipa_mode) + { + tree fnarg = gimple_call_arg (t, 0); + gcc_assert (TREE_CODE (fnarg) == ADDR_EXPR); + tree fndecl = TREE_OPERAND (fnarg, 0); + tree arg = gimple_call_arg (t, 1); + gcc_assert (TREE_CODE (arg) == ADDR_EXPR); + + varinfo_t fi = get_vi_for_tree (fndecl); + find_func_aliases_for_call_arg (fi, 0, arg); + return true; + } + /* Else fallthru to generic call handling. */ + break; + } /* printf-style functions may have hooks to set pointers to point to somewhere into the generated string. Leave them for a later exercise... */ @@ -4546,18 +4583,8 @@ find_func_aliases_for_call (struct function *fn, gcall *t) parameters of the function. */ for (j = 0; j < gimple_call_num_args (t); j++) { - struct constraint_expr lhs ; - struct constraint_expr *rhsp; tree arg = gimple_call_arg (t, j); - - get_constraint_for_rhs (arg, &rhsc); - lhs = get_function_part_constraint (fi, fi_parm_base + j); - while (rhsc.length () != 0) - { - rhsp = &rhsc.last (); - process_constraint (new_constraint (lhs, *rhsp)); - rhsc.pop (); - } + find_func_aliases_for_call_arg (fi, j, arg); } /* If we are returning a value, assign it to the result. */ @@ -5036,6 +5063,8 @@ find_func_clobbers (struct function *fn, gimple *origt) case BUILT_IN_VA_START: case BUILT_IN_VA_END: return; + case BUILT_IN_GOMP_PARALLEL: + return; /* printf-style functions may have hooks to set pointers to point to somewhere into the generated string. Leave them for a later exercise... */ @@ -7352,7 +7381,8 @@ ipa_pta_execute (void) bool nonlocal_p = (node->used_from_other_partition || node->externally_visible || node->force_output - || node->address_taken); + || (node->address_taken + && !node->parallelized_function)); vi = create_function_info_for (node->decl, alias_get_name (node->decl), false, @@ -7504,7 +7534,11 @@ ipa_pta_execute (void) continue; /* Handle direct calls to functions with body. */ - decl = gimple_call_fndecl (stmt); + if (gimple_call_builtin_p (stmt, BUILT_IN_GOMP_PARALLEL)) + decl = TREE_OPERAND (gimple_call_arg (stmt, 0), 0); + else + decl = gimple_call_fndecl (stmt); + if (decl && (fi = lookup_vi_for_tree (decl)) && fi->is_fn_info) diff --git a/libgomp/testsuite/libgomp.c/pr46032.c b/libgomp/testsuite/libgomp.c/pr46032.c new file mode 100644 index 0000000..2178aa7 --- /dev/null +++ b/libgomp/testsuite/libgomp.c/pr46032.c @@ -0,0 +1,44 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -ftree-vectorize -std=c99 -fipa-pta" } */ + + +extern void abort (void); + +#define nEvents 1000 + +static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize"))) +init (unsigned *results, unsigned *pData) +{ + unsigned int i; + for (i = 0; i < nEvents; ++i) + pData[i] = i % 3; +} + +static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize"))) +check (unsigned *results) +{ + unsigned sum = 0; + for (int idx = 0; idx < (int)nEvents; idx++) + sum += results[idx]; + + if (sum != 1998) + abort (); +} + +int +main (void) +{ + unsigned results[nEvents]; + unsigned pData[nEvents]; + unsigned coeff = 2; + + init (&results[0], &pData[0]); + +#pragma omp parallel for + for (int idx = 0; idx < (int)nEvents; idx++) + results[idx] = coeff * pData[idx]; + + check (&results[0]); + + return 0; +}