From patchwork Wed Jul 7 11:10:26 2010
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 58101
Subject: Re: [Patch tree-sra] Fix to set up correct context for call to
 compute_inline_parameter (PR44768)
From: Ramana Radhakrishnan
Reply-To: ramana.radhakrishnan@arm.com
To: Richard Guenther
Cc: gcc-patches@gcc.gnu.org, mjambor@suse.cz
References: <1278489425.17030.13.camel@e102325-lin.cambridge.arm.com>
Date: Wed, 07 Jul 2010 12:10:26 +0100
Message-Id: <1278501026.25686.23.camel@e102325-lin.cambridge.arm.com>

On Wed, 2010-07-07 at 11:33 +0200, Richard Guenther wrote:
>
> Switching cfun is expensive.  Why and where does
> compute_inline_parameters end up using cfun?  We should fix
> that instead.

The reason compute_inline_parameters ends up using cfun /
current_function_decl is because this ends up calling
estimated_stack_frame_size that ends up calling a backend hook that uses
current_function_decl as can be seen in the audit trail

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44768#c5

Based on your idea after our IRC chat - Does this look any better ?
Verified that this actually generates the correct code by manual
inspection of generated code.

Ok after bootstrapping on arm-linux-gnueabi and regression testing ?

cheers
Ramana

2010-07-07  Ramana Radhakrishnan

	PR bootstrap/44768
	* cfgexpand.c (estimated_stack_frame_size): Make self-contained
	with respect to current_function_decl. Pass decl of the function.
	* tree-inline.h (estimated_stack_frame_size): Adjust prototype.
	* ipa-inline.c (compute_inline_parameters): Pass decl to
	estimated_stack_frame_size.

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 161901)
+++ ipa-inline.c	(working copy)
@@ -2019,7 +2019,7 @@ compute_inline_parameters (struct cgraph
 
   /* Estimate the stack size for the function.  But not at -O0
      because estimated_stack_frame_size is a quadratic problem.  */
-  self_stack_size = optimize ? estimated_stack_frame_size () : 0;
+  self_stack_size = optimize ? estimated_stack_frame_size (node->decl) : 0;
   inline_summary (node)->estimated_self_stack_size = self_stack_size;
   node->global.estimated_stack_size = self_stack_size;
   node->global.stack_frame_offset = 0;
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 161901)
+++ cfgexpand.c	(working copy)
@@ -1252,8 +1252,8 @@ fini_vars_expansion (void)
   stack_vars_alloc = stack_vars_num = 0;
 }
 
-/* Make a fair guess for the size of the stack frame of the current
-   function.  This doesn't have to be exact, the result is only used
+/* Make a fair guess for the size of the stack frame of the decl
+   passed.  This doesn't have to be exact, the result is only used
    in the inline heuristics.  So we don't want to run the full stack
    var packing algorithm (which is quadratic in the number of stack
    vars).  Instead, we calculate the total size of all stack vars.
@@ -1261,12 +1261,15 @@ fini_vars_expansion (void)
    vars doesn't happen very often.  */
 
 HOST_WIDE_INT
-estimated_stack_frame_size (void)
+estimated_stack_frame_size (tree decl)
 {
   HOST_WIDE_INT size = 0;
   size_t i;
   tree var, outer_block = DECL_INITIAL (current_function_decl);
   unsigned ix;
+  tree old_cur_fun_decl = current_function_decl;
+  current_function_decl = decl;
+  push_cfun (DECL_STRUCT_FUNCTION (decl));
 
   init_vars_expansion ();
 
@@ -1287,7 +1290,8 @@ estimated_stack_frame_size (void)
       size += account_stack_vars ();
       fini_vars_expansion ();
     }
-
+  pop_cfun ();
+  current_function_decl = old_cur_fun_decl;
   return size;
 }
 
Index: tree-inline.h
===================================================================
--- tree-inline.h	(revision 161901)
+++ tree-inline.h	(working copy)
@@ -185,6 +185,6 @@ extern tree remap_decl (tree decl, copy_
 extern tree remap_type (tree type, copy_body_data *id);
 extern gimple_seq copy_gimple_seq_and_replace_locals (gimple_seq seq);
 
-extern HOST_WIDE_INT estimated_stack_frame_size (void);
+extern HOST_WIDE_INT estimated_stack_frame_size (tree);
 
 #endif /* GCC_TREE_INLINE_H */
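
For anyone reading the cfgexpand.c change without a GCC tree at hand, here is a rough standalone sketch of the save / install / restore idea it relies on: the callee takes the decl explicitly, installs it as the global "current function" only while a hook that reads that global runs, and puts the caller's value back before returning. This is not GCC code. The names fn_decl, current_fn, target_frame_bytes and estimate_frame_size are hypothetical stand-ins, and push_cfun / pop_cfun are not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a function declaration; not GCC's tree.  */
struct fn_decl
{
  const char *name;
  long frame_bytes;	/* Pretend stack usage of this function.  */
};

/* Plays the role of the global current_function_decl.  */
struct fn_decl *current_fn = NULL;

/* Stand-in for a backend hook that only consults the global context.  */
long
target_frame_bytes (void)
{
  assert (current_fn != NULL);
  return current_fn->frame_bytes;
}

/* Estimate the frame size of DECL.  The caller's global context is
   saved on entry and restored on exit, so the function gives the same
   answer regardless of what current_fn held at the call site.  */
long
estimate_frame_size (struct fn_decl *decl)
{
  struct fn_decl *saved = current_fn;
  long size;

  current_fn = decl;		/* Install the requested context.  */
  size = target_frame_bytes ();	/* Hook sees DECL, not the caller.  */
  current_fn = saved;		/* Restore before returning.  */
  return size;
}
```

With current_fn pointing at one function, calling estimate_frame_size on a different fn_decl should report the second function's size and leave current_fn unchanged afterwards.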