From patchwork Wed Mar 16 11:52:52 2011
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 87243
Date: Wed, 16 Mar 2011 12:52:52 +0100
From: Jakub Jelinek
To: Paolo Bonzini
Cc: Kenneth Zadeck, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Fix RTL DSE compile time hog (PR rtl-optimization/48141)
Message-ID: <20110316115252.GI30899@tyan-ft48-01.lab.bos.redhat.com>
In-Reply-To: <4D8078DC.9000808@gnu.org>
References: <20110315230526.GD30899@tyan-ft48-01.lab.bos.redhat.com>
 <4D7FF24D.6050709@naturalbridge.com> <4D8078DC.9000808@gnu.org>

On Wed, Mar 16, 2011 at 09:46:20AM +0100, Paolo Bonzini wrote:
> On 03/16/2011 12:12 AM, Kenneth Zadeck wrote:
> >so how much time does this save?
> >
> >I agree that this is a useful simplification, but it seems unlikely to
> >be that important in real code.
> >it seems like the 5000 store test would in general provide a better
> >safety valve.
>
> I think having both is a good idea.

Here is the second patch, ok for both to trunk if this one passes
bootstrap/regtest?

With just this patch alone on the testcase with the default
--param=max-dse-active-local-stores=5000 cc1 spends 18.9 seconds in
DSE1+DSE2, with =1000 just 4 seconds and with =10 0.4 seconds.
With the earlier patch in addition to this one the time in DSE1+DSE2 is
0.3 seconds no matter what the parameter is.

2011-03-16  Jakub Jelinek

        PR rtl-optimization/48141
        * params.def (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES): New.
        * dse.c: Include params.h.
        (active_local_stores_len): New variable.
        (add_wild_read, dse_step1): Clear it when setting active_local_stores
        to NULL.
        (record_store, check_mem_read_rtx): Decrease it when removing
        from the chain.
        (scan_insn): Likewise.  Increase it when adding to chain, if it
        reaches PARAM_MAX_DSE_ACTIVE_LOCAL_STORES limit, set to 1 and
        set active_local_stores to NULL before the addition.
        * Makefile.in (dse.o): Depend on $(PARAMS_H).

        Jakub

--- gcc/params.def.jj	2011-02-15 15:42:27.000000000 +0100
+++ gcc/params.def	2011-03-16 12:20:16.000000000 +0100
@@ -698,6 +698,12 @@ DEFPARAM(PARAM_MAX_SCHED_READY_INSNS,
          "The maximum number of instructions ready to be issued to be considered by the scheduler during the first scheduling pass",
          100, 0, 0)
 
+/* This is the maximum number of active local stores RTL DSE will consider.  */
+DEFPARAM (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES,
+          "max-dse-active-local-stores",
+          "Maximum number of active local stores in RTL dead store elimination",
+          5000, 0, 0)
+
 /* Prefetching and cache-optimizations related parameters.  Default values are
    usually set by machine description.  */
 
--- gcc/dse.c.jj	2011-02-15 15:42:26.000000000 +0100
+++ gcc/dse.c	2011-03-16 12:34:18.000000000 +0100
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.
 #include "optabs.h"
 #include "dbgcnt.h"
 #include "target.h"
+#include "params.h"
 
 /* This file contains three techniques for performing Dead Store
    Elimination (dse).
@@ -387,6 +388,7 @@ static alloc_pool insn_info_pool;
 /* The linked list of stores that are under consideration in this
    basic block.  */
 static insn_info_t active_local_stores;
+static int active_local_stores_len;
 
 struct bb_info
 {
@@ -947,6 +949,7 @@ add_wild_read (bb_info_t bb_info)
     }
   insn_info->wild_read = true;
   active_local_stores = NULL;
+  active_local_stores_len = 0;
 }
 
 
@@ -1538,6 +1541,7 @@ record_store (rtx body, bb_info_t bb_inf
         {
           insn_info_t insn_to_delete = ptr;
 
+          active_local_stores_len--;
           if (last)
             last->next_local_store = ptr->next_local_store;
           else
@@ -2074,6 +2078,7 @@ check_mem_read_rtx (rtx *loc, void *data
           if (dump_file)
             dump_insn_info ("removing from active", i_ptr);
 
+          active_local_stores_len--;
           if (last)
             last->next_local_store = i_ptr->next_local_store;
           else
@@ -2163,6 +2168,7 @@ check_mem_read_rtx (rtx *loc, void *data
           if (dump_file)
             dump_insn_info ("removing from active", i_ptr);
 
+          active_local_stores_len--;
           if (last)
             last->next_local_store = i_ptr->next_local_store;
           else
@@ -2222,6 +2228,7 @@ check_mem_read_rtx (rtx *loc, void *data
           if (dump_file)
             dump_insn_info ("removing from active", i_ptr);
 
+          active_local_stores_len--;
           if (last)
             last->next_local_store = i_ptr->next_local_store;
           else
@@ -2426,6 +2433,7 @@ scan_insn (bb_info_t bb_info, rtx insn)
           if (dump_file)
             dump_insn_info ("removing from active", i_ptr);
 
+          active_local_stores_len--;
           if (last)
             last->next_local_store = i_ptr->next_local_store;
           else
@@ -2453,6 +2461,12 @@ scan_insn (bb_info_t bb_info, rtx insn)
                 fprintf (dump_file, "handling memset as BLKmode store\n");
               if (mems_found == 1)
                 {
+                  if (active_local_stores_len++
+                      >= PARAM_VALUE (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES))
+                    {
+                      active_local_stores_len = 1;
+                      active_local_stores = NULL;
+                    }
                   insn_info->next_local_store = active_local_stores;
                   active_local_stores = insn_info;
                 }
@@ -2496,6 +2510,12 @@ scan_insn (bb_info_t bb_info, rtx insn)
      it as cannot delete.  This simplifies the processing later.  */
   if (mems_found == 1)
     {
+      if (active_local_stores_len++
+          >= PARAM_VALUE (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES))
+        {
+          active_local_stores_len = 1;
+          active_local_stores = NULL;
+        }
       insn_info->next_local_store = active_local_stores;
       active_local_stores = insn_info;
     }
@@ -2534,6 +2554,7 @@ remove_useless_values (cselib_val *base)
 
       if (del)
         {
+          active_local_stores_len--;
           if (last)
             last->next_local_store = insn_info->next_local_store;
           else
@@ -2584,6 +2605,7 @@ dse_step1 (void)
         = create_alloc_pool ("cse_store_info_pool",
                              sizeof (struct store_info), 100);
       active_local_stores = NULL;
+      active_local_stores_len = 0;
       cselib_clear_table ();
 
       /* Scan the insns.  */
--- gcc/Makefile.in.jj	2011-02-02 16:30:50.000000000 +0100
+++ gcc/Makefile.in	2011-03-16 12:26:12.000000000 +0100
@@ -3070,7 +3070,7 @@ dse.o : dse.c $(CONFIG_H) $(SYSTEM_H) co
    $(TREE_H) $(TM_P_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h \
    $(RECOG_H) $(EXPR_H) $(DF_H) cselib.h $(DBGCNT_H) $(TIMEVAR_H) \
    $(TREE_PASS_H) alloc-pool.h $(ALIAS_H) dse.h $(OPTABS_H) $(TARGET_H) \
-   $(BITMAP_H)
+   $(BITMAP_H) $(PARAMS_H)
 fwprop.o : fwprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(DIAGNOSTIC_CORE_H) insn-config.h $(RECOG_H) $(FLAGS_H) $(OBSTACK_H) $(BASIC_BLOCK_H) \
    output.h $(DF_H) alloc-pool.h $(TIMEVAR_H) $(TREE_PASS_H) $(TARGET_H) \
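
For readers who want to see the counting scheme from the patch in isolation, here is a minimal standalone sketch. It is not GCC code: struct store_node, max_active_local_stores, push_active_store and remove_active_store are hypothetical names standing in for insn_info_t, PARAM_VALUE (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES) and the open-coded list updates in dse.c. Only the counter handling is meant to mirror the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's insn_info; only the link matters here.  */
struct store_node
{
  int id;
  struct store_node *next_local_store;
};

/* Hypothetical stand-in for --param=max-dse-active-local-stores.  */
static int max_active_local_stores = 5000;

static struct store_node *active_local_stores;
static int active_local_stores_len;

/* Push NODE on the active list.  If the list already holds the maximum
   number of entries, forget all of them first, so the walk over the
   list stays bounded by the limit.  */
static void
push_active_store (struct store_node *node)
{
  assert (node != NULL);
  if (active_local_stores_len++ >= max_active_local_stores)
    {
      active_local_stores_len = 1;
      active_local_stores = NULL;
    }
  node->next_local_store = active_local_stores;
  active_local_stores = node;
}

/* Unlink NODE, whose predecessor is LAST (NULL if NODE is the head),
   keeping the length counter in sync with the list.  */
static void
remove_active_store (struct store_node *last, struct store_node *node)
{
  active_local_stores_len--;
  if (last)
    last->next_local_store = node->next_local_store;
  else
    active_local_stores = node->next_local_store;
}
```

With the limit set to 2, a third push empties the list before adding the new node, so the list then holds only that node and the counter is back at 1. The stores dropped at that point are simply no longer on the active list, which appears to be the trade-off the parameter controls: a smaller limit means shorter list walks but fewer candidates for the local pass to look at.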