From patchwork Wed Jul 25 16:13:42 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 173207
Subject: [PATCH] Change IVOPTS and strength reduction to use expmed cost model
From: "William J. Schmidt"
To: gcc-patches@gcc.gnu.org
Cc: bergner@vnet.ibm.com, rguenther@suse.de, rth@redhat.com
Date: Wed, 25 Jul 2012 11:13:42 -0500
Message-ID: <1343232822.4638.16.camel@oc2474580526.ibm.com>

Per Richard Henderson's suggestion
(http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01370.html), this patch
changes the IVOPTS and straight-line strength reduction passes to make
use of data computed by init_expmed.  This required adding a new
convert_cost array in expmed to store the costs of converting between
various scalar integer modes, and exposing expmed's multiplication hash
table for external use (new function mult_by_coeff_cost).
Richard H, I'd appreciate it if you could look at what I did there and
make sure it's correct.  Thanks!

I decided it wasn't worth distinguishing between reg-reg add costs and
reg-constant add costs, so I simplified the strength reduction
calculations rather than adding another array to expmed for this
purpose.  But I can make this distinction if that's preferable.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions.  Ok for trunk?

Thanks,
Bill


2012-07-25  Bill Schmidt

	* tree-ssa-loop-ivopts.c (mbc_entry_hash): Remove.
	(mbc_entry_eq): Likewise.
	(mult_costs): Likewise.
	(cost_tables_exist): Likewise.
	(initialize_costs): Likewise.
	(finalize_costs): Likewise.
	(tree_ssa_iv_optimize_init): Remove call to initialize_costs.
	(add_regs_cost): Remove.
	(multiply_regs_cost): Likewise.
	(add_const_cost): Likewise.
	(extend_or_trunc_reg_cost): Likewise.
	(negate_reg_cost): Likewise.
	(struct mbc_entry): Likewise.
	(multiply_by_const_cost): Likewise.
	(get_address_cost): Change add_regs_cost calls to add_cost lookups;
	change multiply_by_const_cost to mult_by_coeff_cost.
	(force_expr_to_var_cost): Likewise.
	(difference_cost): Change multiply_by_const_cost to mult_by_coeff_cost.
	(get_computation_cost_at): Change add_regs_cost calls to add_cost
	lookups; change multiply_by_const_cost to mult_by_coeff_cost.
	(determine_iv_cost): Change add_regs_cost calls to add_cost lookups.
	(tree_ssa_iv_optimize_finalize): Remove call to finalize_costs.
	* tree-ssa-address.c (expmed.h): New #include.
	(most_expensive_mult_to_index): Change multiply_by_const_cost to
	mult_by_coeff_cost.
	* gimple-ssa-strength-reduction.c (expmed.h): New #include.
	(stmt_cost): Change to use mult_by_coeff_cost, mul_cost, add_cost,
	neg_cost, and convert_cost instead of IVOPTS interfaces.
	(execute_strength_reduction): Remove calls to initialize_costs and
	finalize_costs.
	* expmed.c (struct init_expmed_rtl): Add convert rtx_def.
	(init_expmed_one_mode): Initialize convert rtx_def; initialize
	convert_cost for related modes.
	(mult_by_coeff_cost): New function.
	* expmed.h (struct target_expmed): Add x_convert_cost matrix.
	(convert_cost): New #define.
	(mult_by_coeff_cost): New extern decl.
	* tree-flow.h (initialize_costs): Remove decl.
	(finalize_costs): Likewise.
	(multiply_by_const_cost): Likewise.
	(add_regs_cost): Likewise.
	(multiply_regs_cost): Likewise.
	(add_const_cost): Likewise.
	(extend_or_trunc_reg_cost): Likewise.
	(negate_reg_cost): Likewise.


Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 189845)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -88,9 +88,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-propagate.h"
 #include "expmed.h"
 
-static hashval_t mbc_entry_hash (const void *);
-static int mbc_entry_eq (const void*, const void *);
-
 /* FIXME: Expressions are expanded to RTL in this pass to determine the
    cost of different addressing modes.  This should be moved to a TBD
    interface between the GIMPLE and RTL worlds.  */
@@ -381,11 +378,6 @@ struct iv_ca_delta
 
 static VEC(tree,heap) *decl_rtl_to_reset;
 
-/* Cached costs for multiplies by constants, and a flag to indicate
-   when they're valid.  */
-static htab_t mult_costs[2];
-static bool cost_tables_exist = false;
-
 static comp_cost force_expr_to_var_cost (tree, bool);
 
 /* Number of uses recorded in DATA.  */
@@ -851,26 +843,6 @@ htab_inv_expr_hash (const void *ent)
   return expr->hash;
 }
 
-/* Allocate data structures for the cost model.  */
-
-void
-initialize_costs (void)
-{
-  mult_costs[0] = htab_create (100, mbc_entry_hash, mbc_entry_eq, free);
-  mult_costs[1] = htab_create (100, mbc_entry_hash, mbc_entry_eq, free);
-  cost_tables_exist = true;
-}
-
-/* Release data structures for the cost model.  */
-
-void
-finalize_costs (void)
-{
-  cost_tables_exist = false;
-  htab_delete (mult_costs[0]);
-  htab_delete (mult_costs[1]);
-}
-
 /* Initializes data structures used by the iv optimization pass, stored
    in DATA.  */
 
@@ -889,8 +861,6 @@ tree_ssa_iv_optimize_init (struct ivopts_data *dat
                                     htab_inv_expr_eq, free);
   data->inv_expr_id = 0;
   decl_rtl_to_reset = VEC_alloc (tree, heap, 20);
-
-  initialize_costs ();
 }
 
 /* Returns a memory object to that EXPR points.  In case we are able to
@@ -3077,250 +3047,6 @@ adjust_setup_cost (struct ivopts_data *data, unsig
     return cost;
 }
 
-/* Returns cost of addition in MODE.  */
-
-unsigned
-add_regs_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  start_sequence ();
-  force_operand (gen_rtx_fmt_ee (PLUS, mode,
-                                 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-                                 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 2)),
-                 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Addition in %s costs %d\n",
-             GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Returns cost of multiplication in MODE.  */
-
-unsigned
-multiply_regs_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  start_sequence ();
-  force_operand (gen_rtx_fmt_ee (MULT, mode,
-                                 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-                                 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 2)),
-                 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Multiplication in %s costs %d\n",
-             GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Returns cost of addition with a constant in MODE.  */
-
-unsigned
-add_const_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  /* Arbitrarily generate insns for x + 2, as the exact constant
-     shouldn't matter.  */
-  start_sequence ();
-  force_operand (gen_rtx_fmt_ee (PLUS, mode,
-                                 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-                                 gen_int_mode (2, mode)),
-                 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Addition to constant in %s costs %d\n",
-             GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Returns cost of extend or truncate in MODE.  */
-
-unsigned
-extend_or_trunc_reg_cost (tree type_to, tree type_from, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-  enum machine_mode mode_to = TYPE_MODE (type_to);
-  enum machine_mode mode_from = TYPE_MODE (type_from);
-  tree size_to = TYPE_SIZE (type_to);
-  tree size_from = TYPE_SIZE (type_from);
-  enum rtx_code code;
-
-  gcc_assert (TREE_CODE (size_to) == INTEGER_CST
-              && TREE_CODE (size_from) == INTEGER_CST);
-
-  if (costs[mode_to][mode_from][speed])
-    return costs[mode_to][mode_from][speed];
-
-  if (tree_int_cst_lt (size_to, size_from))
-    code = TRUNCATE;
-  else if (TYPE_UNSIGNED (type_to))
-    code = ZERO_EXTEND;
-  else
-    code = SIGN_EXTEND;
-
-  start_sequence ();
-  gen_rtx_fmt_e (code, mode_to,
-                 gen_raw_REG (mode_from, LAST_VIRTUAL_REGISTER + 1));
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Conversion from %s to %s costs %d\n",
-             GET_MODE_NAME (mode_to), GET_MODE_NAME (mode_from), cost);
-
-  costs[mode_to][mode_from][speed] = cost;
-  return cost;
-}
-
-/* Returns cost of negation in MODE.  */
-
-unsigned
-negate_reg_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  start_sequence ();
-  force_operand (gen_rtx_fmt_e (NEG, mode,
-                                gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)),
-                 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Negation in %s costs %d\n",
-             GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Entry in a hashtable of already known costs for multiplication.  */
-struct mbc_entry
-{
-  HOST_WIDE_INT cst;		/* The constant to multiply by.  */
-  enum machine_mode mode;	/* In mode.  */
-  unsigned cost;		/* The cost.  */
-};
-
-/* Counts hash value for the ENTRY.  */
-
-static hashval_t
-mbc_entry_hash (const void *entry)
-{
-  const struct mbc_entry *e = (const struct mbc_entry *) entry;
-
-  return 57 * (hashval_t) e->mode + (hashval_t) (e->cst % 877);
-}
-
-/* Compares the hash table entries ENTRY1 and ENTRY2.  */
-
-static int
-mbc_entry_eq (const void *entry1, const void *entry2)
-{
-  const struct mbc_entry *e1 = (const struct mbc_entry *) entry1;
-  const struct mbc_entry *e2 = (const struct mbc_entry *) entry2;
-
-  return (e1->mode == e2->mode
-          && e1->cst == e2->cst);
-}
-
-/* Returns cost of multiplication by constant CST in MODE.  */
-
-unsigned
-multiply_by_const_cost (HOST_WIDE_INT cst, enum machine_mode mode, bool speed)
-{
-  struct mbc_entry **cached, act;
-  rtx seq;
-  unsigned cost;
-
-  gcc_assert (cost_tables_exist);
-
-  act.mode = mode;
-  act.cst = cst;
-  cached = (struct mbc_entry **)
-    htab_find_slot (mult_costs[speed], &act, INSERT);
-
-  if (*cached)
-    return (*cached)->cost;
-
-  *cached = XNEW (struct mbc_entry);
-  (*cached)->mode = mode;
-  (*cached)->cst = cst;
-
-  start_sequence ();
-  expand_mult (mode, gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-               gen_int_mode (cst, mode), NULL_RTX, 0);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Multiplication by %d in %s costs %d\n",
-             (int) cst, GET_MODE_NAME (mode), cost);
-
-  (*cached)->cost = cost;
-
-  return cost;
-}
-
 /* Returns true if multiplying by RATIO is allowed in an address.  Test the
    validity for a memory reference accessing memory of mode MODE in
    address space AS.  */
@@ -3582,7 +3308,7 @@ get_address_cost (bool symbol_present, bool var_pr
          If VAR_PRESENT is true, try whether the mode with
          SYMBOL_PRESENT = false is cheaper even with cost of addition, and
          if this is the case, use it.  */
-      add_c = add_regs_cost (address_mode, speed);
+      add_c = add_cost[speed][address_mode];
       for (i = 0; i < 8; i++)
         {
           var_p = i & 1;
@@ -3663,10 +3389,10 @@ get_address_cost (bool symbol_present, bool var_pr
              && multiplier_allowed_in_address_p (ratio, mem_mode, as));
 
   if (ratio != 1 && !ratio_p)
-    cost += multiply_by_const_cost (ratio, address_mode, speed);
+    cost += mult_by_coeff_cost (ratio, address_mode, speed);
 
   if (s_offset && !offset_p && !symbol_present)
-    cost += add_regs_cost (address_mode, speed);
+    cost += add_cost[speed][address_mode];
 
   if (may_autoinc)
     *may_autoinc = autoinc;
@@ -3833,7 +3559,7 @@ force_expr_to_var_cost (tree expr, bool speed)
     case PLUS_EXPR:
     case MINUS_EXPR:
     case NEGATE_EXPR:
-      cost = new_cost (add_regs_cost (mode, speed), 0);
+      cost = new_cost (add_cost[speed][mode], 0);
       if (TREE_CODE (expr) != NEGATE_EXPR)
         {
           tree mult = NULL_TREE;
@@ -3853,11 +3579,11 @@ force_expr_to_var_cost (tree expr, bool speed)
 
     case MULT_EXPR:
       if (cst_and_fits_in_hwi (op0))
-        cost = new_cost (multiply_by_const_cost (int_cst_value (op0),
-                                                 mode, speed), 0);
+        cost = new_cost (mult_by_coeff_cost (int_cst_value (op0),
+                                             mode, speed), 0);
       else if (cst_and_fits_in_hwi (op1))
-        cost = new_cost (multiply_by_const_cost (int_cst_value (op1),
-                                                 mode, speed), 0);
+        cost = new_cost (mult_by_coeff_cost (int_cst_value (op1),
+                                             mode, speed), 0);
       else
         return new_cost (target_spill_cost [speed], 0);
       break;
@@ -4023,7 +3749,7 @@ difference_cost (struct ivopts_data *data,
   if (integer_zerop (e1))
     {
       comp_cost cost = force_var_cost (data, e2, depends_on);
-      cost.cost += multiply_by_const_cost (-1, mode, data->speed);
+      cost.cost += mult_by_coeff_cost (-1, mode, data->speed);
       return cost;
     }
 
@@ -4334,7 +4060,7 @@ get_computation_cost_at (struct ivopts_data *data,
                                          &symbol_present, &var_present,
                                          &offset, depends_on));
       cost.cost /= avg_loop_niter (data->current_loop);
-      cost.cost += add_regs_cost (TYPE_MODE (ctype), data->speed);
+      cost.cost += add_cost[data->speed][TYPE_MODE (ctype)];
     }
 
   if (inv_expr_id)
@@ -4367,7 +4093,7 @@ get_computation_cost_at (struct ivopts_data *data,
   if (!symbol_present && !var_present && !offset)
     {
       if (ratio != 1)
-        cost.cost += multiply_by_const_cost (ratio, TYPE_MODE (ctype), speed);
+        cost.cost += mult_by_coeff_cost (ratio, TYPE_MODE (ctype), speed);
       return cost;
     }
 
@@ -4375,18 +4101,18 @@ get_computation_cost_at (struct ivopts_data *data,
      are added once to the variable, if present.  */
   if (var_present && (symbol_present || offset))
     cost.cost += adjust_setup_cost (data,
-                                    add_regs_cost (TYPE_MODE (ctype), speed));
+                                    add_cost[speed][TYPE_MODE (ctype)]);
 
   /* Having offset does not affect runtime cost in case it is added to
      symbol, but it increases complexity.  */
   if (offset)
     cost.complexity++;
 
-  cost.cost += add_regs_cost (TYPE_MODE (ctype), speed);
+  cost.cost += add_cost[speed][TYPE_MODE (ctype)];
 
   aratio = ratio > 0 ? ratio : -ratio;
   if (aratio != 1)
-    cost.cost += multiply_by_const_cost (aratio, TYPE_MODE (ctype), speed);
+    cost.cost += mult_by_coeff_cost (aratio, TYPE_MODE (ctype), speed);
   return cost;
 
 fallback:
@@ -5232,7 +4958,7 @@ determine_iv_cost (struct ivopts_data *data, struc
      or a const set.  */
   if (cost_base.cost == 0)
     cost_base.cost = COSTS_N_INSNS (1);
-  cost_step = add_regs_cost (TYPE_MODE (TREE_TYPE (base)), data->speed);
+  cost_step = add_cost[data->speed][TYPE_MODE (TREE_TYPE (base))];
 
   cost = cost_step + adjust_setup_cost (data, cost_base.cost);
 
@@ -6804,8 +6530,6 @@ tree_ssa_iv_optimize_finalize (struct ivopts_data
   VEC_free (iv_use_p, heap, data->iv_uses);
   VEC_free (iv_cand_p, heap, data->iv_candidates);
   htab_delete (data->inv_expr_tab);
-
-  finalize_costs ();
 }
 
 /* Returns true if the loop body BODY includes any function calls.  */
Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c	(revision 189845)
+++ gcc/tree-ssa-address.c	(working copy)
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "expr.h"
 #include "ggc.h"
 #include "target.h"
+#include "expmed.h"
 
 /* TODO -- handling of symbols (according to Richard Hendersons
    comments, http://gcc.gnu.org/ml/gcc-patches/2005-04/msg00949.html):
@@ -554,7 +555,7 @@ most_expensive_mult_to_index (tree type, struct me
           || !multiplier_allowed_in_address_p (coef, TYPE_MODE (type), as))
         continue;
 
-      acost = multiply_by_const_cost (coef, address_mode, speed);
+      acost = mult_by_coeff_cost (coef, address_mode, speed);
 
       if (acost > best_mult_cost)
         {
Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c	(revision 189845)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-flow.h"
 #include "domwalk.h"
 #include "pointer-set.h"
+#include "expmed.h"
 
 /* Information about a strength reduction candidate.  Each statement
    in the candidate table represents an expression of one of the
@@ -340,29 +341,22 @@ stmt_cost (gimple gs, bool speed)
       rhs2 = gimple_assign_rhs2 (gs);
 
       if (host_integerp (rhs2, 0))
-        return multiply_by_const_cost (TREE_INT_CST_LOW (rhs2), lhs_mode,
-                                       speed);
+        return mult_by_coeff_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);
 
       gcc_assert (TREE_CODE (rhs1) != INTEGER_CST);
-      return multiply_regs_cost (TYPE_MODE (TREE_TYPE (lhs)), speed);
+      return mul_cost[speed][lhs_mode];
 
     case PLUS_EXPR:
     case POINTER_PLUS_EXPR:
     case MINUS_EXPR:
       rhs2 = gimple_assign_rhs2 (gs);
+      return add_cost[speed][lhs_mode];
 
-      if (host_integerp (rhs2, 0))
-        return add_const_cost (TYPE_MODE (TREE_TYPE (rhs1)), speed);
-
-      gcc_assert (TREE_CODE (rhs1) != INTEGER_CST);
-      return add_regs_cost (lhs_mode, speed);
-
     case NEGATE_EXPR:
-      return negate_reg_cost (lhs_mode, speed);
+      return neg_cost[speed][lhs_mode];
 
     case NOP_EXPR:
-      return extend_or_trunc_reg_cost (TREE_TYPE (lhs), TREE_TYPE (rhs1),
-                                       speed);
+      return convert_cost[speed][lhs_mode][TYPE_MODE (TREE_TYPE (rhs1))];
 
     /* Note that we don't assign costs to copies that in most cases
        will go away.  */
@@ -1460,9 +1454,6 @@ execute_strength_reduction (void)
      back edges, and this gives us dominator information as well.  */
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  /* Initialize costs tables in IVOPTS.  */
-  initialize_costs ();
-
   /* Set up callbacks for the generic dominator tree walker.  */
   walk_data.dom_direction = CDI_DOMINATORS;
   walk_data.initialize_block_local_data = NULL;
@@ -1493,7 +1484,6 @@ execute_strength_reduction (void)
   pointer_map_destroy (stmt_cand_map);
   VEC_free (slsr_cand_t, heap, cand_vec);
   obstack_free (&cand_obstack, NULL);
-  finalize_costs ();
 
   return 0;
 }
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(revision 189845)
+++ gcc/expmed.c	(working copy)
@@ -112,6 +112,7 @@ struct init_expmed_rtl
   struct rtx_def shift_add;	rtunion shift_add_fld1;
   struct rtx_def shift_sub0;	rtunion shift_sub0_fld1;
   struct rtx_def shift_sub1;	rtunion shift_sub1_fld1;
+  struct rtx_def convert;
 
   rtx pow2[MAX_BITS_PER_WORD];
   rtx cint[MAX_BITS_PER_WORD];
@@ -122,6 +123,7 @@ init_expmed_one_mode (struct init_expmed_rtl *all,
                       enum machine_mode mode, int speed)
 {
   int m, n, mode_bitsize;
+  enum machine_mode mode_from;
 
   mode_bitsize = GET_MODE_UNIT_BITSIZE (mode);
 
@@ -139,6 +141,7 @@ init_expmed_one_mode (struct init_expmed_rtl *all,
   PUT_MODE (&all->shift_add, mode);
   PUT_MODE (&all->shift_sub0, mode);
   PUT_MODE (&all->shift_sub1, mode);
+  PUT_MODE (&all->convert, mode);
 
   add_cost[speed][mode] = set_src_cost (&all->plus, speed);
   neg_cost[speed][mode] = set_src_cost (&all->neg, speed);
@@ -183,6 +186,30 @@ init_expmed_one_mode (struct init_expmed_rtl *all,
           mul_highpart_cost[speed][mode]
             = set_src_cost (&all->wide_trunc, speed);
         }
+
+      for (mode_from = GET_CLASS_NARROWEST_MODE (MODE_INT);
+           mode_from != VOIDmode;
+           mode_from = GET_MODE_WIDER_MODE (mode_from))
+        if (mode != mode_from)
+          {
+            unsigned short size_to = GET_MODE_SIZE (mode);
+            unsigned short size_from = GET_MODE_SIZE (mode_from);
+            if (size_to < size_from)
+              {
+                PUT_CODE (&all->convert, TRUNCATE);
+                PUT_MODE (&all->reg, mode_from);
+                convert_cost[speed][mode][mode_from]
+                  = set_src_cost (&all->convert, speed);
+              }
+            else if (size_from < size_to)
+              {
+                /* Assume cost of zero-extend and sign-extend is the same.  */
+                PUT_CODE (&all->convert, ZERO_EXTEND);
+                PUT_MODE (&all->reg, mode_from);
+                convert_cost[speed][mode][mode_from]
+                  = set_src_cost (&all->convert, speed);
+              }
+          }
     }
 }
 
@@ -262,6 +289,9 @@ init_expmed (void)
   XEXP (&all.shift_sub1, 0) = &all.reg;
   XEXP (&all.shift_sub1, 1) = &all.shift_mult;
 
+  PUT_CODE (&all.convert, TRUNCATE);
+  XEXP (&all.convert, 0) = &all.reg;
+
   for (speed = 0; speed < 2; speed++)
     {
       crtl->maybe_hot_insn_p = speed;
@@ -3262,6 +3292,24 @@ expand_mult (enum machine_mode mode, rtx op0, rtx
   return op0;
 }
 
+/* Return a cost estimate for multiplying a register by the given
+   COEFFicient in the given MODE and SPEED.  */
+
+int
+mult_by_coeff_cost (HOST_WIDE_INT coeff, enum machine_mode mode, bool speed)
+{
+  int max_cost;
+  struct algorithm algorithm;
+  enum mult_variant variant;
+
+  rtx fake_reg = gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1);
+  max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, fake_reg), speed);
+  if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
+    return algorithm.cost.cost;
+  else
+    return max_cost;
+}
+
 /* Perform a widening multiplication and return an rtx for the result.
    MODE is mode of value; OP0 and OP1 are what to multiply (rtx's);
    TARGET is a suggestion for where to store the result (an rtx).
Index: gcc/expmed.h
===================================================================
--- gcc/expmed.h	(revision 189845)
+++ gcc/expmed.h	(working copy)
@@ -155,6 +155,11 @@ struct target_expmed {
   int x_udiv_cost[2][NUM_MACHINE_MODES];
   int x_mul_widen_cost[2][NUM_MACHINE_MODES];
   int x_mul_highpart_cost[2][NUM_MACHINE_MODES];
+
+  /* Conversion costs are only defined between two scalar integer modes
+     of different sizes.  The first machine mode is the destination mode,
+     and the second is the source mode.  */
+  int x_convert_cost[2][NUM_MACHINE_MODES][NUM_MACHINE_MODES];
 };
 
 extern struct target_expmed default_target_expmed;
@@ -196,5 +201,8 @@ extern struct target_expmed *this_target_expmed;
   (this_target_expmed->x_mul_widen_cost)
 #define mul_highpart_cost \
   (this_target_expmed->x_mul_highpart_cost)
+#define convert_cost \
+  (this_target_expmed->x_convert_cost)
 
+extern int mult_by_coeff_cost (HOST_WIDE_INT, enum machine_mode, bool);
 #endif
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h	(revision 189845)
+++ gcc/tree-flow.h	(working copy)
@@ -806,14 +806,6 @@ bool expr_invariant_in_loop_p (struct loop *, tree
 bool stmt_invariant_in_loop_p (struct loop *, gimple);
 bool multiplier_allowed_in_address_p (HOST_WIDE_INT, enum machine_mode,
                                       addr_space_t);
-void initialize_costs (void);
-void finalize_costs (void);
-unsigned multiply_by_const_cost (HOST_WIDE_INT, enum machine_mode, bool);
-unsigned add_regs_cost (enum machine_mode, bool);
-unsigned multiply_regs_cost (enum machine_mode, bool);
-unsigned add_const_cost (enum machine_mode, bool);
-unsigned extend_or_trunc_reg_cost (tree, tree, bool);
-unsigned negate_reg_cost (enum machine_mode, bool);
 bool may_be_nonaddressable_p (tree expr);
 
 /* In tree-ssa-threadupdate.c.  */
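
Note on the convert_cost layout: the new table is indexed as
[speed][destination mode][source mode], and entries are filled in only
for pairs of scalar integer modes of different sizes.  The following
standalone sketch illustrates that indexing convention outside of GCC.
It is not part of the patch.  The toy_* names, the four toy modes, and
the two cost constants are all assumptions made up for illustration; in
the patch itself the costs come from set_src_cost on a TRUNCATE or
ZERO_EXTEND rtx.

```c
#include <assert.h>

/* Toy stand-ins for machine modes; hypothetical, not GCC's
   enum machine_mode.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_NUM_MODES };

/* Size in bytes of each toy mode.  */
static const unsigned short toy_mode_size[TOY_NUM_MODES] = { 1, 2, 4, 8 };

/* Made-up costs; the patch obtains real ones from set_src_cost.  */
enum { TOY_TRUNCATE_COST = 1, TOY_EXTEND_COST = 4 };

/* Indexed [speed][mode_to][mode_from], like x_convert_cost.  */
static int toy_convert_cost[2][TOY_NUM_MODES][TOY_NUM_MODES];

/* Fill the table for one SPEED setting.  Entries for equal-sized
   modes are left at zero, i.e. "not defined".  */
static void
toy_init_convert_cost (int speed)
{
  int to, from;

  assert (speed == 0 || speed == 1);
  for (to = 0; to < TOY_NUM_MODES; to++)
    for (from = 0; from < TOY_NUM_MODES; from++)
      {
        if (toy_mode_size[to] < toy_mode_size[from])
          toy_convert_cost[speed][to][from] = TOY_TRUNCATE_COST;
        else if (toy_mode_size[from] < toy_mode_size[to])
          toy_convert_cost[speed][to][from] = TOY_EXTEND_COST;
      }
}

/* Look up the cost of converting a value in mode FROM to mode TO.  */
static int
toy_conversion_cost (enum toy_mode to, enum toy_mode from, int speed)
{
  return toy_convert_cost[speed][to][from];
}
```

After calling toy_init_convert_cost (1), a lookup such as
toy_conversion_cost (TOY_DI, TOY_SI, 1) yields the extension cost and
toy_conversion_cost (TOY_QI, TOY_SI, 1) the truncation cost, which
mirrors how stmt_cost indexes convert_cost with the lhs mode first and
the rhs1 mode second.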