From patchwork Mon Apr 13 13:49:09 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 460784
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 2D22F1402F6
 for ; Mon, 13 Apr 2015 23:49:36 +1000 (AEST)
Authentication-Results: ozlabs.org;
 dkim=pass reason="1024-bit key; unprotected key"
 header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=wgW6aGGI;
 dkim-adsp=none (unprotected policy); dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender
 :message-id:date:from:mime-version:to:cc:subject:content-type;
 q=dns; s=default; b=yTVZgfuS4gUFYDnwmTGgv+lqpfYXh/YnaDs67foM5AI
 jsSA458mMu9gt6h+whm1wUKTTOm6GU0aJp3XkuRKJhtPXB25PYBK/dwfhonGkdYk
 yKgYGO5iVPmabhN4/KYd32vYfzGJe8JcCT7fHB8xGFBU4rLoWKqd5N/BNGUPcc7o
 =
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender
 :message-id:date:from:mime-version:to:cc:subject:content-type;
 s=default; bh=mi5icrbyBqYTwLZzkTLfHlOg1ok=; b=wgW6aGGIP6Njz8FsS
 cwj8EeNfPeTLPV9aeHaSh54NkiFU2L7ghTi0hhxxyQsU7pANOaMdUPIIPCuFxCHo
 0xSS3s4Xcl+hEIJI96rOKfLiX5Yor9M4/ITFPCNoLOf+JmJIKHGUgewJ5e5K4N4/
 i/CrtziIy3tsYx1cz9ltdsgN/I=
Received: (qmail 8182 invoked by alias); 13 Apr 2015 13:49:26 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 8156 invoked by uid 89); 13 Apr 2015 13:49:26 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2
X-HELO: eu-smtp-delivery-143.mimecast.com
Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 13 Apr 2015 13:49:23 +0000
Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140])
 by uk-mta-13.uk.mimecast.lan; Mon, 13 Apr 2015 14:49:09 +0100
Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa1.Emea.Arm.com
 with Microsoft SMTPSVC(6.0.3790.3959); Mon, 13 Apr 2015 14:49:09 +0100
Message-ID: <552BC955.3000501@arm.com>
Date: Mon, 13 Apr 2015 14:49:09 +0100
From: Kyrill Tkachov
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: GCC Patches
CC: Ramana Radhakrishnan, Richard Earnshaw
Subject: [PATCH][ARM] Make issue rate part of per-core tuning structs
X-MC-Unique: m-wWwn-lQ-uG5-UXoAkFaQ-1
X-IsSubscribed: yes

Hi all,

This is an update to https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02706.html,
rebased on top of the new cores that went in since that time.
It's just a refactoring.

Bootstrapped and tested on arm-linux.

Ok for trunk (to commit after GCC 5 release)?

Thanks,
Kyrill

2015-04-13  Kyrylo Tkachov

* config/arm/arm-protos.h (struct tune_params): Add issue_rate field.
* config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune,
arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune,
arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune,
arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune,
arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune,
arm_cortex_m7_tune, arm_v6m_tune, arm_fa726te_tune,
arm_cortex_a5_tune, arm_xgene1_tune): Specify issue_rate value.
(arm_issue_rate): Look up issue rate from tuning structs.
Remove large switch statement.
(arm_marvell_pj4_tune): New struct.
* config/arm/arm-cores.def (marvell-pj4): Use arm_marvell_pj4_tune struct.

commit ff6b444a330ab084834a6baa3f1ee67a029c5a7c
Author: Kyrylo Tkachov
Date:   Fri Jan 16 16:51:25 2015 +0000

    [ARM] Refactor issue_rate

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 7ade8a1..103c314 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -158,7 +158,7 @@ ARM_CORE("cortex-r7", cortexr7, cortexr7, 7R, FL_LDSCHED | FL_ARM_DIV, cortex
 ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED | FL_NO_VOLATILE_CE, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m)
-ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e)
+ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, marvell_pj4)

 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 16eb854..e2a0ccd 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -303,6 +303,8 @@ struct tune_params
   unsigned int fuseable_ops;
   /* Depth of scheduling queue to check for L2 autoprefetcher. */
   enum arm_sched_autopref sched_autopref;
+  /* Issue rate of the processor. */
+  unsigned int issue_rate;
 };

 extern const struct tune_params *current_tune;
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0466399..1f4a9f0 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1699,7 +1699,8 @@ const struct tune_params arm_slowmul_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
 };

 const struct tune_params arm_fastmul_tune =
@@ -1720,7 +1721,8 @@ const struct tune_params arm_fastmul_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
 };

 /* StrongARM has early execution of branches, so a sequence that is worth
@@ -1744,7 +1746,8 @@ const struct tune_params arm_strongarm_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
 };

 const struct tune_params arm_xscale_tune =
@@ -1765,7 +1768,8 @@ const struct tune_params arm_xscale_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
 };

 const struct tune_params arm_9e_tune =
@@ -1786,7 +1790,30 @@ const struct tune_params arm_9e_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
+};
+
+const struct tune_params arm_marvell_pj4_tune =
+{
+  arm_9e_rtx_costs,
+  NULL,
+  NULL,                           /* Sched adj cost. */
+  1,                              /* Constant limit. */
+  5,                              /* Max cond insns. */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true,                           /* Prefer constant pool. */
+  arm_default_branch_cost,
+  false,                          /* Prefer LDRD/STRD. */
+  {true, true},                   /* Prefer non short circuit. */
+  &arm_default_vec_cost,          /* Vectorizer costs. */
+  false,                          /* Prefer Neon for 64-bits bitops. */
+  false, false,                   /* Prefer 32-bit encodings. */
+  false,                          /* Prefer Neon for stringops. */
+  8,                              /* Maximum insns to inline memset. */
+  ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 const struct tune_params arm_v6t2_tune =
@@ -1807,9 +1834,11 @@ const struct tune_params arm_v6t2_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
 };

+
 /* Generic Cortex tuning. Use more specific tunings if appropriate. */
 const struct tune_params arm_cortex_tune =
 {
@@ -1829,7 +1858,8 @@ const struct tune_params arm_cortex_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 const struct tune_params arm_cortex_a8_tune =
@@ -1850,7 +1880,8 @@ const struct tune_params arm_cortex_a8_tune =
   true,                           /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 const struct tune_params arm_cortex_a7_tune =
@@ -1871,7 +1902,8 @@ const struct tune_params arm_cortex_a7_tune =
   true,                           /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 const struct tune_params arm_cortex_a15_tune =
@@ -1892,7 +1924,8 @@ const struct tune_params arm_cortex_a15_tune =
   true,                           /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_FULL         /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_FULL,        /* Sched L2 autopref. */
+  3                               /* Issue rate. */
 };

 const struct tune_params arm_cortex_a53_tune =
@@ -1913,7 +1946,8 @@ const struct tune_params arm_cortex_a53_tune =
   true,                           /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_MOVW_MOVT,             /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 const struct tune_params arm_cortex_a57_tune =
@@ -1934,7 +1968,8 @@ const struct tune_params arm_cortex_a57_tune =
   true,                           /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_MOVW_MOVT,             /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_FULL         /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_FULL,        /* Sched L2 autopref. */
+  3                               /* Issue rate. */
 };

 const struct tune_params arm_xgene1_tune =
@@ -1955,7 +1990,8 @@ const struct tune_params arm_xgene1_tune =
   false,                          /* Prefer Neon for stringops. */
   32,                             /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  4                               /* Issue rate. */
 };

 /* Branches can be dual-issued on Cortex-A5, so conditional execution is
@@ -1979,7 +2015,8 @@ const struct tune_params arm_cortex_a5_tune =
   true,                           /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 const struct tune_params arm_cortex_a9_tune =
@@ -2000,7 +2037,8 @@ const struct tune_params arm_cortex_a9_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 const struct tune_params arm_cortex_a12_tune =
@@ -2021,7 +2059,8 @@ const struct tune_params arm_cortex_a12_tune =
   true,                           /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_MOVW_MOVT,             /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 /* armv7m tuning. On Cortex-M4 cores for example, MOVW/MOVT take a single
@@ -2049,7 +2088,8 @@ const struct tune_params arm_v7m_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
 };

 /* Cortex-M7 tuning. */
@@ -2072,7 +2112,8 @@ const struct tune_params arm_cortex_m7_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };

 /* The arm_v6m_tune is duplicated from arm_cortex_tune, rather than
@@ -2095,7 +2136,8 @@ const struct tune_params arm_v6m_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  1                               /* Issue rate. */
 };

 const struct tune_params arm_fa726te_tune =
@@ -2116,7 +2158,8 @@ const struct tune_params arm_fa726te_tune =
   false,                          /* Prefer Neon for stringops. */
   8,                              /* Maximum insns to inline memset. */
   ARM_FUSE_NOTHING,               /* Fuseable pairs of instructions. */
-  ARM_SCHED_AUTOPREF_OFF          /* Sched L2 autopref. */
+  ARM_SCHED_AUTOPREF_OFF,         /* Sched L2 autopref. */
+  2                               /* Issue rate. */
 };


@@ -27191,40 +27234,12 @@ thumb2_output_casesi (rtx *operands)
     }
 }

-/* Most ARM cores are single issue, but some newer ones can dual issue.
-   The scheduler descriptions rely on this being correct. */
+/* Implement TARGET_SCHED_ISSUE_RATE. Look up the issue rate in the
+   per-core tuning structs. */
 static int
 arm_issue_rate (void)
 {
-  switch (arm_tune)
-    {
-    case xgene1:
-      return 4;
-
-    case cortexa15:
-    case cortexa57:
-    case exynosm1:
-      return 3;
-
-    case cortexm7:
-    case cortexr4:
-    case cortexr4f:
-    case cortexr5:
-    case genericv7a:
-    case cortexa5:
-    case cortexa7:
-    case cortexa8:
-    case cortexa9:
-    case cortexa12:
-    case cortexa17:
-    case cortexa53:
-    case fa726te:
-    case marvell_pj4:
-      return 2;
-
-    default:
-      return 1;
-    }
+  return current_tune->issue_rate;
 }

 /* Return how many instructions should scheduler lookahead to choose the