From patchwork Thu Jun 24 21:18:28 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Fang, Changpeng"
X-Patchwork-Id: 56848
From: "Fang, Changpeng"
To: Mark Mitchell, Christian Borntraeger
CC: "gcc-patches@gcc.gnu.org", "H.J. Lu", "rguenther@suse.de",
 "sebpop@gmail.com", Zdenek Dvorak, Maxim Kuvyrkov
Date: Thu, 24 Jun 2010 16:18:28 -0500
Subject: RE: [PATCH] Enabling Software Prefetching by Default at -O3
References: <201006192104.54441.borntraeger@de.ibm.com>,
 <4C1D2304.5080007@codesourcery.com>
In-Reply-To: <4C1D2304.5080007@codesourcery.com>

Hi,

Attached is the version of the patch that turns prefetching on at -O3 for
AMD CPUs only. As discussed elsewhere in this thread, we use a tri-state
for -fprefetch-loop-arrays. If this flag is not explicitly set, we turn it
on at -O3 in gcc/config/i386/i386.c (override_options).

Is this OK to commit now?

Thanks,

Changpeng

From 7f48bc625b0e451dd8c05a3a3cc20f68dcaa695c Mon Sep 17 00:00:00 2001
From: Changpeng Fang
Date: Wed, 23 Jun 2010 17:05:59 -0700
Subject: [PATCH 3/3] Enable prefetching at -O3 for AMD cpus

	* gcc/common.opt (fprefetch-loop-arrays): Re-define
	-fprefetch-loop-arrays as a tri-state option with the
	initial value of -1.
	* gcc/tree-ssa-loop.c (gate_tree_ssa_loop_prefetch): Invoke
	prefetch pass only when flag_prefetch_loop_arrays > 0.
	* gcc/toplev.c (process_options): Note that, with tri-states,
	flag_prefetch_loop_arrays>0 means prefetching is enabled.
	* gcc/config/i386/i386.c (override_options): Enable prefetching
	at -O3 for a set of CPUs that sw prefetching is helpful.
	(software_prefetching_beneficial_p): New. Return TRUE if
	software prefetching is beneficial for the given CPU.
---
 gcc/common.opt         |    2 +-
 gcc/config/i386/i386.c |   27 +++++++++++++++++++++++++++
 gcc/toplev.c           |    6 +++---
 gcc/tree-ssa-loop.c    |    2 +-
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 4904481..74fbd1d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -937,7 +937,7 @@ Common Report Var(flag_predictive_commoning) Optimization
 Run predictive commoning optimization.
 
 fprefetch-loop-arrays
-Common Report Var(flag_prefetch_loop_arrays) Optimization
+Common Report Var(flag_prefetch_loop_arrays) Init(-1) Optimization
 Generate prefetch instructions, if available, for arrays in loops
 
 fprofile
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2a46f89..605e57b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2691,6 +2691,26 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune,
   return ret;
 }
 
+/* Return TRUE if software prefetching is beneficial for the
+   given CPU. */
+
+static bool
+software_prefetching_beneficial_p (void)
+{
+  switch (ix86_tune)
+    {
+    case PROCESSOR_GEODE:
+    case PROCESSOR_K6:
+    case PROCESSOR_ATHLON:
+    case PROCESSOR_K8:
+    case PROCESSOR_AMDFAM10:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
 /* Function that is callable from the debugger to print the current
    options. */
 void
@@ -3531,6 +3551,13 @@ override_options (bool main_args_p)
   if (!PARAM_SET_P (PARAM_L2_CACHE_SIZE))
     set_param_value ("l2-cache-size", ix86_cost->l2_cache_size);
 
+  /* Enable sw prefetching at -O3 for CPUS that prefetching is helpful. */
+  if (flag_prefetch_loop_arrays < 0
+      && HAVE_prefetch
+      && optimize >= 3
+      && software_prefetching_beneficial_p())
+    flag_prefetch_loop_arrays = 1;
+
   /* If using typedef char *va_list, signal that __builtin_va_start (&ap, 0) can
      be optimized to ap = __builtin_next_arg (0). */
   if (!TARGET_64BIT)
diff --git a/gcc/toplev.c b/gcc/toplev.c
index ff4c850..369820b 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -2078,13 +2078,13 @@ process_options (void)
     }
 
 #ifndef HAVE_prefetch
-  if (flag_prefetch_loop_arrays)
+  if (flag_prefetch_loop_arrays > 0)
     {
       warning (0, "-fprefetch-loop-arrays not supported for this target");
       flag_prefetch_loop_arrays = 0;
     }
 #else
-  if (flag_prefetch_loop_arrays && !HAVE_prefetch)
+  if (flag_prefetch_loop_arrays > 0 && !HAVE_prefetch)
     {
       warning (0, "-fprefetch-loop-arrays not supported for this target (try -march switches)");
       flag_prefetch_loop_arrays = 0;
@@ -2093,7 +2093,7 @@ process_options (void)
 
   /* This combination of options isn't handled for i386 targets and doesn't
      make much sense anyway, so don't allow it. */
-  if (flag_prefetch_loop_arrays && optimize_size)
+  if (flag_prefetch_loop_arrays > 0 && optimize_size)
     {
       warning (0, "-fprefetch-loop-arrays is not supported with -Os");
       flag_prefetch_loop_arrays = 0;
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 344cfa8..c9c5bbd 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -600,7 +600,7 @@ tree_ssa_loop_prefetch (void)
 static bool
 gate_tree_ssa_loop_prefetch (void)
 {
-  return flag_prefetch_loop_arrays != 0;
+  return flag_prefetch_loop_arrays > 0;
 }
 
 struct gimple_opt_pass pass_loop_prefetch =
-- 
1.6.3.3
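
For readers following the tri-state discussion, here is a minimal standalone sketch of the logic the patch relies on. It is not GCC code: the names tri_state, resolve_prefetch_flag and prefetch_pass_enabled are hypothetical, and the booleans stand in for HAVE_prefetch and software_prefetching_beneficial_p. It models only the override_options step and the pass gate, not the warnings in toplev.c.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's internal state; none of these names
   exist in GCC.  -1 means "the user did not pass the option".  */
enum tri_state { TRI_UNSET = -1, TRI_OFF = 0, TRI_ON = 1 };

/* Resolve the tri-state.  An explicit user choice (0 or 1) is returned
   unchanged; an unset flag (-1) is turned on only when the target has a
   prefetch instruction, the optimization level is at least 3, and the
   tuned-for CPU is assumed to benefit.  */
static int
resolve_prefetch_flag (int flag, bool have_prefetch, int optimize_level,
                       bool cpu_benefits)
{
  if (flag < 0 && have_prefetch && optimize_level >= 3 && cpu_benefits)
    return TRI_ON;
  return flag;
}

/* Gate for the pass.  Only a strictly positive value enables it, so a
   flag that is still -1 after resolution behaves like "off".  */
static bool
prefetch_pass_enabled (int flag)
{
  return flag > 0;
}
```

The point of testing "> 0" instead of "!= 0" is visible here: with an initial value of -1, a "!= 0" gate would run the pass whenever the option is merely left unset.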