From patchwork Sat Jun 25 00:12:30 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Fang, Changpeng" <Changpeng.Fang@amd.com>
X-Patchwork-Id: 101928
From: "Fang, Changpeng" <Changpeng.Fang@amd.com>
To: Jan Hubicka <hubicka@ucw.cz>
CC: Uros Bizjak <ubizjak@gmail.com>,
	"gcc-patches@gcc.gnu.org"	<gcc-patches@gcc.gnu.org>,
	"rguenther@suse.de" <rguenther@suse.de>
Date: Fri, 24 Jun 2011 19:12:30 -0500
Subject: RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer
Message-ID: <D4C76825A6780047854A11E93CDE84D005980DC716@SAUSEXMBP01.amd.com>
References: <D4C76825A6780047854A11E93CDE84D005980DC70C@SAUSEXMBP01.amd.com>,
	<20110623232029.GC3783@kam.mff.cuni.cz>
In-Reply-To: <20110623232029.GC3783@kam.mff.cuni.cz>
X-OriginatorOrg: amd.com
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Hi,

I have no strong preference about how the tune feature is coded, but I agree
with you that it is better to keep similar things together. I have modified the
code following your suggestion.

Is it OK to commit this modified patch?

Thanks,

Changpeng

From a325395439a314f87b3c79a5b9ce79a6a976a710 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@huainan.(none)>
Date: Wed, 22 Jun 2011 15:03:05 -0700
Subject: [PATCH] Auto-vectorizer generates 128-bit AVX insns by default for bdver1

	* config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.

	* config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
	(TARGET_AVX128_OPTIMAL): New definition.

	* config/i386/i386.c (initial_ix86_tune_features): Initialize
	X86_TUNE_AVX128_OPTIMAL entry.
	(ix86_option_override_internal): Enable the generation
	of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
	(ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
	(ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
---
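Note (illustration only, not part of the patch): the override rule described
in the ChangeLog above can be sketched as a standalone C fragment. This is a
minimal sketch under stated assumptions: SKETCH_MASK_PREFER_AVX128 and the
sketch_* functions are hypothetical stand-ins, not GCC identifiers, and the
real mask value is generated from i386.opt. The point it shows is that the
bdver1 tuning default only takes effect when the user has not mentioned
-mprefer-avx128 or -mno-prefer-avx128 explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MASK_PREFER_AVX128 (the real value is
   generated from i386.opt).  */
#define SKETCH_MASK_PREFER_AVX128 (1u << 0)

/* Return the effective flags.  FLAGS_EXPLICIT has the bit set when the
   user passed the option in either its positive or negative form; in
   that case the tuning default must not override the user's choice.  */
static unsigned int
sketch_apply_tune_default (unsigned int flags, unsigned int flags_explicit,
			   bool avx128_optimal)
{
  if (avx128_optimal && !(flags_explicit & SKETCH_MASK_PREFER_AVX128))
    flags |= SKETCH_MASK_PREFER_AVX128;
  return flags;
}

/* Usage example covering the three interesting cases.  */
void
sketch_self_check (void)
{
  /* Tuning prefers 128-bit and the user said nothing: bit gets set.  */
  assert (sketch_apply_tune_default (0u, 0u, true)
	  == SKETCH_MASK_PREFER_AVX128);
  /* User explicitly disabled the option: bit stays clear.  */
  assert (sketch_apply_tune_default (0u, SKETCH_MASK_PREFER_AVX128, true)
	  == 0u);
  /* User explicitly enabled it on a CPU without the tuning: bit stays set.  */
  assert (sketch_apply_tune_default (SKETCH_MASK_PREFER_AVX128,
				     SKETCH_MASK_PREFER_AVX128, false)
	  == SKETCH_MASK_PREFER_AVX128);
}
```

This is why the option is turned into a Mask option: the explicit bit is
tracked separately from the value bit, which a plain Var() flag with Init(0)
does not give us.
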
 gcc/config/i386/i386.c   |   16 ++++++++++++----
 gcc/config/i386/i386.h   |    4 +++-
 gcc/config/i386/i386.opt |    2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)
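
Note (illustration only, not part of the patch): the effect on the two
vectorizer hooks touched below can be sketched in isolation. The enum and the
sketch_* functions are hypothetical stand-ins for the machine modes and for
ix86_preferred_simd_mode / ix86_autovectorize_vector_sizes; only the SFmode
case is modelled. My understanding of the sizes hook is that a return value of
0 means no additional vector sizes are tried, so please read that part as an
assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for V4SFmode (128-bit) and V8SFmode (256-bit).  */
enum sketch_mode { SKETCH_V4SF, SKETCH_V8SF };

/* Preferred vector mode for single-precision floats: 256-bit only when
   AVX is available and 128-bit AVX is not preferred.  */
static enum sketch_mode
sketch_preferred_sf_mode (bool have_avx, bool prefer_avx128)
{
  return (have_avx && !prefer_avx128) ? SKETCH_V8SF : SKETCH_V4SF;
}

/* Bitmask of vector sizes in bytes for the vectorizer to iterate over:
   32 | 16 when 256-bit AVX is in use, otherwise 0.  */
static unsigned int
sketch_vector_sizes (bool have_avx, bool prefer_avx128)
{
  return (have_avx && !prefer_avx128) ? (32u | 16u) : 0u;
}
```

With the mask set by default for bdver1, both functions take the second
branch, so the auto-vectorizer stays with 128-bit vectors unless the user
passes -mno-prefer-avx128.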

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 014401b..b3434dd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2089,7 +2089,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
   /* X86_SOFTARE_PREFETCHING_BENEFICIAL: Enable software prefetching
      at -O3.  For the moment, the prefetching seems badly tuned for Intel
      chips.  */
-  m_K6_GEODE | m_AMD_MULTIPLE
+  m_K6_GEODE | m_AMD_MULTIPLE,
+
+  /* X86_TUNE_AVX128_OPTIMAL: Enable 128-bit AVX instruction generation for
+     the auto-vectorizer.  */
+  m_BDVER1
 };
 
 /* Feature tests against the various architecture variations.  */
@@ -2623,6 +2627,7 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune,
     { "-mvzeroupper",			MASK_VZEROUPPER },
     { "-mavx256-split-unaligned-load",	MASK_AVX256_SPLIT_UNALIGNED_LOAD},
     { "-mavx256-split-unaligned-store",	MASK_AVX256_SPLIT_UNALIGNED_STORE},
+    { "-mprefer-avx128",		MASK_PREFER_AVX128},
   };
 
   const char *opts[ARRAY_SIZE (isa_opts) + ARRAY_SIZE (flag_opts) + 6][2];
@@ -3672,6 +3677,9 @@ ix86_option_override_internal (bool main_args_p)
 	  if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
 	      && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
 	    target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
+	  /* Enable 128-bit AVX instruction generation for the auto-vectorizer.  */
+	  if (TARGET_AVX128_OPTIMAL && !(target_flags_explicit & MASK_PREFER_AVX128))
+	    target_flags |= MASK_PREFER_AVX128;
 	}
     }
   else 
@@ -34614,7 +34622,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
       return V2DImode;
 
     case SFmode:
-      if (TARGET_AVX && !flag_prefer_avx128)
+      if (TARGET_AVX && !TARGET_PREFER_AVX128)
 	return V8SFmode;
       else
 	return V4SFmode;
@@ -34622,7 +34630,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
     case DFmode:
       if (!TARGET_VECTORIZE_DOUBLE)
 	return word_mode;
-      else if (TARGET_AVX && !flag_prefer_avx128)
+      else if (TARGET_AVX && !TARGET_PREFER_AVX128)
 	return V4DFmode;
       else if (TARGET_SSE2)
 	return V2DFmode;
@@ -34639,7 +34647,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
 static unsigned int
 ix86_autovectorize_vector_sizes (void)
 {
-  return (TARGET_AVX && !flag_prefer_avx128) ? 32 | 16 : 0;
+  return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
 }
 
 /* Initialize the GCC target structure.  */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8badcbb..d9317ed 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -312,6 +312,7 @@ enum ix86_tune_indices {
   X86_TUNE_OPT_AGU,
   X86_TUNE_VECTORIZE_DOUBLE,
   X86_TUNE_SOFTWARE_PREFETCHING_BENEFICIAL,
+  X86_TUNE_AVX128_OPTIMAL,
 
   X86_TUNE_LAST
 };
@@ -410,7 +411,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 	ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE]
 #define TARGET_SOFTWARE_PREFETCHING_BENEFICIAL \
 	ix86_tune_features[X86_TUNE_SOFTWARE_PREFETCHING_BENEFICIAL]
-
+#define TARGET_AVX128_OPTIMAL \
+	ix86_tune_features[X86_TUNE_AVX128_OPTIMAL]
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices {
   X86_ARCH_CMOVE,		/* || TARGET_SSE */
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 21e0def..9886b7b 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -388,7 +388,7 @@ Do dispatch scheduling if processor is bdver1 and Haifa scheduling
 is selected.
 
 mprefer-avx128
-Target Report Var(flag_prefer_avx128) Init(0)
+Target Report Mask(PREFER_AVX128) SAVE
 Use 128-bit AVX instructions instead of 256-bit AVX instructions in the auto-vectorizer.
 
 ;; ISA support
-- 
1.7.0.4