From patchwork Tue Apr 15 16:08:50 2014
X-Patchwork-Submitter: Evgeny Stupachenko
X-Patchwork-Id: 339314
Date: Tue, 15 Apr 2014 20:08:50 +0400
Subject: [PATCH 2/3, x86] X86 Silvermont vector cost model tune
From: Evgeny Stupachenko
To: Uros Bizjak
Cc: "H.J. Lu", GCC Patches, uros@gcc.gnu.org

2nd part:

2014-04-15  Evgeny Stupachenko

        * config/i386/x86-tune.def (TARGET_SLOW_PSHUFB): Target for slow byte
        shuffle on some x86 architectures.
        * config/i386/i386.h (TARGET_SLOW_PSHUFB): Ditto.
        * config/i386/i386.c (expand_vec_perm_even_odd_1): Avoid byte shuffles
        in architectures where they are slow (TARGET_SLOW_PSHUFB).

/*****************************************************************************/

On Thu, Mar 6, 2014 at 12:58 AM, Evgeny Stupachenko wrote:
> slm_cost/intel_cost and TARGET_SLOW_PSHUFB are just preparation to a
> next vectorization patch.
> Changes in ix86_add_stmt_cost gives real performance to Silvermont.
> Let's move all to stage1.
>
> On Wed, Mar 5, 2014 at 9:29 PM, Uros Bizjak wrote:
>> On Wed, Mar 5, 2014 at 5:46 PM, H.J. Lu wrote:
>>> On Wed, Mar 5, 2014 at 7:58 AM, Evgeny Stupachenko wrote:
>>>> Hi,
>>>>
>>>> The patch is for x86 Silvermont.
>>>> It improves x86 Silvermont vector cost model.
>>>> It gives +20% on facerec spec on Silvermont.
>>>> It passes make check and bootstrap on x86.
>>>>
>>>> Is this patch ok for stage1?
>>>>
>>>> ChangeLog:
>>>>
>>>> 2014-03-05 Evgeny Stupachenko
>>>>
>>>> * config/i386/x86-tune.def (TARGET_SLOW_PSHUFB): Target for slow byte
>>>> shuffle on some x86 architectures.
>>>> * config/i386/i386.h (TARGET_SLOW_PSHUFB): Ditto.
>>>> * config/i386/i386.c (processor_costs): Fixing vec_to_scalar_cost for
>>>> Silvermont according latency table.
>>>> (expand_vec_perm_even_odd_1): Avoid byte shuffles in architectures
>>>> where they are slow (TARGET_SLOW_PSHUFB).
>>>> (x86_add_stmt_cost): Fixing vector cost model for Silvermont.
>>>>
>>>> Thanks,
>>>> Evgeny
>>>
>>> There are 3 separate changes in this patch:
>>>
>>> 1. Update slm_cost, which doesn't have a ChangeLog entry.
>>> 2. Add TARGET_SLOW_PSHUFB.
>>> 3. Update ix86_add_stmt_cost.
>>>
>>> I suggest you break it into 3 independent patches.
>>
>> I think that slm_cost/intel_cost and TARGET_SLOW_PSHUFB changes can
>> still go into mainline at this stage since they are trivial tuning
>> changes that should not destabilize the compiler.
>>
>> The ix86_add_stmt_cost should wait for stage 1.
>>
>> Uros.
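
For readers following the thread, the gating that this part of the patch
introduces can be sketched as a small standalone C model. This is only an
illustration under stated assumptions: the names cpu, tune, strategy,
tune_masks, tune_features, set_tune_cpu and pick_even_odd_strategy are
hypothetical stand-ins, not GCC's actual definitions, and the m_INTEL bit
from the real tuning mask is left out for brevity. The idea is that each
tuning flag carries a bitmask of processors, the flag is resolved once for
the selected tuning target, and the even/odd permutation expander takes the
pshufb path only when SSSE3 is available and the flag is clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a per-CPU tuning flag.
   None of these names are GCC's real identifiers.  */
enum cpu { CPU_GENERIC, CPU_BONNELL, CPU_SILVERMONT, CPU_LAST };
enum tune { TUNE_SLOW_PSHUFB, TUNE_LAST };
enum strategy { USE_PSHUFB, USE_FALLBACK };

#define M_BONNELL    (1u << CPU_BONNELL)
#define M_SILVERMONT (1u << CPU_SILVERMONT)

/* One bitmask per tuning flag: the set of CPUs on which it applies.  */
static const unsigned tune_masks[TUNE_LAST] = {
  [TUNE_SLOW_PSHUFB] = M_BONNELL | M_SILVERMONT,
};

/* Flags resolved for the currently selected tuning target.  */
static unsigned char tune_features[TUNE_LAST];

/* Resolve every tuning flag for the given CPU.  */
static void
set_tune_cpu (enum cpu cpu)
{
  assert (cpu < CPU_LAST);
  for (int i = 0; i < TUNE_LAST; i++)
    tune_features[i] = (tune_masks[i] & (1u << cpu)) != 0;
}

/* Mirror of the patched condition: use the byte shuffle only when
   SSSE3 is available and pshufb is not flagged as slow.  */
static enum strategy
pick_even_odd_strategy (bool have_ssse3)
{
  if (have_ssse3 && !tune_features[TUNE_SLOW_PSHUFB])
    return USE_PSHUFB;
  return USE_FALLBACK;
}
```

In this model, calling set_tune_cpu (CPU_SILVERMONT) makes
pick_even_odd_strategy (true) choose the fallback sequence, while
CPU_GENERIC with SSSE3 still chooses pshufb.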
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index bf4d576..0ae3cda 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -44026,7 +44026,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
       gcc_unreachable ();
 
     case V8HImode:
-      if (TARGET_SSSE3)
+      if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
         return expand_vec_perm_pshufb2 (d);
       else
         {
@@ -44049,7 +44049,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
       break;
 
     case V16QImode:
-      if (TARGET_SSSE3)
+      if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
         return expand_vec_perm_pshufb2 (d);
       else
         {
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 51659de..1a884d8 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -425,6 +425,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
         ix86_tune_features[X86_TUNE_USE_VECTOR_FP_CONVERTS]
 #define TARGET_USE_VECTOR_CONVERTS \
         ix86_tune_features[X86_TUNE_USE_VECTOR_CONVERTS]
+#define TARGET_SLOW_PSHUFB \
+        ix86_tune_features[X86_TUNE_SLOW_PSHUFB]
 #define TARGET_FUSE_CMP_AND_BRANCH_32 \
         ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_32]
 #define TARGET_FUSE_CMP_AND_BRANCH_64 \
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 8399102..9b0ff36 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -386,6 +386,10 @@ DEF_TUNE (X86_TUNE_USE_VECTOR_FP_CONVERTS, "use_vector_fp_converts",
    from integer to FP. */
 DEF_TUNE (X86_TUNE_USE_VECTOR_CONVERTS, "use_vector_converts", m_AMDFAM10)
 
+/* X86_TUNE_SLOW_PSHUFB: Indicates tunings with slow pshufb instruction.  */
+DEF_TUNE (X86_TUNE_SLOW_PSHUFB, "slow_pshufb",
+          m_BONNELL | m_SILVERMONT | m_INTEL)
+
 /*****************************************************************************/
 /* AVX instruction selection tuning (some of SSE flags affects AVX, too)     */