PATCH: Disable double precision vectorizer for Atom

On Mon, Sep 13, 2010 at 11:51 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Sep 13, 2010 at 3:47 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>
>> Double precision vector instructions are much slower than double
>> precision scalar instructions on Atom.  This patch disables double
>> precision vectorizer for Atom.  It improves SPEC CPU 2K FP geomean by
>> 7% on 64bit and 3% on 32bit.  OK for trunk?
>>
>> Thanks.
>>
>>
>> H.J.
>> ----
>> gcc/
>>
>> 2010-09-13  H.J. Lu  <hongjiu.lu@intel.com>
>>
>>        * config/i386/i386.c (initial_ix86_tune_features): Add
>>        X86_TUNE_VECTORIZE_DOUBLE.
>>        * config/i386/i386.h (ix86_tune_indices): Likewise.
>>        (TARGET_VECTORIZE_DOUBLE): New.
>>        (UNITS_PER_SIMD_WORD): Return UNITS_PER_WORD for DFmode if
>>        TARGET_VECTORIZE_DOUBLE is false.
>>
>> gcc/testsuite/
>>
>> 2010-09-13  H.J. Lu  <hongjiu.lu@intel.com>
>>
>>        * gcc.target/i386/fma4-256-vector.c: Add -mtune=generic.
>>        * gcc.target/i386/fma4-vector.c: Likewise.
>>        * gcc.target/i386/vectorize2.c: Likewise.
>>        * gcc.target/i386/vectorize4.c: Likewise.
>>        * gcc.target/i386/vectorize5.c: Likewise.
>>        * gcc.target/i386/vectorize6.c: Likewise.
>>        * gcc.target/i386/vectorize8.c: Likewise.
>>
>>        * gcc.target/i386/vect-double-1.c: New.
>>        * gcc.target/i386/vect-double-1a.c: Likewise.
>>        * gcc.target/i386/vect-double-2.c: Likewise.
>>        * gcc.target/i386/vect-double-2a.c: Likewise.
>>
>>        * lib/target-supports.exp (check_effective_target_vect_double):
>>        Set et_vect_double_saved to 0 when tuning for Atom.
>
> OK, but see comments below ...
>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 1d79a18..7d165bb 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -1627,6 +1627,10 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
>>   /* X86_TUNE_OPT_AGU: Optimize for Address Generation Unit. This flag
>>      will impact LEA instruction selection. */
>>   m_ATOM,
>> +
>> +  /* X86_TUNE_VECTORIZE_DOUBLE: Enable double precision vector
>> +     instructions.  */
>> +  ~m_ATOM,
>>  };
>>
>>  /* Feature tests against the various architecture variations.  */
>> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
>> index 91238d5..2acf60a 100644
>> --- a/gcc/config/i386/i386.h
>> +++ b/gcc/config/i386/i386.h
>> @@ -312,6 +312,7 @@ enum ix86_tune_indices {
>>   X86_TUNE_USE_VECTOR_CONVERTS,
>>   X86_TUNE_FUSE_CMP_AND_BRANCH,
>>   X86_TUNE_OPT_AGU,
>> +  X86_TUNE_VECTORIZE_DOUBLE,
>>
>>   X86_TUNE_LAST
>>  };
>> @@ -404,6 +405,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
>>  #define TARGET_FUSE_CMP_AND_BRANCH \
>>        ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH]
>>  #define TARGET_OPT_AGU ix86_tune_features[X86_TUNE_OPT_AGU]
>> +#define TARGET_VECTORIZE_DOUBLE \
>> +       ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE]
>>
>>  /* Feature tests against the various architecture variations.  */
>>  enum ix86_arch_indices {
>> @@ -1037,8 +1040,10 @@ enum target_cpu_default
>>    different sizes for integer and floating point vectors.  We limit
>>    vector size to 16byte.  */
>>  #define UNITS_PER_SIMD_WORD(MODE)                                      \
>> -  (TARGET_AVX ? (((MODE) == DFmode || (MODE) == SFmode) ? 16 : 16)     \
>> -             : (TARGET_SSE ? 16 : UNITS_PER_WORD))
>> +  ((MODE) == DFmode && !TARGET_VECTORIZE_DOUBLE                                \
>> +   ? UNITS_PER_WORD                                                    \
>> +   : (TARGET_AVX ? (((MODE) == DFmode || (MODE) == SFmode) ? 16 : 16)  \
>> +                : (TARGET_SSE ? 16 : UNITS_PER_WORD)))
>
> Please rewrite this macro as a helper function using a switch
> statement. I must admit I'm not able to parse this mess.
>

This is the patch I checked in.

Thanks.
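
For readers following the review comment above: a minimal sketch of what a
switch-based helper equivalent to the quoted UNITS_PER_SIMD_WORD macro might
look like. This is an illustration only and is not the code that was checked
in. The names sketch_mode, sketch_target and sketch_units_per_simd_word are
hypothetical stand-ins for GCC's machine modes and the TARGET_SSE,
TARGET_AVX, TARGET_VECTORIZE_DOUBLE and UNITS_PER_WORD macros, so that the
logic can be compiled and checked outside of GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's machine modes (not the real enum).  */
enum sketch_mode { SKETCH_SFmode, SKETCH_DFmode, SKETCH_SImode };

/* Hypothetical model of the target state the quoted macro reads.  */
struct sketch_target {
  bool sse;              /* models TARGET_SSE */
  bool avx;              /* models TARGET_AVX */
  bool vectorize_double; /* models TARGET_VECTORIZE_DOUBLE */
  int units_per_word;    /* models UNITS_PER_WORD: 4 (32-bit) or 8 (64-bit) */
};

/* Same decision as the quoted macro, written with a switch on the mode.  */
static int
sketch_units_per_simd_word (const struct sketch_target *t,
                            enum sketch_mode mode)
{
  assert (t->units_per_word == 4 || t->units_per_word == 8);

  switch (mode)
    {
    case SKETCH_DFmode:
      /* Tuning says double precision vectors are slow: report the
         scalar word size for this mode.  */
      if (!t->vectorize_double)
        return t->units_per_word;
      break;

    default:
      break;
    }

  /* The quoted macro yields 16 for AVX in both arms of its inner
     conditional, and 16 for SSE, so the two cases collapse.  */
  return (t->avx || t->sse) ? 16 : t->units_per_word;
}
```

With vectorize_double set to false (the Atom tuning in the quoted patch),
this sketch returns units_per_word for SKETCH_DFmode and still returns 16
for SKETCH_SFmode when sse is true, matching the behaviour the macro
describes.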
