From patchwork Tue Feb 25 09:13:23 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ilya Tocar
X-Patchwork-Id: 323861
Date: Tue, 25 Feb 2014 13:13:23 +0400
From: Ilya Tocar
To: Uros Bizjak
Cc: GCC Patches, Jakub Jelinek, Kirill Yukhin
Subject: Re: [PATCH][i386][AVX512] Match latest spec. Add CPUID prefetchwt1.
Message-ID: <20140225091323.GA31394@msticlxl7.ims.intel.com>
References: <20140220153922.GB1312@msticlxl7.ims.intel.com> <20140221152524.GA12550@msticlxl7.ims.intel.com>

On 21 Feb 18:35, Uros Bizjak wrote:
> On Fri, Feb 21, 2014 at 4:25 PM, Ilya Tocar wrote:
> >> > Latest version of AVX512 spec
> >> > http://download-software.intel.com/sites/default/files/managed/50/1a/319433-018.pdf
> >> > Has a few changes.
> >> >
> >> > 1)PREFETCHWT1 instruction now has separate CPUID bit PREFETCHWT1.
> >> > We can either support new CPUID or disable PREFETCHWT1 from generating,
> >> > without removing code, and enable it in 4.9.1/latest version.
> >> > I am not sure that adding new -m flag and related stuff this late
> >> > is a good idea. Should still add it?
> >>
> >> Please submit the patch anyway.
> >> We can relax release constraints on
> >> non-algorithmic patch a bit, weighing in benefits of having gcc
> >> release that fully conforms to some published specification.
> >>
> > Patch below adds -mprefetchwt1 flag, corresponding TARGET_PREFETCHWT1,
> > and uses them for prefetchwt1 instruction. Bootstraps/passes testing.
> > Ok for trunk?
> >
> > * gcc.target/i386/avx-1.c: Update __builtin_prefetch.
>
> Please also add new switch to gcc.target/i386/sse-{12,13,14}.c and
> g++.dg/other/i386-{2,3} and new options to
> gcc.target/i386/sse-{22,23}.c. Please re-test with new additions and
> repost the patch.
>

I've added the new switch to those tests.
However, when I add prefetchwt1 to #pragma GCC target ("sse"),
the sse-22a.c test fails with:

pmmintrin.h: In function ‘_mm_loaddup_pd’:
emmintrin.h:119:1: error: inlining failed in call to always_inline ‘_mm_load1_pd’: target specific option mismatch

I've checked and this isn't a problem with prefetchwt1.
I get the same error when I add any other option (e.g. sha) to
#pragma GCC target ("sse"). So I haven't added anything there.
As that was the only failure, I'm reposting this patch.

ChangeLog for GCC:

	* common/config/i386/i386-common.c (OPTION_MASK_ISA_PREFETCHWT1_SET),
	(OPTION_MASK_ISA_PREFETCHWT1_UNSET): New.
	(ix86_handle_option): Handle OPT_mprefetchwt1.
	* config/i386/cpuid.h (bit_PREFETCHWT1): New.
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect
	PREFETCHWT1 CPUID.
	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
	OPTION_MASK_ISA_PREFETCHWT1.
	* config/i386/i386.c (ix86_target_string): Handle mprefetchwt1.
	(PTA_PREFETCHWT1): New.
	(ix86_option_override_internal): Handle PTA_PREFETCHWT1.
	(ix86_valid_target_attribute_inner_p): Handle OPT_mprefetchwt1.
	* config/i386/i386.h (TARGET_PREFETCHWT1), (TARGET_PREFETCHWT1_P): New.
	* config/i386/i386.md (prefetch): Check TARGET_PREFETCHWT1.
	(*prefetch_avx512pf__): Change into ...
	(*prefetch_prefetchwt1_): This.
	* config/i386/i386.opt (mprefetchwt1): New.
	* config/i386/xmmintrin.h (_mm_hint): Add _MM_HINT_ET1.
	(_mm_prefetch): Handle intent to write.
	* doc/invoke.texi (mprefetchwt1), (mno-prefetchwt1): Document.

ChangeLog for tests:

	* gcc.target/i386/avx-1.c: Update __builtin_prefetch.
	* gcc.target/i386/prefetchwt1-1.c: New.
	* g++.dg/other/i386-2.C: Add new option.
	* g++.dg/other/i386-3.C: Ditto.
	* gcc.target/i386/sse-12.c: Ditto.
	* gcc.target/i386/sse-13.c: Update __builtin_prefetch, add new option.
	* gcc.target/i386/sse-22.c: Add new option.
	* gcc.target/i386/sse-23.c: Update __builtin_prefetch, add new option.

---
 gcc/common/config/i386/i386-common.c          | 15 +++++++++++++++
 gcc/config/i386/cpuid.h                       |  4 ++++
 gcc/config/i386/driver-i386.c                 |  7 +++++--
 gcc/config/i386/i386-c.c                      |  2 ++
 gcc/config/i386/i386.c                        |  6 ++++++
 gcc/config/i386/i386.h                        |  2 ++
 gcc/config/i386/i386.md                       | 13 ++++++-------
 gcc/config/i386/i386.opt                      |  4 ++++
 gcc/config/i386/xmmintrin.h                   |  6 ++++--
 gcc/doc/invoke.texi                           |  4 +++-
 gcc/testsuite/g++.dg/other/i386-2.C           |  2 +-
 gcc/testsuite/g++.dg/other/i386-3.C           |  2 +-
 gcc/testsuite/gcc.target/i386/avx-1.c         |  2 +-
 gcc/testsuite/gcc.target/i386/prefetchwt1-1.c | 14 ++++++++++++++
 gcc/testsuite/gcc.target/i386/sse-12.c        |  2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c        |  4 ++--
 gcc/testsuite/gcc.target/i386/sse-14.c        |  2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c        |  2 +-
 gcc/testsuite/gcc.target/i386/sse-23.c        |  4 ++--
 19 files changed, 75 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/prefetchwt1-1.c

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index b7f9ff6..a6ab555 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
 #define OPTION_MASK_ISA_ADX_SET OPTION_MASK_ISA_ADX
+#define OPTION_MASK_ISA_PREFETCHWT1_SET OPTION_MASK_ISA_PREFETCHWT1
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
    as -msse4.2. */
@@ -154,6 +155,7 @@ along with GCC; see the file COPYING3. If not see
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
 #define OPTION_MASK_ISA_ADX_UNSET OPTION_MASK_ISA_ADX
+#define OPTION_MASK_ISA_PREFETCHWT1_UNSET OPTION_MASK_ISA_PREFETCHWT1
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -mno-sse4 should the same
    as -mno-sse4.1. */
@@ -757,6 +759,19 @@ ix86_handle_option (struct gcc_options *opts,
         }
       return true;
 
+    case OPT_mprefetchwt1:
+      if (value)
+        {
+          opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PREFETCHWT1_SET;
+          opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_PREFETCHWT1_SET;
+        }
+      else
+        {
+          opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_PREFETCHWT1_UNSET;
+          opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_PREFETCHWT1_UNSET;
+        }
+      return true;
+
 /* Comes from final.c -- no real reason to change it. */
 #define MAX_CODE_ALIGN 16
 
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index c7a53dd..8c323ae 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -65,6 +65,7 @@
 #define bit_3DNOW (1 << 31)
 
 /* Extended Features (%eax == 7) */
+/* %ebx */
 #define bit_FSGSBASE (1 << 0)
 #define bit_BMI (1 << 3)
 #define bit_HLE (1 << 4)
@@ -79,6 +80,9 @@
 #define bit_AVX512CD (1 << 28)
 #define bit_SHA (1 << 29)
 
+/* %ecx */
+#define bit_PREFETCHWT1 (1 << 0)
+
 /* Extended State Enumeration Sub-leaf (%eax == 13, %ecx == 1) */
 #define bit_XSAVEOPT (1 << 0)
 
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 940ae20..1f5a11c 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -409,7 +409,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
   unsigned int has_osxsave = 0, has_fxsr = 0, has_xsave = 0, has_xsaveopt = 0;
   unsigned int has_avx512er = 0, has_avx512pf = 0, has_avx512cd = 0;
-  unsigned int has_avx512f = 0, has_sha = 0;
+  unsigned int has_avx512f = 0, has_sha = 0, has_prefetchwt1 = 0;
 
   bool arch;
 
@@ -486,6 +486,8 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       has_avx512pf = ebx & bit_AVX512PF;
       has_avx512cd = ebx & bit_AVX512CD;
       has_sha = ebx & bit_SHA;
+
+      has_prefetchwt1 = ecx & bit_PREFETCHWT1;
     }
 
   if (max_level >= 13)
@@ -883,6 +885,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       const char *avx512er = has_avx512er ? " -mavx512er" : " -mno-avx512er";
       const char *avx512cd = has_avx512cd ? " -mavx512cd" : " -mno-avx512cd";
       const char *avx512pf = has_avx512pf ? " -mavx512pf" : " -mno-avx512pf";
+      const char *prefetchwt1 = has_prefetchwt1 ? " -mprefetchwt1" : " -mno-prefetchwt1";
 
       options = concat (options, mmx, mmx3dnow, sse, sse2, sse3, ssse3,
                         sse4a, cx16, sahf, movbe, aes, sha, pclmul,
@@ -890,7 +893,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
                         tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm,
                         hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, adx,
                         fxsr, xsave, xsaveopt, avx512f, avx512er,
-                        avx512cd, avx512pf, NULL);
+                        avx512cd, avx512pf, prefetchwt1, NULL);
     }
 
 done:
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 0c50720..c9977bf 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -387,6 +387,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__XSAVE__");
   if (isa_flag & OPTION_MASK_ISA_XSAVEOPT)
     def_or_undef (parse_in, "__XSAVEOPT__");
+  if (isa_flag & OPTION_MASK_ISA_PREFETCHWT1)
+    def_or_undef (parse_in, "__PREFETCHWT1__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE))
     def_or_undef (parse_in, "__SSE_MATH__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2))
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 4fead55..00773d8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2622,6 +2622,7 @@ ix86_target_string (HOST_WIDE_INT isa, int flags, const char *arch,
     { "-mrtm", OPTION_MASK_ISA_RTM },
     { "-mxsave", OPTION_MASK_ISA_XSAVE },
     { "-mxsaveopt", OPTION_MASK_ISA_XSAVEOPT },
+    { "-mprefetchwt1", OPTION_MASK_ISA_PREFETCHWT1 },
   };
 
   /* Flag options. */
@@ -3112,6 +3113,7 @@ ix86_option_override_internal (bool main_args_p,
 #define PTA_AVX512PF (HOST_WIDE_INT_1 << 42)
 #define PTA_AVX512CD (HOST_WIDE_INT_1 << 43)
 #define PTA_SHA (HOST_WIDE_INT_1 << 45)
+#define PTA_PREFETCHWT1 (HOST_WIDE_INT_1 << 46)
 
 #define PTA_CORE2 \
   (PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 \
@@ -3666,6 +3668,9 @@ ix86_option_override_internal (bool main_args_p,
         if (processor_alias_table[i].flags & PTA_AVX512CD
             && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_AVX512CD))
           opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512CD;
+        if (processor_alias_table[i].flags & PTA_PREFETCHWT1
+            && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_PREFETCHWT1))
+          opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PREFETCHWT1;
         if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
           x86_prefetch_sse = true;
 
@@ -4547,6 +4552,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
     IX86_ATTR_ISA ("fxsr", OPT_mfxsr),
     IX86_ATTR_ISA ("xsave", OPT_mxsave),
     IX86_ATTR_ISA ("xsaveopt", OPT_mxsaveopt),
+    IX86_ATTR_ISA ("prefetchwt1", OPT_mprefetchwt1),
 
     /* enum options */
     IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_),
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 1b6460a..c80878b 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -130,6 +130,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define TARGET_XSAVE_P(x) TARGET_ISA_XSAVE_P(x)
 #define TARGET_XSAVEOPT TARGET_ISA_XSAVEOPT
 #define TARGET_XSAVEOPT_P(x) TARGET_ISA_XSAVEOPT_P(x)
+#define TARGET_PREFETCHWT1 TARGET_ISA_PREFETCHWT1
+#define TARGET_PREFETCHWT1_P(x) TARGET_ISA_PREFETCHWT1_P(x)
 
 #define TARGET_LP64 TARGET_ABI_64
 #define TARGET_LP64_P(x) TARGET_ABI_64_P(x)
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 232a334..b9f1320 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -17856,7 +17856,7 @@
   [(prefetch (match_operand 0 "address_operand")
              (match_operand:SI 1 "const_int_operand")
              (match_operand:SI 2 "const_int_operand"))]
-  "TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_AVX512PF"
+  "TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
 {
   bool write = INTVAL (operands[1]) != 0;
   int locality = INTVAL (operands[2]);
@@ -17867,8 +17867,8 @@
      supported by SSE counterpart or the SSE prefetch is not available
      (K6 machines). Otherwise use SSE prefetch as it allows specifying
      of locality. */
-  if (TARGET_AVX512PF && write)
-    operands[2] = const1_rtx;
+  if (TARGET_PREFETCHWT1 && write)
+    operands[2] = const2_rtx;
   else if (TARGET_PRFCHW && (write || !TARGET_PREFETCH_SSE))
     operands[2] = GEN_INT (3);
   else
@@ -17912,14 +17912,13 @@
         (symbol_ref "memory_address_length (operands[0], false)"))
    (set_attr "memory" "none")])
 
-(define_insn "*prefetch_avx512pf_"
+(define_insn "*prefetch_prefetchwt1_"
   [(prefetch (match_operand:P 0 "address_operand" "p")
              (const_int 1)
-             (const_int 1))]
-  "TARGET_AVX512PF"
+             (const_int 2))]
+  "TARGET_PREFETCHWT1"
   "prefetchwt1\t%a0";
   [(set_attr "type" "sse")
-   (set_attr "prefix" "evex")
    (set (attr "length_address")
         (symbol_ref "memory_address_length (operands[0], false)"))
    (set_attr "memory" "none")])
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index d5dd0fa..0f463a2 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -757,6 +757,10 @@ mf16c
 Target Report Mask(ISA_F16C) Var(ix86_isa_flags) Save
 Support F16C built-in functions and code generation
 
+mprefetchwt1
+Target Report Mask(ISA_PREFETCHWT1) Var(ix86_isa_flags) Save
+Support PREFETCHWT1 built-in functions and code generation
+
 mfentry
 Target Report Var(flag_fentry) Init(-1)
 Emit profiling counter call at function entry before prologue.
diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h
index 0511dcf..9cefa2c 100644
--- a/gcc/config/i386/xmmintrin.h
+++ b/gcc/config/i386/xmmintrin.h
@@ -53,6 +53,8 @@ typedef float __v4sf __attribute__ ((__vector_size__ (16)));
 /* Constants for use with _mm_prefetch. */
 enum _mm_hint
 {
+  /* _MM_HINT_ET is _MM_HINT_T with set 3rd bit. */
+  _MM_HINT_ET1 = 6,
   _MM_HINT_T0 = 3,
   _MM_HINT_T1 = 2,
   _MM_HINT_T2 = 1,
@@ -1191,11 +1193,11 @@ _m_psadbw (__m64 __A, __m64 __B)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_prefetch (const void *__P, enum _mm_hint __I)
 {
-  __builtin_prefetch (__P, 0, __I);
+  __builtin_prefetch (__P, (__I & 0x4) >> 2, __I & 0x3);
 }
 #else
 #define _mm_prefetch(P, I) \
-  __builtin_prefetch ((P), 0, (I))
+  __builtin_prefetch ((P), ((I & 0x4) >> 2), (I & 0x3))
 #endif
 
 /* Stores the data in A to the address P without polluting the caches. */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 959664c..7bcaa83 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -667,7 +667,7 @@ Objective-C and Objective-C++ Dialects}.
 -mvzeroupper -mprefer-avx128 @gol
 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -mavx2 -mavx512f -mavx512pf -mavx512er -mavx512cd -msha @gol
--maes -mpclmul -mfsgsbase -mrdrnd -mf16c -mfma @gol
+-maes -mpclmul -mfsgsbase -mrdrnd -mf16c -mfma -mprefetchwt1 @gol
 -msse4a -m3dnow -mpopcnt -mabm -mbmi -mtbm -mfma4 -mxop -mlzcnt @gol
 -mbmi2 -mfxsr -mxsave -mxsaveopt -mrtm -mlwp -mthreads @gol
 -mno-align-stringops -minline-all-stringops @gol
@@ -15264,6 +15264,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @itemx -mno-f16c
 @itemx -mfma
 @itemx -mno-fma
+@itemx -mprefetchwt1
+@itemx -mno-prefetchwt1
 @itemx -msse4a
 @itemx -mno-sse4a
 @itemx -mfma4
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index a7ef6dc..2f8650a6 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index 4c443b1..df0bd27 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index f7e412d..12cfc68 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -152,7 +152,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, A, _MM_HINT_NTA)
+#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/i386/prefetchwt1-1.c b/gcc/testsuite/gcc.target/i386/prefetchwt1-1.c
new file mode 100644
index 0000000..1b88516
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/prefetchwt1-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-mprefetchwt1 -O2" } */
+/* { dg-final { scan-assembler "\[ \\t\]+prefetchwt1\[ \\t\]+" } } */
+
+#include
+
+void *p;
+
+void extern
+prefetchw__test (void)
+{
+  _mm_prefetch (p, _MM_HINT_ET1);
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index cf91a9d..51de357 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors. */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1" } */
 
 #include
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index c0068a8..171e242 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1" } */
 
 #include
 
@@ -138,7 +138,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, A, _MM_HINT_NTA)
+#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index dbe05cb..10334a6 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha" } */
+/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1" } */
 
 #include
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index 85a03da..51f04c2 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -99,7 +99,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1")
 #endif
 
 /* Following intrinsics require immediate arguments. They
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index c02b151..5b24618 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -90,7 +90,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, A, _MM_HINT_NTA)
+#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
@@ -385,7 +385,7 @@
 /* shaintrin.h */
 #define __builtin_ia32_sha1rnds4(A, B, C) __builtin_ia32_sha1rnds4(A, B, 1)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1")
 #include
 #include
 #include
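
For readers following the xmmintrin.h and i386.md hunks, here is a rough standalone sketch of how a hint value such as _MM_HINT_ET1 (6) splits into the write flag and locality passed to __builtin_prefetch, and how the expander then appears to pick prefetchwt1. The names hint_rw, hint_locality and select_insn are made up for illustration and are not part of the patch. The sketch leaves out the TARGET_PRFCHW branch, and its fall-through to the plain SSE mnemonics is an assumption, since the hunk shows only the start of the final else.

```c
/* Hint encodings from the patched xmmintrin.h.  _MM_HINT_NTA (0) is not
   visible in the hunk; it is the value already present in the header.  */
enum hint { HINT_NTA = 0, HINT_T2 = 1, HINT_T1 = 2, HINT_T0 = 3, HINT_ET1 = 6 };

/* Same arithmetic as the patched _mm_prefetch: bit 2 is the
   intent-to-write flag, bits 0-1 are the locality.  */
int
hint_rw (int hint)
{
  return (hint & 0x4) >> 2;
}

int
hint_locality (int hint)
{
  return hint & 0x3;
}

/* Hypothetical model of the choice made by the "prefetch" expander.
   Only the first branch is taken from the patch; the TARGET_PRFCHW
   branch is omitted here.  */
const char *
select_insn (int rw, int locality, int has_prefetchwt1)
{
  static const char *const sse_names[] =
    { "prefetchnta", "prefetcht2", "prefetcht1", "prefetcht0" };

  /* Patch: with TARGET_PREFETCHWT1, a write prefetch is forced to
     locality 2, which is what the prefetchwt1 insn pattern matches.  */
  if (has_prefetchwt1 && rw)
    return "prefetchwt1";

  /* Assumption: otherwise fall back to the SSE read prefetch that
     corresponds to the requested locality.  */
  return sse_names[locality & 0x3];
}
```

With these definitions, select_insn (hint_rw (HINT_ET1), hint_locality (HINT_ET1), 1) yields "prefetchwt1", which is the instruction the new prefetchwt1-1.c test scans for.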
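
The option bookkeeping in i386-common.c and i386.c can be summarised with a small model: an explicit -mprefetchwt1 or -mno-prefetchwt1 records the bit in the "explicit" mask, and a PTA_PREFETCHWT1 default from -march is applied only when the user did not say anything. This is a simplified sketch, not the GCC code. The struct, the function names and the bit position of MASK_PREFETCHWT1 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for OPTION_MASK_ISA_PREFETCHWT1; the real bit
   position is generated from i386.opt and is not shown in the patch.  */
#define MASK_PREFETCHWT1 (UINT64_C (1) << 0)

/* Simplified stand-in for the two gcc_options fields the patch touches.  */
struct isa_opts
{
  uint64_t isa_flags;          /* currently enabled ISA extensions */
  uint64_t isa_flags_explicit; /* extensions the user set or cleared */
};

/* Mirrors the OPT_mprefetchwt1 case in ix86_handle_option: set or clear
   the flag, and in both cases remember that the choice was explicit.  */
void
handle_mprefetchwt1 (struct isa_opts *o, int value)
{
  if (value)
    o->isa_flags |= MASK_PREFETCHWT1;
  else
    o->isa_flags &= ~MASK_PREFETCHWT1;
  o->isa_flags_explicit |= MASK_PREFETCHWT1;
}

/* Mirrors the PTA_PREFETCHWT1 check in ix86_option_override_internal:
   an architecture default never overrides an explicit user choice.  */
void
apply_arch_default (struct isa_opts *o, int arch_has_prefetchwt1)
{
  if (arch_has_prefetchwt1 && !(o->isa_flags_explicit & MASK_PREFETCHWT1))
    o->isa_flags |= MASK_PREFETCHWT1;
}
```

In this model, calling handle_mprefetchwt1 (&o, 0) followed by apply_arch_default (&o, 1) leaves the flag cleared, which is the behaviour the isa_flags_explicit test in the patch is there to guarantee.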