From patchwork Sun Jun 23 06:16:01 2013
X-Patchwork-Submitter: Sriraman Tallam
X-Patchwork-Id: 253457
Date: Sat, 22 Jun 2013 23:16:01 -0700
Subject: Re: GCC does not support *mmintrin.h with function specific opts
From: Sriraman Tallam
To: Uros Bizjak
Cc: Jakub Jelinek, GCC Patches, "H.J. Lu", "Joseph S. Myers", Diego Novillo, David Li
References: <20130514083913.GJ1377@tucnak.redhat.com> <20130514100419.GM1377@tucnak.redhat.com> <20130517054931.GM1377@tucnak.redhat.com>

On Sun, May 19, 2013 at 3:39 AM, Uros Bizjak wrote:
> On Sat, May 18, 2013 at 6:00 AM, Sriraman Tallam wrote:
>
>>>> > I don't really understand why you made the change to x86intrin.h instead of
>>>> > making it inside each *mmintrin.h header. The code would be the same size,
>>>> > it would let us include smmintrin.h directly if we wanted to, and
>>>> > x86intrin.h would also automatically work.
>>>>
>>>> Right, I should have done that instead!
>>>
>>> Yeah, definitely. For the standalone headers, which have currently
>>> ____ guards inside of it, please replace it by the larger snippets
>>> involving #pragma, and in the x86intrin.h/immintrin.h headers include those
>>> unconditionally, instead of just if ____ is defined.
>>> For the non-standalone headers (newer ones like avxintrin.h), replace
>>> the #ifdef ____ in immintrin.h/x86intrin.h with larger snippets.
>>
>>
>> * I did mostly as suggested except that even for avx and avx2 headers
>> I did not see the harm in doing it in the header itself. AVX header
>> did not have the "#ifndef _AVXINTRIN_H_INCLUDED" which I added before
>> doing this. I have added test cases to show it is doing the right
>> thing for avx.
>> * I also found that when the caller to these intrinsics do not have
>> the right target attribute, an error is raised in -O2 mode but not in
>> -O0 mode. I have fixed this with a patch to ipa-inline.c, please see
>> if this is alright. Test case intrinsics_5.c checks if an error is
>> raised.
>> * LZCNT needed to be handled which is done now.
>
> * common/config/i386/i386-common.c: Handle LZCNT.
>
> The above part is OK for mainline and release patches. Please commit
> LZCNT part as a separate patch to mainline.
>
> Also, please get middle-end part reviewed and committed before we
> proceed with target-dependent part.

This is done now. I have committed the LZCNT patch separately to
mainline. I have committed the middle-end part approved by Honza.
Finally, I have also committed the target-dependent part to mainline
after adding the tests Jakub noted and fixing a broken test
(gcc.target/i386/avx-1.c); the fix was trivial. I have attached the
patch for the target-dependent part.

Thanks
Sri

>
> Thanks,
> Uros.

* config/i386/i386-c.c (ix86_pragma_target_parse): Restore target when current target options do not apply.
* config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function.
* config/i386/i386.c (ix86_reset_previous_fndecl): Ditto.
* config/i386/bmiintrin.h: Pass appropriate target attributes to header.
* config/i386/mmintrin.h: Ditto.
* config/i386/nmmintrin.h: Ditto.
* config/i386/avx2intrin.h: Ditto.
* config/i386/fxsrintrin.h: Ditto.
* config/i386/tbmintrin.h: Ditto.
* config/i386/xsaveintrin.h: Ditto.
* config/i386/f16cintrin.h: Ditto.
* config/i386/xtestintrin.h: Ditto.
* config/i386/xsaveoptintrin.h: Ditto.
* config/i386/bmi2intrin.h: Ditto.
* config/i386/lzcntintrin.h: Ditto.
* config/i386/smmintrin.h: Ditto.
* config/i386/wmmintrin.h: Ditto.
* config/i386/x86intrin.h: Remove all header include guards.
* config/i386/prfchwintrin.h: Ditto.
* config/i386/pmmintrin.h: Ditto.
* config/i386/tmmintrin.h: Ditto.
* config/i386/xmmintrin.h: Ditto.
* config/i386/popcntintrin.h: Ditto.
* config/i386/rdseedintrin.h: Ditto.
* config/i386/ammintrin.h: Ditto.
* config/i386/emmintrin.h: Ditto.
* config/i386/immintrin.h: Remove all header include guards.
* config/i386/fma4intrin.h: Ditto.
* config/i386/lwpintrin.h: Ditto.
* config/i386/xopintrin.h: Ditto.
* config/i386/ia32intrin.h: Ditto.
* config/i386/avxintrin.h: Ditto.
* config/i386/rtmintrin.h: Ditto.
* config/i386/fmaintrin.h: Ditto.
* config/i386/mm3dnow.h: Ditto.
* testsuite/gcc.target/i386/intrinsics_1.c: New test.
* testsuite/gcc.target/i386/intrinsics_2.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_3.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_4.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_5.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_6.c: Ditto.
* testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h.

Index: config/i386/xmmintrin.h
===================================================================
--- config/i386/xmmintrin.h (revision 200347)
+++ config/i386/xmmintrin.h (working copy)
@@ -27,16 +27,18 @@
 #ifndef _XMMINTRIN_H_INCLUDED
 #define _XMMINTRIN_H_INCLUDED
 
-#ifndef __SSE__
-# error "SSE instruction set not enabled"
-#else
-
 /* We need type definitions from the MMX header file.  */
 #include
 
 /* Get _mm_malloc () and _mm_free ().  */
 #include
 
+#ifndef __SSE__
+#pragma GCC push_options
+#pragma GCC target("sse")
+#define __DISABLE_SSE__
+#endif /* __SSE__ */
+
 /* The Intel API is flexible enough that we must allow aliasing with other
    vector types, and their scalar components.  */
 typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
@@ -1242,9 +1244,11 @@ do { \
 } while (0)
 
 /* For backward source compatibility.  */
-#ifdef __SSE2__
 # include
-#endif
 
-#endif /* __SSE__ */
+#ifdef __DISABLE_SSE__
+#undef __DISABLE_SSE__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE__ */
+
 #endif /* _XMMINTRIN_H_INCLUDED */
Index: config/i386/mmintrin.h
===================================================================
--- config/i386/mmintrin.h (revision 200347)
+++ config/i386/mmintrin.h (working copy)
@@ -28,8 +28,11 @@
 #define _MMINTRIN_H_INCLUDED
 
 #ifndef __MMX__
-# error "MMX instruction set not enabled"
-#else
+#pragma GCC push_options
+#pragma GCC target("mmx")
+#define __DISABLE_MMX__
+#endif /* __MMX__ */
+
 /* The Intel API is flexible enough that we must allow aliasing with other
    vector types, and their scalar components.  */
 typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
@@ -303,13 +306,21 @@ _m_paddd (__m64 __m1, __m64 __m2)
 }
 
 /* Add the 64-bit values in M1 to the 64-bit values in M2.  */
-#ifdef __SSE2__
+#ifndef __SSE2__
+#pragma GCC push_options
+#pragma GCC target("sse2")
+#define __DISABLE_SSE2__
+#endif /* __SSE2__ */
+
 extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_add_si64 (__m64 __m1, __m64 __m2)
 {
   return (__m64) __builtin_ia32_paddq ((__v1di)__m1, (__v1di)__m2);
 }
-#endif
+#ifdef __DISABLE_SSE2__
+#undef __DISABLE_SSE2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE2__ */
 
 /* Add the 8-bit values in M1 to the 8-bit values in M2 using signed
    saturated arithmetic.  */
@@ -407,13 +418,21 @@ _m_psubd (__m64 __m1, __m64 __m2)
 }
 
 /* Add the 64-bit values in M1 to the 64-bit values in M2.  */
-#ifdef __SSE2__
+#ifndef __SSE2__
+#pragma GCC push_options
+#pragma GCC target("sse2")
+#define __DISABLE_SSE2__
+#endif /* __SSE2__ */
+
 extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_sub_si64 (__m64 __m1, __m64 __m2)
 {
   return (__m64) __builtin_ia32_psubq ((__v1di)__m1, (__v1di)__m2);
 }
-#endif
+#ifdef __DISABLE_SSE2__
+#undef __DISABLE_SSE2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE2__ */
 
 /* Subtract the 8-bit values in M2 from the 8-bit values in M1 using signed
    saturating arithmetic.  */
@@ -915,6 +934,9 @@ _mm_set1_pi8 (char __b)
 {
   return _mm_set_pi8 (__b, __b, __b, __b, __b, __b, __b, __b);
 }
+#ifdef __DISABLE_MMX__
+#undef __DISABLE_MMX__
+#pragma GCC pop_options
+#endif /* __DISABLE_MMX__ */
 
-#endif /* __MMX__ */
 #endif /* _MMINTRIN_H_INCLUDED */
Index: config/i386/prfchwintrin.h
===================================================================
--- config/i386/prfchwintrin.h (revision 200347)
+++ config/i386/prfchwintrin.h (working copy)
@@ -26,17 +26,24 @@
 #endif
 
-#if !defined (__PRFCHW__) && !defined (__3dNOW__)
-# error "PRFCHW instruction not enabled"
-#endif /* __PRFCHW__ or __3dNOW__*/
-
 #ifndef _PRFCHWINTRIN_H_INCLUDED
 #define _PRFCHWINTRIN_H_INCLUDED
 
+#ifndef __PRFCHW__
+#pragma GCC push_options
+#pragma GCC target("prfchw")
+#define __DISABLE_PRFCHW__
+#endif /* __PRFCHW__ */
+
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _m_prefetchw (void *__P)
 {
   __builtin_prefetch (__P, 1, 3 /* _MM_HINT_T0 */);
 }
 
+#ifdef __DISABLE_PRFCHW__
+#undef __DISABLE_PRFCHW__
+#pragma GCC pop_options
+#endif /* __DISABLE_PRFCHW__ */
+
 #endif /* _PRFCHWINTRIN_H_INCLUDED */
Index: config/i386/bmi2intrin.h
===================================================================
--- config/i386/bmi2intrin.h (revision 200347)
+++ config/i386/bmi2intrin.h (working copy)
@@ -25,13 +25,15 @@
 # error "Never use directly; include instead."
 #endif
 
+#ifndef _BMI2INTRIN_H_INCLUDED
+#define _BMI2INTRIN_H_INCLUDED
+
 #ifndef __BMI2__
-# error "BMI2 instruction set not enabled"
+#pragma GCC push_options
+#pragma GCC target("bmi2")
+#define __DISABLE_BMI2__
 #endif /* __BMI2__ */
 
-#ifndef _BMI2INTRIN_H_INCLUDED
-#define _BMI2INTRIN_H_INCLUDED
-
 extern __inline unsigned int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _bzhi_u32 (unsigned int __X, unsigned int __Y)
@@ -99,4 +101,9 @@ _mulx_u32 (unsigned int __X, unsigned int __Y, uns
 
 #endif /* !__x86_64__ */
 
+#ifdef __DISABLE_BMI2__
+#undef __DISABLE_BMI2__
+#pragma GCC pop_options
+#endif /* __DISABLE_BMI2__ */
+
 #endif /* _BMI2INTRIN_H_INCLUDED */
Index: config/i386/popcntintrin.h
===================================================================
--- config/i386/popcntintrin.h (revision 200347)
+++ config/i386/popcntintrin.h (working copy)
@@ -21,13 +21,15 @@
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see .  */
 
+#ifndef _POPCNTINTRIN_H_INCLUDED
+#define _POPCNTINTRIN_H_INCLUDED
+
 #ifndef __POPCNT__
-# error "POPCNT instruction set not enabled"
+#pragma GCC push_options
+#pragma GCC target("popcnt")
+#define __DISABLE_POPCNT__
 #endif /* __POPCNT__ */
 
-#ifndef _POPCNTINTRIN_H_INCLUDED
-#define _POPCNTINTRIN_H_INCLUDED
-
 /* Calculate a number of bits set to 1.  */
 extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_popcnt_u32 (unsigned int __X)
@@ -43,4 +45,9 @@ _mm_popcnt_u64 (unsigned long long __X)
 }
 #endif
 
+#ifdef __DISABLE_POPCNT__
+#undef __DISABLE_POPCNT__
+#pragma GCC pop_options
+#endif /* __DISABLE_POPCNT__ */
+
 #endif /* _POPCNTINTRIN_H_INCLUDED */
Index: config/i386/ammintrin.h
===================================================================
--- config/i386/ammintrin.h (revision 200347)
+++ config/i386/ammintrin.h (working copy)
@@ -27,13 +27,15 @@
 #ifndef _AMMINTRIN_H_INCLUDED
 #define _AMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4A__
-# error "SSE4A instruction set not enabled"
-#else
-
 /* We need definitions from the SSE3, SSE2 and SSE header files*/
 #include
 
+#ifndef __SSE4A__
+#pragma GCC push_options
+#pragma GCC target("sse4a")
+#define __DISABLE_SSE4A__
+#endif /* __SSE4A__ */
+
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_stream_sd (double * __P, __m128d __Y)
 {
@@ -83,6 +85,9 @@ _mm_inserti_si64(__m128i __X, __m128i __Y, unsigne
 				    (unsigned int)(I), (unsigned int)(L)))
 #endif
 
-#endif /* __SSE4A__ */
+#ifdef __DISABLE_SSE4A__
+#undef __DISABLE_SSE4A__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4A__ */
 
 #endif /* _AMMINTRIN_H_INCLUDED */
Index: config/i386/emmintrin.h
===================================================================
--- config/i386/emmintrin.h (revision 200347)
+++ config/i386/emmintrin.h (working copy)
@@ -27,13 +27,15 @@
 #ifndef _EMMINTRIN_H_INCLUDED
 #define _EMMINTRIN_H_INCLUDED
 
-#ifndef __SSE2__
-# error "SSE2 instruction set not enabled"
-#else
-
 /* We need definitions from the SSE header files*/
 #include
 
+#ifndef __SSE2__
+#pragma GCC push_options
+#pragma GCC target("sse2")
+#define __DISABLE_SSE2__
+#endif /* __SSE2__ */
+
 /* SSE2 */
 typedef double __v2df __attribute__ ((__vector_size__ (16)));
 typedef long long __v2di __attribute__ ((__vector_size__ (16)));
@@ -1515,6 +1517,9 @@ _mm_castsi128_pd(__m128i __A)
   return (__m128d) __A;
 }
 
-#endif /* __SSE2__ */
+#ifdef __DISABLE_SSE2__
+#undef __DISABLE_SSE2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE2__ */
 
 #endif /* _EMMINTRIN_H_INCLUDED */
Index: config/i386/immintrin.h
===================================================================
--- config/i386/immintrin.h (revision 200347)
+++ config/i386/immintrin.h (working copy)
@@ -24,71 +24,43 @@
 #ifndef _IMMINTRIN_H_INCLUDED
 #define _IMMINTRIN_H_INCLUDED
 
-#ifdef __MMX__
 #include
-#endif
 
-#ifdef __SSE__
 #include
-#endif
 
-#ifdef __SSE2__
 #include
-#endif
 
-#ifdef __SSE3__
 #include
-#endif
 
-#ifdef __SSSE3__
 #include
-#endif
 
-#if defined (__SSE4_2__) || defined (__SSE4_1__)
 #include
-#endif
 
-#if defined (__AES__) || defined (__PCLMUL__)
 #include
-#endif
 
-#ifdef __AVX__
 #include
-#endif
 
-#ifdef __AVX2__
 #include
-#endif
 
-#ifdef __LZCNT__
 #include
-#endif
 
-#ifdef __BMI__
 #include
-#endif
 
-#ifdef __BMI2__
 #include
-#endif
 
-#ifdef __FMA__
 #include
-#endif
 
-#ifdef __F16C__
 #include
-#endif
 
-#ifdef __RTM__
 #include
-#endif
 
-#ifdef __RTM__
 #include
-#endif
 
-#ifdef __RDRND__
+#ifndef __RDRND__
+#pragma GCC push_options
+#pragma GCC target("rdrnd")
+#define __DISABLE_RDRND__
+#endif /* __RDRND__ */
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _rdrand16_step (unsigned short *__P)
@@ -102,10 +74,18 @@ _rdrand32_step (unsigned int *__P)
 {
   return __builtin_ia32_rdrand32_step (__P);
 }
-#endif /* __RDRND__ */
+#ifdef __DISABLE_RDRND__
+#undef __DISABLE_RDRND__
+#pragma GCC pop_options
+#endif /* __DISABLE_RDRND__ */
 
 #ifdef __x86_64__
-#ifdef __FSGSBASE__
+
+#ifndef __FSGSBASE__
+#pragma GCC push_options
+#pragma GCC target("fsgsbase")
+#define __DISABLE_FSGSBASE__
+#endif /* __FSGSBASE__ */
 extern __inline unsigned int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _readfsbase_u32 (void)
@@ -161,16 +141,27 @@ _writegsbase_u64 (unsigned long long __B)
 {
   __builtin_ia32_wrgsbase64 (__B);
 }
-#endif /* __FSGSBASE__ */
+#ifdef __DISABLE_FSGSBASE__
+#undef __DISABLE_FSGSBASE__
+#pragma GCC pop_options
+#endif /* __DISABLE_FSGSBASE__ */
 
-#ifdef __RDRND__
+#ifndef __RDRND__
+#pragma GCC push_options
+#pragma GCC target("rdrnd")
+#define __DISABLE_RDRND__
+#endif /* __RDRND__ */
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _rdrand64_step (unsigned long long *__P)
 {
   return __builtin_ia32_rdrand64_step (__P);
 }
-#endif /* __RDRND__ */
+#ifdef __DISABLE_RDRND__
+#undef __DISABLE_RDRND__
+#pragma GCC pop_options
+#endif /* __DISABLE_RDRND__ */
+
 #endif /* __x86_64__ */
 
 #endif /* _IMMINTRIN_H_INCLUDED */
Index: config/i386/i386-protos.h
===================================================================
--- config/i386/i386-protos.h (revision 200347)
+++ config/i386/i386-protos.h (working copy)
@@ -40,6 +40,8 @@ extern void ix86_output_addr_diff_elt (FILE *, int
 extern enum calling_abi ix86_cfun_abi (void);
 extern enum calling_abi ix86_function_type_abi (const_tree);
 
+extern void ix86_reset_previous_fndecl (void);
+
 #ifdef RTX_CODE
 extern int standard_80387_constant_p (rtx);
 extern const char *standard_80387_constant_opcode (rtx);
Index: config/i386/lwpintrin.h
===================================================================
--- config/i386/lwpintrin.h (revision 200347)
+++ config/i386/lwpintrin.h (working copy)
@@ -29,8 +29,10 @@
 #define _LWPINTRIN_H_INCLUDED
 
 #ifndef __LWP__
-# error "LWP instruction set not enabled"
-#else
+#pragma GCC push_options
+#pragma GCC target("lwp")
+#define __DISABLE_LWP__
+#endif /* __LWP__ */
 
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __llwpcb (void *pcbAddress)
@@ -95,6 +97,9 @@ __lwpins64 (unsigned long long data2, unsigned int
 #endif
 #endif
 
-#endif /* __LWP__ */
+#ifdef __DISABLE_LWP__
+#undef __DISABLE_LWP__
+#pragma GCC pop_options
+#endif /* __DISABLE_LWP__ */
 
 #endif /* _LWPINTRIN_H_INCLUDED */
Index: config/i386/xopintrin.h
===================================================================
--- config/i386/xopintrin.h (revision 200347)
+++ config/i386/xopintrin.h (working copy)
@@ -28,12 +28,14 @@
 #ifndef _XOPMMINTRIN_H_INCLUDED
 #define _XOPMMINTRIN_H_INCLUDED
 
+#include
+
 #ifndef __XOP__
-# error "XOP instruction set not enabled"
-#else
+#pragma GCC push_options
+#pragma GCC target("xop")
+#define __DISABLE_XOP__
+#endif /* __XOP__ */
 
-#include
-
 /* Integer multiply/add intructions. */
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_maccs_epi16(__m128i __A, __m128i __B, __m128i __C)
@@ -830,6 +832,9 @@ _mm256_permute2_ps (__m256 __X, __m256 __Y, __m256
 					  (int)(I)))
 #endif /* __OPTIMIZE__ */
 
-#endif /* __XOP__ */
+#ifdef __DISABLE_XOP__
+#undef __DISABLE_XOP__
+#pragma GCC pop_options
+#endif /* __DISABLE_XOP__ */
 
 #endif /* _XOPMMINTRIN_H_INCLUDED */
Index: config/i386/rdseedintrin.h
===================================================================
--- config/i386/rdseedintrin.h (revision 200347)
+++ config/i386/rdseedintrin.h (working copy)
@@ -25,12 +25,15 @@
 # error "Never use directly; include instead."
 #endif
 
+#ifndef _RDSEEDINTRIN_H_INCLUDED
+#define _RDSEEDINTRIN_H_INCLUDED
+
 #ifndef __RDSEED__
-# error "RDSEED instruction not enabled"
+#pragma GCC push_options
+#pragma GCC target("rdseed")
+#define __DISABLE_RDSEED__
 #endif /* __RDSEED__ */
 
-#ifndef _RDSEEDINTRIN_H_INCLUDED
-#define _RDSEEDINTRIN_H_INCLUDED
 
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -55,4 +58,9 @@ _rdseed64_step (unsigned long long *p)
 }
 #endif
 
+#ifdef __DISABLE_RDSEED__
+#undef __DISABLE_RDSEED__
+#pragma GCC pop_options
+#endif /* __DISABLE_RDSEED__ */
+
 #endif /* _RDSEEDINTRIN_H_INCLUDED */
Index: config/i386/xsaveintrin.h
===================================================================
--- config/i386/xsaveintrin.h (revision 200347)
+++ config/i386/xsaveintrin.h (working copy)
@@ -28,6 +28,12 @@
 #ifndef _XSAVEINTRIN_H_INCLUDED
 #define _XSAVEINTRIN_H_INCLUDED
 
+#ifndef __XSAVE__
+#pragma GCC push_options
+#pragma GCC target("xsave")
+#define __DISABLE_XSAVE__
+#endif /* __XSAVE__ */
+
 extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xsave (void *__P, long long __M)
@@ -58,4 +64,9 @@ _xrstor64 (void *__P, long long __M)
 }
 #endif
 
+#ifdef __DISABLE_XSAVE__
+#undef __DISABLE_XSAVE__
+#pragma GCC pop_options
+#endif /* __DISABLE_XSAVE__ */
+
 #endif /* _XSAVEINTRIN_H_INCLUDED */
Index: config/i386/avxintrin.h
===================================================================
--- config/i386/avxintrin.h (revision 200347)
+++ config/i386/avxintrin.h (working copy)
@@ -28,6 +28,15 @@
 # error "Never use directly; include instead."
 #endif
 
+#ifndef _AVXINTRIN_H_INCLUDED
+#define _AVXINTRIN_H_INCLUDED
+
+#ifndef __AVX__
+#pragma GCC push_options
+#pragma GCC target("avx")
+#define __DISABLE_AVX__
+#endif /* __AVX__ */
+
 /* Internal data types for implementing the intrinsics.  */
 typedef double __v4df __attribute__ ((__vector_size__ (32)));
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
@@ -1424,3 +1433,10 @@ _mm256_castsi128_si256 (__m128i __A)
 {
   return (__m256i) __builtin_ia32_si256_si ((__v4si)__A);
 }
+
+#ifdef __DISABLE_AVX__
+#undef __DISABLE_AVX__
+#pragma GCC pop_options
+#endif /* __DISABLE_AVX__ */
+
+#endif /* _AVXINTRIN_H_INCLUDED */
Index: config/i386/rtmintrin.h
===================================================================
--- config/i386/rtmintrin.h (revision 200347)
+++ config/i386/rtmintrin.h (working copy)
@@ -25,13 +25,15 @@
 # error "Never use directly; include instead."
 #endif
 
+#ifndef _RTMINTRIN_H_INCLUDED
+#define _RTMINTRIN_H_INCLUDED
+
 #ifndef __RTM__
-# error "RTM instruction set not enabled"
+#pragma GCC push_options
+#pragma GCC target("rtm")
+#define __DISABLE_RTM__
 #endif /* __RTM__ */
 
-#ifndef _RTMINTRIN_H_INCLUDED
-#define _RTMINTRIN_H_INCLUDED
-
 #define _XBEGIN_STARTED (~0u)
 #define _XABORT_EXPLICIT (1 << 0)
 #define _XABORT_RETRY (1 << 1)
@@ -74,4 +76,9 @@ _xabort (const unsigned int imm)
 #define _xabort(N) __builtin_ia32_xabort (N)
 #endif /* __OPTIMIZE__ */
 
+#ifdef __DISABLE_RTM__
+#undef __DISABLE_RTM__
+#pragma GCC pop_options
+#endif /* __DISABLE_RTM__ */
+
 #endif /* _RTMINTRIN_H_INCLUDED */
Index: config/i386/fmaintrin.h
===================================================================
--- config/i386/fmaintrin.h (revision 200347)
+++ config/i386/fmaintrin.h (working copy)
@@ -29,8 +29,10 @@
 #define _FMAINTRIN_H_INCLUDED
 
 #ifndef __FMA__
-# error "FMA instruction set not enabled"
-#else
+#pragma GCC push_options
+#pragma GCC target("fma")
+#define __DISABLE_FMA__
+#endif /* __FMA__ */
 
 extern __inline __m128d
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -292,6 +294,9 @@ _mm256_fmsubadd_ps (__m256 __A, __m256 __B, __m256
                                                        -(__v8sf)__C);
 }
 
-#endif
+#ifdef __DISABLE_FMA__
+#undef __DISABLE_FMA__
+#pragma GCC pop_options
+#endif /* __DISABLE_FMA__ */
 
 #endif
Index: config/i386/bmiintrin.h
===================================================================
--- config/i386/bmiintrin.h (revision 200347)
+++ config/i386/bmiintrin.h (working copy)
@@ -25,13 +25,15 @@
 # error "Never use directly; include instead."
 #endif
 
+#ifndef _BMIINTRIN_H_INCLUDED
+#define _BMIINTRIN_H_INCLUDED
+
 #ifndef __BMI__
-# error "BMI instruction set not enabled"
+#pragma GCC push_options
+#pragma GCC target("bmi")
+#define __DISABLE_BMI__
 #endif /* __BMI__ */
 
-#ifndef _BMIINTRIN_H_INCLUDED
-#define _BMIINTRIN_H_INCLUDED
-
 extern __inline unsigned short __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __tzcnt_u16 (unsigned short __X)
 {
@@ -116,4 +118,9 @@ __tzcnt_u64 (unsigned long long __X)
 
 #endif /* __x86_64__ */
 
+#ifdef __DISABLE_BMI__
+#undef __DISABLE_BMI__
+#pragma GCC pop_options
+#endif /* __DISABLE_BMI__ */
+
 #endif /* _BMIINTRIN_H_INCLUDED */
Index: config/i386/xtestintrin.h
===================================================================
--- config/i386/xtestintrin.h (revision 200347)
+++ config/i386/xtestintrin.h (working copy)
@@ -25,13 +25,15 @@
 # error "Never use directly; include instead."
 #endif
 
+#ifndef _XTESTINTRIN_H_INCLUDED
+#define _XTESTINTRIN_H_INCLUDED
+
 #ifndef __RTM__
-# error "RTM instruction set not enabled"
+#pragma GCC push_options
+#pragma GCC target("rtm")
+#define __DISABLE_RTM__
 #endif /* __RTM__ */
 
-#ifndef _XTESTINTRIN_H_INCLUDED
-#define _XTESTINTRIN_H_INCLUDED
-
 /* Return non-zero if the instruction executes inside an RTM or HLE code
    region.  Return zero otherwise.  */
 extern __inline int
@@ -41,4 +43,9 @@ _xtest (void)
   return __builtin_ia32_xtest ();
 }
 
+#ifdef __DISABLE_RTM__
+#undef __DISABLE_RTM__
+#pragma GCC pop_options
+#endif /* __DISABLE_RTM__ */
+
 #endif /* _XTESTINTRIN_H_INCLUDED */
Index: config/i386/nmmintrin.h
===================================================================
--- config/i386/nmmintrin.h (revision 200347)
+++ config/i386/nmmintrin.h (working copy)
@@ -27,11 +27,7 @@
 #ifndef _NMMINTRIN_H_INCLUDED
 #define _NMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_2__
-# error "SSE4.2 instruction set not enabled"
-#else
 /* We just include SSE4.1 header file.  */
 #include
-#endif /* __SSE4_2__ */
 
 #endif /* _NMMINTRIN_H_INCLUDED */
Index: config/i386/lzcntintrin.h
===================================================================
--- config/i386/lzcntintrin.h (revision 200347)
+++ config/i386/lzcntintrin.h (working copy)
@@ -25,13 +25,16 @@
 # error "Never use directly; include instead."
 #endif
 
-#ifndef __LZCNT__
-# error "LZCNT instruction is not enabled"
-#endif /* __LZCNT__ */
 
 #ifndef _LZCNTINTRIN_H_INCLUDED
 #define _LZCNTINTRIN_H_INCLUDED
 
+#ifndef __LZCNT__
+#pragma GCC push_options
+#pragma GCC target("lzcnt")
+#define __DISABLE_LZCNT__
+#endif /* __LZCNT__ */
+
 extern __inline unsigned short __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __lzcnt16 (unsigned short __X)
 {
@@ -64,4 +67,9 @@ _lzcnt_u64 (unsigned long long __X)
 }
 #endif
 
+#ifdef __DISABLE_LZCNT__
+#undef __DISABLE_LZCNT__
+#pragma GCC pop_options
+#endif /* __DISABLE_LZCNT__ */
+
 #endif /* _LZCNTINTRIN_H_INCLUDED */
Index: config/i386/xsaveoptintrin.h
===================================================================
--- config/i386/xsaveoptintrin.h (revision 200347)
+++ config/i386/xsaveoptintrin.h (working copy)
@@ -28,6 +28,12 @@
 #ifndef _XSAVEOPTINTRIN_H_INCLUDED
 #define _XSAVEOPTINTRIN_H_INCLUDED
 
+#ifndef __XSAVEOPT__
+#pragma GCC push_options
+#pragma GCC target("xsaveopt")
+#define __DISABLE_XSAVEOPT__
+#endif /* __XSAVEOPT__ */
+
 extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xsaveopt (void *__P, long long __M)
@@ -44,4 +50,9 @@ _xsaveopt64 (void *__P, long long __M)
 }
 #endif
 
+#ifdef __DISABLE_XSAVEOPT__
+#undef __DISABLE_XSAVEOPT__
+#pragma GCC pop_options
+#endif /* __DISABLE_XSAVEOPT__ */
+
 #endif /* _XSAVEOPTINTRIN_H_INCLUDED */
Index: config/i386/i386-c.c
===================================================================
--- config/i386/i386-c.c (revision 200347)
+++ config/i386/i386-c.c (working copy)
@@ -376,20 +376,23 @@ ix86_pragma_target_parse (tree args, tree pop_targ
 
   if (! args)
     {
-      cur_tree = ((pop_target)
-                  ? pop_target
-                  : target_option_default_node);
+      cur_tree = (pop_target ? pop_target : target_option_default_node);
       cl_target_option_restore (&global_options,
                                 TREE_TARGET_OPTION (cur_tree));
     }
   else
     {
       cur_tree = ix86_valid_target_attribute_tree (args);
-      if (!cur_tree)
-        return false;
+      if (!cur_tree || cur_tree == error_mark_node)
+       {
+         cl_target_option_restore (&global_options,
+                                   TREE_TARGET_OPTION (prev_tree));
+         return false;
+       }
     }
 
   target_option_current_node = cur_tree;
+  ix86_reset_previous_fndecl ();
 
   /* Figure out the previous/current isa, arch, tune and the differences.  */
   prev_opt = TREE_TARGET_OPTION (prev_tree);
Index: config/i386/tbmintrin.h
===================================================================
--- config/i386/tbmintrin.h (revision 200347)
+++ config/i386/tbmintrin.h (working copy)
@@ -25,13 +25,15 @@
 # error "Never use directly; include instead."
 #endif
 
+#ifndef _TBMINTRIN_H_INCLUDED
+#define _TBMINTRIN_H_INCLUDED
+
 #ifndef __TBM__
-# error "TBM instruction set not enabled"
+#pragma GCC push_options
+#pragma GCC target("tbm")
+#define __DISABLE_TBM__
 #endif /* __TBM__ */
 
-#ifndef _TBMINTRIN_H_INCLUDED
-#define _TBMINTRIN_H_INCLUDED
-
 #ifdef __OPTIMIZE__
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __bextri_u32 (unsigned int __X, const unsigned int __I)
@@ -169,4 +171,10 @@ __tzmsk_u64 (unsigned long long __X)
 
 #endif /* __x86_64__ */
 
+
+#ifdef __DISABLE_TBM__
+#undef __DISABLE_TBM__
+#pragma GCC pop_options
+#endif /* __DISABLE_TBM__ */
+
 #endif /* _TBMINTRIN_H_INCLUDED */
Index: config/i386/fma4intrin.h
===================================================================
--- config/i386/fma4intrin.h (revision 200347)
+++ config/i386/fma4intrin.h (working copy)
@@ -28,13 +28,15 @@
 #ifndef _FMA4INTRIN_H_INCLUDED
 #define _FMA4INTRIN_H_INCLUDED
 
-#ifndef __FMA4__
-# error "FMA4 instruction set not enabled"
-#else
-
 /* We need definitions from the SSE4A, SSE3, SSE2 and SSE header files.  */
 #include
 
+#ifndef __FMA4__
+#pragma GCC push_options
+#pragma GCC target("fma4")
+#define __DISABLE_FMA4__
+#endif /* __FMA4__ */
+
 /* 128b Floating point multiply/add type instructions.  */
 extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_macc_ps (__m128 __A, __m128 __B, __m128 __C)
@@ -231,6 +233,9 @@ _mm256_msubadd_pd (__m256d __A, __m256d __B, __m25
   return (__m256d) __builtin_ia32_vfmaddsubpd256 ((__v4df)__A, (__v4df)__B, -(__v4df)__C);
 }
 
-#endif
+#ifdef __DISABLE_FMA4__
+#undef __DISABLE_FMA4__
+#pragma GCC pop_options
+#endif /* __DISABLE_FMA4__ */
 
 #endif
Index: config/i386/smmintrin.h
===================================================================
--- config/i386/smmintrin.h (revision 200347)
+++ config/i386/smmintrin.h (working copy)
@@ -27,14 +27,16 @@
 #ifndef _SMMINTRIN_H_INCLUDED
 #define _SMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_1__
-# error "SSE4.1 instruction set not enabled"
-#else
-
 /* We need definitions from the SSSE3, SSE3, SSE2 and SSE header
    files.  */
 #include
 
+#ifndef __SSE4_1__
+#pragma GCC push_options
+#pragma GCC target("sse4.1")
+#define __DISABLE_SSE4_1__
+#endif /* __SSE4_1__ */
+
 /* Rounding mode macros.  */
 #define _MM_FROUND_TO_NEAREST_INT 0x00
 #define _MM_FROUND_TO_NEG_INF 0x01
@@ -582,7 +584,11 @@ _mm_stream_load_si128 (__m128i *__X)
   return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);
 }
 
-#ifdef __SSE4_2__
+#ifndef __SSE4_2__
+#pragma GCC push_options
+#pragma GCC target("sse4.2")
+#define __DISABLE_SSE4_2__
+#endif /* __SSE4_2__ */
 
 /* These macros specify the source data format.  */
 #define _SIDD_UBYTE_OPS 0x00
@@ -792,10 +798,30 @@ _mm_cmpgt_epi64 (__m128i __X, __m128i __Y)
   return (__m128i) __builtin_ia32_pcmpgtq ((__v2di)__X, (__v2di)__Y);
 }
 
-#ifdef __POPCNT__
+#ifdef __DISABLE_SSE4_2__
+#undef __DISABLE_SSE4_2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_2__ */
+
+#ifdef __DISABLE_SSE4_1__
+#undef __DISABLE_SSE4_1__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_1__ */
+
 #include
-#endif
 
+#ifndef __SSE4_1__
+#pragma GCC push_options
+#pragma GCC target("sse4.1")
+#define __DISABLE_SSE4_1__
+#endif /* __SSE4_1__ */
+
+#ifndef __SSE4_2__
+#pragma GCC push_options
+#pragma GCC target("sse4.2")
+#define __DISABLE_SSE4_2__
+#endif /* __SSE4_1__ */
+
 /* Accumulate CRC32 (polynomial 0x11EDC6F41) value.  */
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_crc32_u8 (unsigned int __C, unsigned char __V)
@@ -823,8 +849,14 @@ _mm_crc32_u64 (unsigned long long __C, unsigned lo
 }
 #endif
 
-#endif /* __SSE4_2__ */
+#ifdef __DISABLE_SSE4_2__
+#undef __DISABLE_SSE4_2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_2__ */
 
-#endif /* __SSE4_1__ */
+#ifdef __DISABLE_SSE4_1__
+#undef __DISABLE_SSE4_1__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_1__ */
 
 #endif /* _SMMINTRIN_H_INCLUDED */
Index: config/i386/ia32intrin.h
===================================================================
--- config/i386/ia32intrin.h (revision 200347)
+++ config/i386/ia32intrin.h (working copy)
@@ -49,7 +49,12 @@ __bswapd (int __X)
   return __builtin_bswap32 (__X);
 }
 
-#ifdef __SSE4_2__
+#ifndef __SSE4_2__
+#pragma GCC push_options
+#pragma GCC target("sse4.2")
+#define __DISABLE_SSE4_2__
+#endif /* __SSE4_2__ */
+
 /* 32bit accumulate CRC32 (polynomial 0x11EDC6F41) value.  */
 extern __inline unsigned int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -71,8 +76,12 @@ __crc32d (unsigned int __C, unsigned int __V)
 {
   return __builtin_ia32_crc32si (__C, __V);
 }
-#endif /* SSE4.2 */
 
+#ifdef __DISABLE_SSE4_2__
+#undef __DISABLE_SSE4_2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_2__ */
+
 /* 32bit popcnt */
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -186,7 +195,12 @@ __bswapq (long long __X)
   return __builtin_bswap64 (__X);
 }
 
-#ifdef __SSE4_2__
+#ifndef __SSE4_2__
+#pragma GCC push_options
+#pragma GCC target("sse4.2")
+#define __DISABLE_SSE4_2__
+#endif /* __SSE4_2__ */
+
 /* 64bit accumulate CRC32 (polynomial 0x11EDC6F41) value.  */
 extern __inline unsigned long long
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -194,8 +208,12 @@ __crc32q (unsigned long long __C, unsigned long lo
 {
   return __builtin_ia32_crc32di (__C, __V);
 }
-#endif
 
+#ifdef __DISABLE_SSE4_2__
+#undef __DISABLE_SSE4_2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_2__ */
+
 /* 64bit popcnt */
 extern __inline long long
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
Index: config/i386/wmmintrin.h
===================================================================
--- config/i386/wmmintrin.h (revision 200347)
+++ config/i386/wmmintrin.h (working copy)
@@ -30,13 +30,14 @@
 /* We need definitions from the SSE2 header file.  */
 #include
 
-#if !defined (__AES__) && !defined (__PCLMUL__)
-# error "AES/PCLMUL instructions not enabled"
-#else
-
 /* AES */
 
-#ifdef __AES__
+#ifndef __AES__
+#pragma GCC push_options
+#pragma GCC target("aes")
+#define __DISABLE_AES__
+#endif /* __AES__ */
+
 /* Performs 1 round of AES decryption of the first m128i using
    the second m128i as a round key.
*/ extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) @@ -92,11 +93,20 @@ _mm_aeskeygenassist_si128 (__m128i __X, const int ((__m128i) __builtin_ia32_aeskeygenassist128 ((__v2di)(__m128i)(X), \ (int)(C))) #endif -#endif /* __AES__ */ +#ifdef __DISABLE_AES__ +#undef __DISABLE_AES__ +#pragma GCC pop_options +#endif /* __DISABLE_AES__ */ + /* PCLMUL */ -#ifdef __PCLMUL__ +#ifndef __PCLMUL__ +#pragma GCC push_options +#pragma GCC target("pclmul") +#define __DISABLE_PCLMUL__ +#endif /* __PCLMUL__ */ + /* Performs carry-less integer multiplication of 64-bit halves of 128-bit input operands. The third parameter inducates which 64-bit haves of the input parameters v1 and v2 should be used. It must be @@ -113,8 +123,10 @@ _mm_clmulepi64_si128 (__m128i __X, __m128i __Y, co ((__m128i) __builtin_ia32_pclmulqdq128 ((__v2di)(__m128i)(X), \ (__v2di)(__m128i)(Y), (int)(I))) #endif -#endif /* __PCLMUL__ */ -#endif /* __AES__/__PCLMUL__ */ +#ifdef __DISABLE_PCLMUL__ +#undef __DISABLE_PCLMUL__ +#pragma GCC pop_options +#endif /* __DISABLE_PCLMUL__ */ #endif /* _WMMINTRIN_H_INCLUDED */ Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 200347) +++ config/i386/i386.c (working copy) @@ -4649,6 +4649,13 @@ ix86_can_inline_p (tree caller, tree callee) /* Remember the last target of ix86_set_current_function. */ static GTY(()) tree ix86_previous_fndecl; +/* Invalidate ix86_previous_fndecl cache. */ +void +ix86_reset_previous_fndecl (void) +{ + ix86_previous_fndecl = NULL_TREE; +} + /* Establish appropriate back-end context for processing the function FNDECL. The argument might be NULL to indicate processing at top level, outside of any function scope. 
*/ Index: config/i386/avx2intrin.h =================================================================== --- config/i386/avx2intrin.h (revision 200347) +++ config/i386/avx2intrin.h (working copy) @@ -25,6 +25,15 @@ # error "Never use directly; include instead." #endif +#ifndef _AVX2INTRIN_H_INCLUDED +#define _AVX2INTRIN_H_INCLUDED + +#ifndef __AVX2__ +#pragma GCC push_options +#pragma GCC target("avx2") +#define __DISABLE_AVX2__ +#endif /* __AVX2__ */ + /* Sum absolute 8-bit integer difference of adjacent groups of 4 byte integers in the first 2 operands. Starting offsets within operands are determined by the 3rd mask operand. */ @@ -1871,3 +1880,10 @@ _mm256_mask_i64gather_epi32 (__m128i src, int cons (__v4si)(__m128i)MASK, \ (int)SCALE) #endif /* __OPTIMIZE__ */ + +#ifdef __DISABLE_AVX2__ +#undef __DISABLE_AVX2__ +#pragma GCC pop_options +#endif /* __DISABLE_AVX2__ */ + +#endif /* _AVX2INTRIN_H_INCLUDED */ Index: config/i386/fxsrintrin.h =================================================================== --- config/i386/fxsrintrin.h (revision 200347) +++ config/i386/fxsrintrin.h (working copy) @@ -28,6 +28,12 @@ #ifndef _FXSRINTRIN_H_INCLUDED #define _FXSRINTRIN_H_INCLUDED +#ifndef __FXSR__ +#pragma GCC push_options +#pragma GCC target("fxsr") +#define __DISABLE_FXSR__ +#endif /* __FXSR__ */ + extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _fxsave (void *__P) @@ -58,4 +64,10 @@ _fxrstor64 (void *__P) } #endif +#ifdef __DISABLE_FXSR__ +#undef __DISABLE_FXSR__ +#pragma GCC pop_options +#endif /* __DISABLE_FXSR__ */ + + #endif /* _FXSRINTRIN_H_INCLUDED */ Index: config/i386/x86intrin.h =================================================================== --- config/i386/x86intrin.h (revision 200347) +++ config/i386/x86intrin.h (working copy) @@ -26,96 +26,52 @@ #include -#ifdef __MMX__ #include -#endif -#ifdef __SSE__ #include -#endif -#ifdef __SSE2__ #include -#endif -#ifdef __SSE3__ #include -#endif -#ifdef __SSSE3__ #include 
-#endif -#ifdef __SSE4A__ #include -#endif -#if defined (__SSE4_2__) || defined (__SSE4_1__) #include -#endif -#if defined (__AES__) || defined (__PCLMUL__) #include -#endif /* For including AVX instructions */ #include -#ifdef __3dNOW__ #include -#endif -#ifdef __FMA4__ #include -#endif -#ifdef __XOP__ #include -#endif -#ifdef __LWP__ #include -#endif -#ifdef __BMI__ #include -#endif -#ifdef __BMI2__ #include -#endif -#ifdef __TBM__ #include -#endif -#ifdef __LZCNT__ #include -#endif -#ifdef __POPCNT__ #include -#endif -#ifdef __RDSEED__ #include -#endif -#ifdef __PRFCHW__ #include -#endif -#ifdef __FXSR__ #include -#endif -#ifdef __XSAVE__ #include -#endif -#ifdef __XSAVEOPT__ #include -#endif #include Index: config/i386/pmmintrin.h =================================================================== --- config/i386/pmmintrin.h (revision 200347) +++ config/i386/pmmintrin.h (working copy) @@ -27,13 +27,15 @@ #ifndef _PMMINTRIN_H_INCLUDED #define _PMMINTRIN_H_INCLUDED -#ifndef __SSE3__ -# error "SSE3 instruction set not enabled" -#else - /* We need definitions from the SSE2 and SSE header files*/ #include +#ifndef __SSE3__ +#pragma GCC push_options +#pragma GCC target("sse3") +#define __DISABLE_SSE3__ +#endif /* __SSE3__ */ + /* Additional bits in the MXCSR. 
*/ #define _MM_DENORMALS_ZERO_MASK 0x0040 #define _MM_DENORMALS_ZERO_ON 0x0040 @@ -122,6 +124,9 @@ _mm_mwait (unsigned int __E, unsigned int __H) __builtin_ia32_mwait (__E, __H); } -#endif /* __SSE3__ */ +#ifdef __DISABLE_SSE3__ +#undef __DISABLE_SSE3__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE3__ */ #endif /* _PMMINTRIN_H_INCLUDED */ Index: config/i386/tmmintrin.h =================================================================== --- config/i386/tmmintrin.h (revision 200347) +++ config/i386/tmmintrin.h (working copy) @@ -27,13 +27,15 @@ #ifndef _TMMINTRIN_H_INCLUDED #define _TMMINTRIN_H_INCLUDED -#ifndef __SSSE3__ -# error "SSSE3 instruction set not enabled" -#else - /* We need definitions from the SSE3, SSE2 and SSE header files*/ #include +#ifndef __SSSE3__ +#pragma GCC push_options +#pragma GCC target("ssse3") +#define __DISABLE_SSSE3__ +#endif /* __SSSE3__ */ + extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_hadd_epi16 (__m128i __X, __m128i __Y) { @@ -239,6 +241,9 @@ _mm_abs_pi32 (__m64 __X) return (__m64) __builtin_ia32_pabsd ((__v2si)__X); } -#endif /* __SSSE3__ */ +#ifdef __DISABLE_SSSE3__ +#undef __DISABLE_SSSE3__ +#pragma GCC pop_options +#endif /* __DISABLE_SSSE3__ */ #endif /* _TMMINTRIN_H_INCLUDED */ Index: config/i386/f16cintrin.h =================================================================== --- config/i386/f16cintrin.h (revision 200347) +++ config/i386/f16cintrin.h (working copy) @@ -25,13 +25,15 @@ # error "Never use directly; include or instead." 
#endif -#ifndef __F16C__ -# error "F16C instruction set not enabled" -#else - #ifndef _F16CINTRIN_H_INCLUDED #define _F16CINTRIN_H_INCLUDED +#ifndef __F16C__ +#pragma GCC push_options +#pragma GCC target("f16c") +#define __DISABLE_F16C__ +#endif /* __F16C__ */ + extern __inline float __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _cvtsh_ss (unsigned short __S) { @@ -88,5 +90,9 @@ _mm256_cvtps_ph (__m256 __A, const int __I) ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I))) #endif /* __OPTIMIZE */ +#ifdef __DISABLE_F16C__ +#undef __DISABLE_F16C__ +#pragma GCC pop_options +#endif /* __DISABLE_F16C__ */ + #endif /* _F16CINTRIN_H_INCLUDED */ -#endif /* __F16C__ */ Index: config/i386/mm3dnow.h =================================================================== --- config/i386/mm3dnow.h (revision 200347) +++ config/i386/mm3dnow.h (working copy) @@ -27,11 +27,15 @@ #ifndef _MM3DNOW_H_INCLUDED #define _MM3DNOW_H_INCLUDED -#ifdef __3dNOW__ - #include #include +#ifndef __3dNOW__ +#pragma GCC push_options +#pragma GCC target("3dnow") +#define __DISABLE_3dNOW__ +#endif /* __3dNOW__ */ + extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _m_femms (void) { @@ -205,6 +209,10 @@ _m_pswapd (__m64 __A) } #endif /* __3dNOW_A__ */ -#endif /* __3dNOW__ */ +#ifdef __DISABLE_3dNOW__ +#undef __DISABLE_3dNOW__ +#pragma GCC pop_options +#endif /* __DISABLE_3dNOW__ */ + #endif /* _MM3DNOW_H_INCLUDED */ Index: testsuite/gcc.target/i386/avx-1.c =================================================================== --- testsuite/gcc.target/i386/avx-1.c (revision 200347) +++ testsuite/gcc.target/i386/avx-1.c (working copy) @@ -159,6 +159,13 @@ #define __builtin_ia32_vec_ext_v4hi(A, N) __builtin_ia32_vec_ext_v4hi(A, 0) #define __builtin_ia32_shufps(A, B, N) __builtin_ia32_shufps(A, B, 0) +/* f16cintrin.h */ +#define __builtin_ia32_vcvtps2ph(A, I) __builtin_ia32_vcvtps2ph(A, 0) +#define __builtin_ia32_vcvtps2ph256(A, I) 
__builtin_ia32_vcvtps2ph256(A, 0) + +/* rtmintrin.h */ +#define __builtin_ia32_xabort(I) __builtin_ia32_xabort(0) + #include #include #include Index: testsuite/gcc.target/i386/intrinsics_4.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_4.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_4.c (revision 0) @@ -0,0 +1,14 @@ +/* Test case to check if AVX intrinsics and function specific target + optimizations work together. Check by including immintrin.h */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-avx" } */ + +#include + +__m256 a[10], b[10], c[10]; +void __attribute__((target ("avx"))) +foo (void) +{ + a[0] = _mm256_and_ps (b[0], c[0]); +} Index: testsuite/gcc.target/i386/intrinsics_1.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_1.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_1.c (revision 0) @@ -0,0 +1,13 @@ +/* Test case to check if intrinsics and function specific target + optimizations work together. Check by including x86intrin.h */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1 -mno-sse4.2" } */ + +#include + +__attribute__((target("sse4.2"))) +__m128i foo(__m128i *V) +{ + return _mm_stream_load_si128(V); +} Index: testsuite/gcc.target/i386/intrinsics_5.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_5.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_5.c (revision 0) @@ -0,0 +1,16 @@ +/* Test case to check if intrinsics and function specific target + optimizations work together. Check if an error is issued in + -O2 mode when foo calls an intrinsic without the right target + attribute. 
*/ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1 -mno-sse4.2" } */ + +#include + +__m128i foo(__m128i *V) +{ + return _mm_stream_load_si128(V); /* { dg-error "called from here" } */ +} + +/* { dg-prune-output ".*inlining failed.*" } */ Index: testsuite/gcc.target/i386/intrinsics_2.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_2.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_2.c (revision 0) @@ -0,0 +1,13 @@ +/* Test case to check if intrinsics and function specific target + optimizations work together. Check by including immintrin.h */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1" } */ + +#include + +__attribute__((target("sse4.2"))) +__m128i foo(__m128i *V) +{ + return _mm_stream_load_si128(V); +} Index: testsuite/gcc.target/i386/intrinsics_6.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_6.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_6.c (revision 0) @@ -0,0 +1,16 @@ +/* Test case to check if intrinsics and function specific target + optimizations work together. Check if an error is issued in + -O0 mode when foo calls an intrinsic without the right target + attribute. */ + +/* { dg-do compile } */ +/* { dg-options "-O0 -msse -mno-sse4.1 -mno-sse4.2" } */ + +#include + +__m128i foo(__m128i *V) +{ + return _mm_stream_load_si128(V); /* { dg-error "called from here" } */ +} + +/* { dg-prune-output ".*inlining failed.*" } */ Index: testsuite/gcc.target/i386/intrinsics_3.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_3.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_3.c (revision 0) @@ -0,0 +1,15 @@ +/* Test case to check if intrinsics and function specific target + optimizations work together. 
Check if the POPCNT specific intrinsics + in included with popcntintrin.h get enabled by directly including + popcntintrin.h */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1 -mno-sse4.2 -mno-popcnt" } */ + +#include + +__attribute__((target("popcnt"))) +long long foo(unsigned long long X) +{ + return _mm_popcnt_u64 (X); +}
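
For anyone who wants to try the behaviour these tests exercise outside the testsuite, here is a minimal sketch. It is not part of the patch. It assumes a GCC-compatible compiler on x86 whose intrinsic headers carry a change like the one above, so that an intrinsic can be called from a function marked with a target attribute even when the command line does not enable that ISA. The names popcount_hw, popcount_portable, popcount_any and HAVE_X86_TARGET_ATTR are made up for illustration. On other compilers or architectures the sketch falls back to plain C.

```c
/* Illustrative sketch only, not part of the patch.  All function and
   macro names defined here are hypothetical.  */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_TARGET_ATTR 1
#endif

/* Plain C fallback: clear the lowest set bit until none are left.  */
static int
popcount_portable (unsigned int x)
{
  int n = 0;
  while (x)
    {
      x &= x - 1;
      n++;
    }
  return n;
}

#ifdef HAVE_X86_TARGET_ATTR
/* Only this function is compiled with POPCNT enabled; the rest of the
   translation unit keeps whatever ISA the command line selected.  */
static int __attribute__((target("popcnt")))
popcount_hw (unsigned int x)
{
  return _mm_popcnt_u32 (x);
}
#endif

/* Dispatch at run time so the POPCNT instruction is executed only on
   CPUs that report support for it.  */
int
popcount_any (unsigned int x)
{
#ifdef HAVE_X86_TARGET_ATTR
  if (__builtin_cpu_supports ("popcnt"))
    return popcount_hw (x);
#endif
  return popcount_portable (x);
}
```

Calling popcount_any from ordinary code needs no -mpopcnt on the command line. Calling _mm_popcnt_u32 directly from a function without the target attribute should instead be rejected, as intrinsics_5.c and intrinsics_6.c expect for the SSE4.1 case.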