Message ID | CAAs8Hmxybjtoo0fUTPYWNqF8+SUTLQQAb4TmQdycB1WkccAgkg@mail.gmail.com |
---|---|
State | New |
Headers | show |
+HJ On Tue, Apr 16, 2013 at 1:54 PM, Sriraman Tallam <tmsriram@google.com> wrote: > Hi, > > I have attached an updated patch that addresses all the comments raised. > > On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek <jakub@redhat.com> wrote: >> On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: >>> I have attached a patch that fixes this. I have added an option >>> "-mgenerate-builtins" that will do two things. It will define a macro >>> "__ALL_ISA__" which will expose the *intrin.h functions. It will also >>> expose all the target specific builtins. -mgenerate-builtins will not >>> affect code generation. >> >> 1) this shouldn't be an option, either it can be made to work reliably, >> then it should be done always, or it can't, then it shouldn't be done > > Ok, it is on by default now. There is a way to turn it off, with > -mno-generate-builtins. > >> 2) have you verified that if you always generate all builtins, that the >> builtins not supported by the ISA selected from the command line are >> created with the right vector modes? > > This issue does not arise. When the target builtin is expanded, it is > checked if the ISA support is there, either via function specific > target opts or global target opts. If not, an error is raised. Test > case added for this, please see intrinsic_4.c in patch. > >> 3) the *intrin.h headers in the case where the guarding macro isn't defined >> should be surrounded by something like >> #ifndef __FMA4__ >> #pragma GCC push options >> #pragma GCC target("fma4") >> #endif >> ... >> #ifndef __FMA4__ >> #pragma GCC pop options >> #endif >> so that everything that is in the headers is compiled with the ISA >> in question > > I do not think this should be done because it will break the inlining > ability of the header function and cause issues if the caller does not > specify the required ISA. The fact that the header functions are > marked extern __inline, with gnu_inline guarantees that a body will > not be generated and they will be inlined. If the caller does not > have the required ISA, appropriate errors will be raised. Test cases > added, see intrinsics_1.c, intrinsics_2.c > >> 4) what happens if you use the various vector types typedefed in the >> *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE >> for VECTOR_TYPE is a function call, perhaps it will just be handled as >> generic BLKmode vectors, which is desirable I think > > I checked some tests here. With -mno-sse for instance, vector types > are not permitted in function arguments and return values and gcc > raises a warning/error in each case. With return values, gcc always > gives an error if a SSE register is required in a return value. I > even fixed this message to not do it for functions marked as extern > inline, with "gnu_inline" keyword as a body for them will not be > generated. > > >> 5) what happens if you use a target builtin in a function not supporting >> the corresponding ISA, do you get proper error explaining what you are >> doing wrong? > > Yes, please sse intrinsic_4.c test in patch. > >> 6) what happens if you use some intrinsics in a function not supporting >> the corresponding ISA? Dunno if the inliner chooses not to inline it >> and error out because it is always_inline, or what exactly will happen >> then > > Same deal here. The intrinsic function will, guaranteed, to be inlined > into the caller which will be a corresponding builtin call. That > builtin call will trigger an error if the ISA is not supported. > > Thanks > Sri > >> >> For all this you certainly need testcases. >> >> Jakub
Ping. On Wed, Apr 17, 2013 at 7:13 PM, Sriraman Tallam <tmsriram@google.com> wrote: > +HJ > > On Tue, Apr 16, 2013 at 1:54 PM, Sriraman Tallam <tmsriram@google.com> wrote: >> Hi, >> >> I have attached an updated patch that addresses all the comments raised. >> >> On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek <jakub@redhat.com> wrote: >>> On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: >>>> I have attached a patch that fixes this. I have added an option >>>> "-mgenerate-builtins" that will do two things. It will define a macro >>>> "__ALL_ISA__" which will expose the *intrin.h functions. It will also >>>> expose all the target specific builtins. -mgenerate-builtins will not >>>> affect code generation. >>> >>> 1) this shouldn't be an option, either it can be made to work reliably, >>> then it should be done always, or it can't, then it shouldn't be done >> >> Ok, it is on by default now. There is a way to turn it off, with >> -mno-generate-builtins. >> >>> 2) have you verified that if you always generate all builtins, that the >>> builtins not supported by the ISA selected from the command line are >>> created with the right vector modes? >> >> This issue does not arise. When the target builtin is expanded, it is >> checked if the ISA support is there, either via function specific >> target opts or global target opts. If not, an error is raised. Test >> case added for this, please see intrinsic_4.c in patch. >> >>> 3) the *intrin.h headers in the case where the guarding macro isn't defined >>> should be surrounded by something like >>> #ifndef __FMA4__ >>> #pragma GCC push options >>> #pragma GCC target("fma4") >>> #endif >>> ... >>> #ifndef __FMA4__ >>> #pragma GCC pop options >>> #endif >>> so that everything that is in the headers is compiled with the ISA >>> in question >> >> I do not think this should be done because it will break the inlining >> ability of the header function and cause issues if the caller does not >> specify the required ISA. The fact that the header functions are >> marked extern __inline, with gnu_inline guarantees that a body will >> not be generated and they will be inlined. If the caller does not >> have the required ISA, appropriate errors will be raised. Test cases >> added, see intrinsics_1.c, intrinsics_2.c >> >>> 4) what happens if you use the various vector types typedefed in the >>> *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE >>> for VECTOR_TYPE is a function call, perhaps it will just be handled as >>> generic BLKmode vectors, which is desirable I think >> >> I checked some tests here. With -mno-sse for instance, vector types >> are not permitted in function arguments and return values and gcc >> raises a warning/error in each case. With return values, gcc always >> gives an error if a SSE register is required in a return value. I >> even fixed this message to not do it for functions marked as extern >> inline, with "gnu_inline" keyword as a body for them will not be >> generated. >> >> >>> 5) what happens if you use a target builtin in a function not supporting >>> the corresponding ISA, do you get proper error explaining what you are >>> doing wrong? >> >> Yes, please sse intrinsic_4.c test in patch. >> >>> 6) what happens if you use some intrinsics in a function not supporting >>> the corresponding ISA? Dunno if the inliner chooses not to inline it >>> and error out because it is always_inline, or what exactly will happen >>> then >> >> Same deal here. The intrinsic function will, guaranteed, to be inlined >> into the caller which will be a corresponding builtin call. That >> builtin call will trigger an error if the ISA is not supported. >> >> Thanks >> Sri >> >>> >>> For all this you certainly need testcases. >>> >>> Jakub
On Tue, 16 Apr 2013, Sriraman Tallam wrote: > Ok, it is on by default now. There is a way to turn it off, with > -mno-generate-builtins. Any new option needs documenting in invoke.texi.
Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 197691) +++ config/i386/i386.c (working copy) @@ -6370,8 +6370,13 @@ construct_container (enum machine_mode mode, enum return NULL; /* We allowed the user to turn off SSE for kernel mode. Don't crash if - some less clueful developer tries to use floating-point anyway. */ - if (needed_sseregs && !TARGET_SSE) + some less clueful developer tries to use floating-point anyway. It is + alright if this is in a extern "gnu_inline" function, as it is the + caller that matters in this case. */ + if (needed_sseregs && !TARGET_SSE + && !(DECL_EXTERNAL (current_function_decl) + && lookup_attribute ("gnu_inline", + DECL_ATTRIBUTES (current_function_decl)) != NULL)) { if (in_return) { @@ -26813,7 +26818,8 @@ def_builtin (HOST_WIDE_INT mask, const char *name, ix86_builtins_isa[(int) code].isa = mask; mask &= ~OPTION_MASK_ISA_64BIT; - if (mask == 0 + if (generate_target_builtins + || mask == 0 || (mask & ix86_isa_flags) != 0 || (lang_hooks.builtin_function == lang_hooks.builtin_function_ext_scope)) Index: config/i386/i386.opt =================================================================== --- config/i386/i386.opt (revision 197691) +++ config/i386/i386.opt (working copy) @@ -626,3 +626,7 @@ Split 32-byte AVX unaligned store mrtm Target Report Mask(ISA_RTM) Var(ix86_isa_flags) Save Support RTM built-in functions and code generation + +mgenerate-builtins +Target Report Var(generate_target_builtins) Save Init(1) +Generate all target builtins that are otherwise only generated when the approrpriate ISA is turned on. Index: config/i386/i386-c.c =================================================================== --- config/i386/i386-c.c (revision 197691) +++ config/i386/i386-c.c (working copy) @@ -54,6 +54,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_fla int last_arch_char = ix86_arch_string[arch_len - 1]; int last_tune_char = ix86_tune_string[tune_len - 1]; + if (generate_target_builtins) + def_or_undef (parse_in, "__ALL_ISA__"); + /* Built-ins based on -march=. */ switch (arch) { Index: testsuite/gcc.target/i386/intrinsics_4.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_4.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_4.c (revision 0) @@ -0,0 +1,11 @@ +/* Test to check if a target specific builtin used in a function without the + appropriate ISA support generates an error. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-sse4.1" } */ + +#include <smmintrin.h> +__m128i foo(__m128i *V) +{ + return __builtin_ia32_movntdqa (V); /* { dg-error "'__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1" } */ +} Index: testsuite/gcc.target/i386/intrinsics_1.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_1.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_1.c (revision 0) @@ -0,0 +1,13 @@ +/* Test case to check if intrinsics and function specific target + optimizations work together. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1" } */ + +#include <smmintrin.h> + +__attribute__((target("sse4.1"))) +__m128i foo(__m128i *V) +{ + return _mm_stream_load_si128(V); +} Index: testsuite/gcc.target/i386/intrinsics_2.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_2.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_2.c (revision 0) @@ -0,0 +1,19 @@ +/* Ok, to have SSE return in non-SSE functions marked as + extern, "gnu_inline". */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1" } */ + +#include <smmintrin.h> + +extern __inline __attribute__ ((__gnu_inline__)) +__m128i bar (__m128i *V) +{ + return _mm_stream_load_si128(V); +} + +__attribute__((target("sse4.1"))) +__m128i foo(__m128i *V) +{ + return bar (V); +} Index: testsuite/gcc.target/i386/intrinsics_3.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_3.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_3.c (revision 0) @@ -0,0 +1,11 @@ +/* Using vector types without SSE enabled should generate an error. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-sse" } */ + +typedef long long _m128i __attribute__((vector_size(16),__may_alias__)); + +int foo (_m128i V) /* { dg-warning "SSE vector argument without SSE enabled changes the ABI" } */ +{ + return 0; +} Index: testsuite/gcc.target/i386/intrinsics_5.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_5.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_5.c (revision 0) @@ -0,0 +1,13 @@ +/* Test case to check if -mno-generate-builtins will break use of intrinsics + when the appropriate ISA is not specified. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-generate-builtins -mno-sse4.1" } */ + +#include <smmintrin.h> +__m128i foo(__m128i *V) /* { dg-error "unknown type name" } */ +{ + return V; +} + +/* { dg-excess-errors "\"SSE4.1 instruction set not enabled\"" } */ Index: config/i386/lzcntintrin.h =================================================================== --- config/i386/lzcntintrin.h (revision 197691) +++ config/i386/lzcntintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use <lzcntintrin.h> directly; include <x86intrin.h> instead." #endif -#ifndef __LZCNT__ +#if !defined (__LZCNT__) && !defined (__ALL_ISA__) # error "LZCNT instruction is not enabled" #endif /* __LZCNT__ */ Index: config/i386/lwpintrin.h =================================================================== --- config/i386/lwpintrin.h (revision 197691) +++ config/i386/lwpintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _LWPINTRIN_H_INCLUDED #define _LWPINTRIN_H_INCLUDED -#ifndef __LWP__ +#if !defined (__LWP__) && !defined (__ALL_ISA__) # error "LWP instruction set not enabled" #else Index: config/i386/xopintrin.h =================================================================== --- config/i386/xopintrin.h (revision 197691) +++ config/i386/xopintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _XOPMMINTRIN_H_INCLUDED #define _XOPMMINTRIN_H_INCLUDED -#ifndef __XOP__ +#if !defined (__XOP__) && !defined (__ALL_ISA__) # error "XOP instruction set not enabled" #else Index: config/i386/fmaintrin.h =================================================================== --- config/i386/fmaintrin.h (revision 197691) +++ config/i386/fmaintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _FMAINTRIN_H_INCLUDED #define _FMAINTRIN_H_INCLUDED -#ifndef __FMA__ +#if !defined (__FMA__) && !defined (__ALL_ISA__) # error "FMA instruction set not enabled" #else Index: config/i386/bmiintrin.h =================================================================== --- config/i386/bmiintrin.h (revision 197691) +++ config/i386/bmiintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use <bmiintrin.h> directly; include <x86intrin.h> instead." #endif -#ifndef __BMI__ +#if !defined (__BMI__) && !defined (__ALL_ISA__) # error "BMI instruction set not enabled" #endif /* __BMI__ */ Index: config/i386/fma4intrin.h =================================================================== --- config/i386/fma4intrin.h (revision 197691) +++ config/i386/fma4intrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _FMA4INTRIN_H_INCLUDED #define _FMA4INTRIN_H_INCLUDED -#ifndef __FMA4__ +#if !defined (__FMA4__) && !defined (__ALL_ISA__) # error "FMA4 instruction set not enabled" #else Index: config/i386/nmmintrin.h =================================================================== --- config/i386/nmmintrin.h (revision 197691) +++ config/i386/nmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _NMMINTRIN_H_INCLUDED #define _NMMINTRIN_H_INCLUDED -#ifndef __SSE4_2__ +#if !defined (__SSE4_2__) && !defined (__ALL_ISA__) # error "SSE4.2 instruction set not enabled" #else /* We just include SSE4.1 header file. */ Index: config/i386/tbmintrin.h =================================================================== --- config/i386/tbmintrin.h (revision 197691) +++ config/i386/tbmintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use <tbmintrin.h> directly; include <x86intrin.h> instead." #endif -#ifndef __TBM__ +#if !defined (__TBM__) && !defined (__ALL_ISA__) # error "TBM instruction set not enabled" #endif /* __TBM__ */ Index: config/i386/smmintrin.h =================================================================== --- config/i386/smmintrin.h (revision 197691) +++ config/i386/smmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _SMMINTRIN_H_INCLUDED #define _SMMINTRIN_H_INCLUDED -#ifndef __SSE4_1__ +#if !defined (__SSE4_1__) && !defined (__ALL_ISA__) # error "SSE4.1 instruction set not enabled" #else Index: config/i386/wmmintrin.h =================================================================== --- config/i386/wmmintrin.h (revision 197691) +++ config/i386/wmmintrin.h (working copy) @@ -30,7 +30,7 @@ /* We need definitions from the SSE2 header file. */ #include <emmintrin.h> -#if !defined (__AES__) && !defined (__PCLMUL__) +#if !defined (__AES__) && !defined (__PCLMUL__) && !defined (__ALL_ISA__) # error "AES/PCLMUL instructions not enabled" #else Index: config/i386/popcntintrin.h =================================================================== --- config/i386/popcntintrin.h (revision 197691) +++ config/i386/popcntintrin.h (working copy) @@ -21,7 +21,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ -#ifndef __POPCNT__ +#if !defined (__POPCNT__) && !defined (__ALL_ISA__) # error "POPCNT instruction set not enabled" #endif /* __POPCNT__ */ Index: config/i386/f16cintrin.h =================================================================== --- config/i386/f16cintrin.h (revision 197691) +++ config/i386/f16cintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use <f16intrin.h> directly; include <x86intrin.h> or <immintrin.h> instead." #endif -#ifndef __F16C__ +#if !defined (__F16C__) && !defined (__ALL_ISA__) # error "F16C instruction set not enabled" #else Index: config/i386/pmmintrin.h =================================================================== --- config/i386/pmmintrin.h (revision 197691) +++ config/i386/pmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _PMMINTRIN_H_INCLUDED #define _PMMINTRIN_H_INCLUDED -#ifndef __SSE3__ +#if !defined (__SSE3__) && !defined (__ALL_ISA__) # error "SSE3 instruction set not enabled" #else Index: config/i386/bmi2intrin.h =================================================================== --- config/i386/bmi2intrin.h (revision 197691) +++ config/i386/bmi2intrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use <bmi2intrin.h> directly; include <x86intrin.h> instead." #endif -#ifndef __BMI2__ +#if !defined (__BMI2__) && !defined (__ALL_ISA__) # error "BMI2 instruction set not enabled" #endif /* __BMI2__ */ Index: config/i386/tmmintrin.h =================================================================== --- config/i386/tmmintrin.h (revision 197691) +++ config/i386/tmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _TMMINTRIN_H_INCLUDED #define _TMMINTRIN_H_INCLUDED -#ifndef __SSSE3__ +#if !defined (__SSSE3__) && !defined (__ALL_ISA__) # error "SSSE3 instruction set not enabled" #else Index: config/i386/xmmintrin.h =================================================================== --- config/i386/xmmintrin.h (revision 197691) +++ config/i386/xmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _XMMINTRIN_H_INCLUDED #define _XMMINTRIN_H_INCLUDED -#ifndef __SSE__ +#if !defined (__SSE__) && !defined (__ALL_ISA__) # error "SSE instruction set not enabled" #else Index: config/i386/mmintrin.h =================================================================== --- config/i386/mmintrin.h (revision 197691) +++ config/i386/mmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _MMINTRIN_H_INCLUDED #define _MMINTRIN_H_INCLUDED -#ifndef __MMX__ +#if !defined (__MMX__) && !defined (__ALL_ISA__) # error "MMX instruction set not enabled" #else /* The Intel API is flexible enough that we must allow aliasing with other Index: config/i386/ammintrin.h =================================================================== --- config/i386/ammintrin.h (revision 197691) +++ config/i386/ammintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _AMMINTRIN_H_INCLUDED #define _AMMINTRIN_H_INCLUDED -#ifndef __SSE4A__ +#if !defined (__SSE4A__) && !defined (__ALL_ISA__) # error "SSE4A instruction set not enabled" #else Index: config/i386/emmintrin.h =================================================================== --- config/i386/emmintrin.h (revision 197691) +++ config/i386/emmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _EMMINTRIN_H_INCLUDED #define _EMMINTRIN_H_INCLUDED -#ifndef __SSE2__ +#if !defined (__SSE2__) && !defined (__ALL_ISA__) # error "SSE2 instruction set not enabled" #else