Message ID | CAAs8HmywEfzR3CxOUQSQ3z1LC2xvucR--acW0e3Bavy0jO9J1w@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Sat, Jun 29, 2013 at 12:55 AM, Sriraman Tallam <tmsriram@google.com> wrote: > Inlining sse* functions into avx is broken? Here is an example: > > __attribute__((always_inline,target("sse3"))) > inline int callee () > { > return 0; > } > > __attribute__((target("avx"))) > inline int caller () > { > return callee (); > } > > main () > { > return caller (); > } > > $ g++ -O2 foo.cc > error: inlining failed in call to always_inline 'int callee()': target > specific option mismatch > > This patch fixes the problem: > > Index: gcc/config/i386/i386.c > =================================================================== > --- gcc/config/i386/i386.c (revision 200497) > +++ gcc/config/i386/i386.c (working copy) > @@ -4520,8 +4520,9 @@ ix86_can_inline_p (tree caller, tree callee) > != callee_opts->x_ix86_isa_flags) > ret = false; > > - /* See if we have the same non-isa options. */ > - else if (caller_opts->x_target_flags != callee_opts->x_target_flags) > + /* Callee's non-isa options should be a subset of the caller's. */ > + else if ((caller_opts->x_target_flags & callee_opts->x_target_flags) > + != callee_opts->x_target_flags) > ret = false; > > > Setting avx as ISA has side-effects on some target flags and directly > comparing them seems incorrect. This is also breaking the usage of > mmintrinsic headers. An intrinsic in avxintrin.h is calling an > intrinsic in another header. The call is not inlinable for the same > reason. > > Is this ok? This looks correct to me. Do we need to backport this change? Uros.
> On Sat, Jun 29, 2013 at 12:55 AM, Sriraman Tallam <tmsriram@google.com> wrote: > > > Inlining sse* functions into avx is broken? Here is an example: > > > > __attribute__((always_inline,target("sse3"))) > > inline int callee () > > { > > return 0; > > } > > > > __attribute__((target("avx"))) > > inline int caller () > > { > > return callee (); > > } > > > > main () > > { > > return caller (); > > } > > > > $ g++ -O2 foo.cc > > error: inlining failed in call to always_inline 'int callee()': target > > specific option mismatch > > > > This patch fixes the problem: > > > > Index: gcc/config/i386/i386.c > > =================================================================== > > --- gcc/config/i386/i386.c (revision 200497) > > +++ gcc/config/i386/i386.c (working copy) > > @@ -4520,8 +4520,9 @@ ix86_can_inline_p (tree caller, tree callee) > > != callee_opts->x_ix86_isa_flags) > > ret = false; > > > > - /* See if we have the same non-isa options. */ > > - else if (caller_opts->x_target_flags != callee_opts->x_target_flags) > > + /* Callee's non-isa options should be a subset of the caller's. */ > > + else if ((caller_opts->x_target_flags & callee_opts->x_target_flags) > > + != callee_opts->x_target_flags) > > ret = false; > > > > > > Setting avx as ISA has side-effects on some target flags and directly > > comparing them seems incorrect. This is also breaking the usage of What target flags are enabled by AVX? Assumming that all target flags are positive seems incorrect to me (like -mno-red-zone function can not be inlined into -mred-zone). Does those conditionally enabled AVX codegen flags have any effect when AVX is disabled? Perhaps we can set them unconditionally? Honza > > mmintrinsic headers. An intrinsic in avxintrin.h is calling an > > intrinsic in another header. The call is not inlinable for the same > > reason. > > > > Is this ok? > > This looks correct to me. > > Do we need to backport this change? > > Uros.
> > What target flags are enabled by AVX? Assumming that all target flags are > positive seems incorrect to me (like -mno-red-zone function can not be inlined > into -mred-zone). Does those conditionally enabled AVX codegen flags have any Actually -mred-zone seems right, but stuff like -msseregparm will probably break? Honza > effect when AVX is disabled? Perhaps we can set them unconditionally? > > Honza > > > mmintrinsic headers. An intrinsic in avxintrin.h is calling an > > > intrinsic in another header. The call is not inlinable for the same > > > reason. > > > > > > Is this ok? > > > > This looks correct to me. > > > > Do we need to backport this change? > > > > Uros.
On Sun, Jun 30, 2013 at 11:47 AM, Jan Hubicka <hubicka@ucw.cz> wrote: >> >> What target flags are enabled by AVX? Assumming that all target flags are >> positive seems incorrect to me (like -mno-red-zone function can not be inlined >> into -mred-zone). Does those conditionally enabled AVX codegen flags have any > > Actually -mred-zone seems right, but stuff like -msseregparm will probably break? > > Honza > >> effect when AVX is disabled? Perhaps we can set them unconditionally? The issue is with (config/i386.c, ix86_option_override_internal): if (TARGET_AVX) { /* When not optimize for size, enable vzeroupper optimization for TARGET_AVX with -fexpensive-optimizations and split 32-byte AVX unaligned load/store. */ if (!optimize_size) { if (flag_expensive_optimizations && !(target_flags_explicit & MASK_VZEROUPPER)) target_flags |= MASK_VZEROUPPER; if ((x86_avx256_split_unaligned_load & ix86_tune_mask) && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD)) target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD; if ((x86_avx256_split_unaligned_store & ix86_tune_mask) && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE)) target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE; /* Enable 128-bit AVX instruction generation for the auto-vectorizer. */ if (TARGET_AVX128_OPTIMAL && !(target_flags_explicit & MASK_PREFER_AVX128)) target_flags |= MASK_PREFER_AVX128; } These are all tuning flags that are applicable to AVX only. They depend on AVX, so can be probably enabled unconditionally. Uros.
Index: gcc/config/i386/i386.c =================================================================== --- gcc/config/i386/i386.c (revision 200497) +++ gcc/config/i386/i386.c (working copy) @@ -4520,8 +4520,9 @@ ix86_can_inline_p (tree caller, tree callee) != callee_opts->x_ix86_isa_flags) ret = false; - /* See if we have the same non-isa options. */ - else if (caller_opts->x_target_flags != callee_opts->x_target_flags) + /* Callee's non-isa options should be a subset of the caller's. */ + else if ((caller_opts->x_target_flags & callee_opts->x_target_flags) + != callee_opts->x_target_flags) ret = false;