Message ID | 56A239F6.30102@foss.arm.com |
---|---|
State | New |
Headers | show |
Hi Kyrill, On 01/22/2016 03:17 PM, Kyrill Tkachov wrote: > Hi Christian, > > On 22/01/16 14:07, Christian Bruel wrote: >> Hi Kyrill, >> >> On 01/21/2016 01:22 PM, Kyrill Tkachov wrote: >>> Hi Christian, >>> >>> On 21/01/16 10:36, Christian Bruel wrote: >>>> The current arm_set_current_function was both awkward and buggy. For instance using partially set TARGET_OPTION set from pragma_parse, while restore_target_globalsnor arm_option_params_internal was not reset. Another issue is that in some >>>> paths, target_reinit was not called due to old cached target_option_current_node value. for instance with >>>> >>>> foo{} >>>> #pragma GCC target ... >>>> >>>> foo was called with global_options set from old GCC target (which was wrong) and correct rtl values. >>>> >>>> This is a reimplementation of the function. Hoping to be easier to read (and maintain). Solves the current issues seen so far. >>>> >>>> regtested for arm-linux-gnueabi -mfpu=vfp, -mfpu=neon,-mfpu=neon-fp-armv8 >>>> >>> Thanks for the patch, I'll try it out. >>> In the meantime there's a couple of style and typo nits... >>> >>> + /* Make sure that target_reinit is called for next function, since >>> + TREE_TARGET_OPTION might change with the #pragma even if there are >>> + no target attribute attached to the function. */ >>> >>> s/attribute/attributes >>> >>> - arm_previous_fndecl = fndecl; >>> + /* if no attribute, use the mode set by the current pragma target. */ >>> + if (! new_tree) >>> + new_tree = target_option_current_node; >>> + >>> >>> s/if/If/ >>> >>> + /* now target_reinit can save the state for later. */ >>> + TREE_TARGET_GLOBALS (new_tree) >>> + = save_target_globals_default_opts (); >>> >>> s/now/Now/ >>> >> While playing on my side. I realized that we could simplify the patch further by removing the need to set and use target_option_current_node, since this is redundant with what handle_pragma_push/pop_options does. >> Also since that the functions inside a pragma GCC target region will have DECL_FUNCTION_SPECIFIC_TARGET set already, we don't seem to need a special case for those. >> >> With this V2, arm_set_current_function is becoming more minimalist and still fixes the current issues. Could you test this version instead ? >> > Thanks, I'll check this out instead. > I've played a bit with your previous version and the effect on the testcases looked ok, but I have a couple of > comments on the testcase in the meantime > > Index: gcc/testsuite/gcc.target/arm/pr69245.c > =================================================================== > --- gcc/testsuite/gcc.target/arm/pr69245.c (revision 0) > +++ gcc/testsuite/gcc.target/arm/pr69245.c (working copy) > @@ -0,0 +1,24 @@ > +/* PR target/69245 */ > +/* Test that pop_options restores the vfp fpu mode. */ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_fp_ok } */ > +/* { dg-add-options arm_fp } */ > + > +#pragma GCC target "fpu=vfp" > + > +#pragma GCC push_options > +#pragma GCC target "fpu=neon" > +int a, c, d; > +float b; > +static int fn1 () > +{ > + return 0; > +} > +#pragma GCC pop_options > + > +void fn2 () > +{ > + d = b * c + a; > +} > + > +/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */ > > > PR 69245 is an ICE whereas your testcase doesn't ICE without the patch, it just fails the > scan-assembler check. I'd like to have the testcase trigger the ICE without your patch. > For that we need -O2 in dg-options. > Also, the "fpu=vfp" pragma you put in the beginning doesn't allow the ICE to trigger, presumably > because it triggers a different path through the pragma option popping code. > So removing that pragma and instead changing the dg-add-options from arm_fp to arm_vfp3 (which is > floating-point without the vfma instruction causes the ICE) does the trick for me. > Also the "fpu=neon" pragma should also be changed to be "fpu=neon-vfpv4" because that setting allows > the vfma instruction which is being wrongly considered in fn2(). > I suppose you'll then want to change the scan-assembler directive to look for \.fpu vfp3. ah yes ! OK for -O2, I thought I had it, must have been deleted somewhere :-( I added the #pragma GCC target "fpu=vfp" to have some kind of deterministic checks to guard against the options permutations that Christophe stresses during his validations. so for instance the ".fpu scan-assembler would change depending on the default options... so the following test should ICE with the all configurations (!-mfloat-abi=soft) in -O2 #pragma GCC target "fpu=vfp" #pragma GCC push_options #pragma GCC target "fpu=neon-vfpv4" int a, c, d; float b; static int fn1 () { return 0; } #pragma GCC pop_options void fn2 () { d = b * c + a; } I'm also continuing to playing on my side with different scenarios for mixtures of push/pop/reset attributes. thanks Christian > > Thanks, > Kyrill >
On 22/01/16 14:51, Christian Bruel wrote: > Hi Kyrill, > > On 01/22/2016 03:17 PM, Kyrill Tkachov wrote: >> Hi Christian, >> >> On 22/01/16 14:07, Christian Bruel wrote: >>> Hi Kyrill, >>> >>> On 01/21/2016 01:22 PM, Kyrill Tkachov wrote: >>>> Hi Christian, >>>> >>>> On 21/01/16 10:36, Christian Bruel wrote: >>>>> The current arm_set_current_function was both awkward and buggy. For instance using partially set TARGET_OPTION set from pragma_parse, while restore_target_globalsnor arm_option_params_internal was not reset. Another issue is that in >>>>> some >>>>> paths, target_reinit was not called due to old cached target_option_current_node value. for instance with >>>>> >>>>> foo{} >>>>> #pragma GCC target ... >>>>> >>>>> foo was called with global_options set from old GCC target (which was wrong) and correct rtl values. >>>>> >>>>> This is a reimplementation of the function. Hoping to be easier to read (and maintain). Solves the current issues seen so far. >>>>> >>>>> regtested for arm-linux-gnueabi -mfpu=vfp, -mfpu=neon,-mfpu=neon-fp-armv8 >>>>> >>>> Thanks for the patch, I'll try it out. >>>> In the meantime there's a couple of style and typo nits... >>>> >>>> + /* Make sure that target_reinit is called for next function, since >>>> + TREE_TARGET_OPTION might change with the #pragma even if there are >>>> + no target attribute attached to the function. */ >>>> >>>> s/attribute/attributes >>>> >>>> - arm_previous_fndecl = fndecl; >>>> + /* if no attribute, use the mode set by the current pragma target. */ >>>> + if (! new_tree) >>>> + new_tree = target_option_current_node; >>>> + >>>> >>>> s/if/If/ >>>> >>>> + /* now target_reinit can save the state for later. */ >>>> + TREE_TARGET_GLOBALS (new_tree) >>>> + = save_target_globals_default_opts (); >>>> >>>> s/now/Now/ >>>> >>> While playing on my side. I realized that we could simplify the patch further by removing the need to set and use target_option_current_node, since this is redundant with what handle_pragma_push/pop_options does. >>> Also since that the functions inside a pragma GCC target region will have DECL_FUNCTION_SPECIFIC_TARGET set already, we don't seem to need a special case for those. >>> >>> With this V2, arm_set_current_function is becoming more minimalist and still fixes the current issues. Could you test this version instead ? >>> >> Thanks, I'll check this out instead. >> I've played a bit with your previous version and the effect on the testcases looked ok, but I have a couple of >> comments on the testcase in the meantime >> >> Index: gcc/testsuite/gcc.target/arm/pr69245.c >> =================================================================== >> --- gcc/testsuite/gcc.target/arm/pr69245.c (revision 0) >> +++ gcc/testsuite/gcc.target/arm/pr69245.c (working copy) >> @@ -0,0 +1,24 @@ >> +/* PR target/69245 */ >> +/* Test that pop_options restores the vfp fpu mode. */ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target arm_fp_ok } */ >> +/* { dg-add-options arm_fp } */ >> + >> +#pragma GCC target "fpu=vfp" >> + >> +#pragma GCC push_options >> +#pragma GCC target "fpu=neon" >> +int a, c, d; >> +float b; >> +static int fn1 () >> +{ >> + return 0; >> +} >> +#pragma GCC pop_options >> + >> +void fn2 () >> +{ >> + d = b * c + a; >> +} >> + >> +/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */ >> >> >> PR 69245 is an ICE whereas your testcase doesn't ICE without the patch, it just fails the >> scan-assembler check. I'd like to have the testcase trigger the ICE without your patch. >> For that we need -O2 in dg-options. >> Also, the "fpu=vfp" pragma you put in the beginning doesn't allow the ICE to trigger, presumably >> because it triggers a different path through the pragma option popping code. >> So removing that pragma and instead changing the dg-add-options from arm_fp to arm_vfp3 (which is >> floating-point without the vfma instruction causes the ICE) does the trick for me. >> Also the "fpu=neon" pragma should also be changed to be "fpu=neon-vfpv4" because that setting allows >> the vfma instruction which is being wrongly considered in fn2(). >> I suppose you'll then want to change the scan-assembler directive to look for \.fpu vfp3. > > ah yes ! OK for -O2, I thought I had it, must have been deleted somewhere :-( > > I added the #pragma GCC target "fpu=vfp" to have some kind of deterministic checks to guard against the options permutations that Christophe stresses during his validations. so for instance the ".fpu scan-assembler would change depending > on the default options... > > so the following test should ICE with the all configurations (!-mfloat-abi=soft) in -O2 > > #pragma GCC target "fpu=vfp" > > #pragma GCC push_options > #pragma GCC target "fpu=neon-vfpv4" > int a, c, d; > float b; > static int fn1 () > { > return 0; > } > > #pragma GCC pop_options > void fn2 () > { > d = b * c + a; > } > > Ah ok, I needed to update my tree to include your other midend fixes in this area. I played around with the patch and gave it a bootstrap as well. I wanted to make a sanity check on compile-time performance for files using arm_neon.h and I didn't spot any measurable regressions. So this is ok for trunk with the testcase changed as discussed above and using -O2 optimisation level and with a couple comment fixes below. - arm_previous_fndecl = fndecl; + /* if no attribute, but there was one. Use the default mode. */ + if (! new_tree && old_tree) + new_tree = target_option_default_node; + Start with capital 'I'. Maybe phrase it as: "If current function has no attributes but previous one did, use the default node." ? + /* Call target_reinit and save the state for TARGET_GLOBALS. */ + TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts (); Two spaces between full stop and comment closure. Thanks, Kyrill
Index: gcc/testsuite/gcc.target/arm/pr69245.c =================================================================== --- gcc/testsuite/gcc.target/arm/pr69245.c (revision 0) +++ gcc/testsuite/gcc.target/arm/pr69245.c (working copy) @@ -0,0 +1,24 @@ +/* PR target/69245 */ +/* Test that pop_options restores the vfp fpu mode. */ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_fp_ok } */ +/* { dg-add-options arm_fp } */ + +#pragma GCC target "fpu=vfp" + +#pragma GCC push_options +#pragma GCC target "fpu=neon" +int a, c, d; +float b; +static int fn1 () +{ + return 0; +} +#pragma GCC pop_options + +void fn2 () +{ + d = b * c + a; +} + +/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */