Message ID | or1v3txrto.fsf@livre.localdomain |
---|---|
State | New |
Headers | show |
On Mon, Jan 31, 2011 at 11:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote: > On Jan 21, 2011, Alexandre Oliva <aoliva@redhat.com> wrote: > >> we seem to be looking at default-initialized (not recomputed) >> information, a possibility that IIUC you and honza were concered about >> when reviewing the original patch, but that went way over the top of >> my head :-( > > I ended up introducing some infrastructure to check that we don't use > uninitialized “used” in var_ann, and that helped me catch a few > problems. > > First, it reported that var decls created corresponding to result decls > of inlined functions were added twice to the LOCAL_DECLs list. The > first patch below fixes that. > > Second, it showed that we completely failed to compute used flags after > inlining. At first, I split the computation of used flags out of > remove_unused_locals() and refrain from actually removing variables, but > then I decided calling removed_unused_locals() after inlining might get > us smaller functions and more accurate computations for additional > inlining. > > But that was not enough to ensure all accesses were properly > initialized. Temporaries created after the referenced_vars gimple pass > are not recorded in referenced_vars, so the resetting of the used flag > in remove_unused_vars doesn't apply to them. I considered going over > LOCAL_DECLS, but instead I decided to arrange for variables to be added > to referenced_vars as they are created. I hope that's not just ok, but > also desirable. > > With these fixes, in the second patch, I could get a full bootstrap and > library build on x86_64-linux-gnu, fixing both PR 47106 bug and the > regression it caused. > > > The third patch introduces an abstract interface to set and clear the > used flag, so that introducing the checks could be done with a simpler > patch, and the fourth patch introduces the rejection of uses before > initialization of this flag using a 3-state variable rather than a > boolean, and additional checks to verify that the flags are at the > expected states before we start computing what locals are unused. I > don't expect these two patches to be installed, and I'm only posting > them for the record, but if you think they'd useful (say, the abstract > interface is desirable, or the additional checks should be enabled given > some compile-time --enable-checking flag), let me know and I'll prepare > and submit the patches for inclusion. > > > I'm regstrapping them all together on x86_64-linux-gnu and > i686-linux-gnu, at -O1 and -O3 in addition to the regular -O2 bootstrap > (with -fcompare-debug at stage3, just because), and then I'll regstrap > only the first two. This will take some time, because of the hardware > failures I experienced, but I wanted to post the patches early so that, > if they require changes or aren't acceptable, I can save some > no-longer-copious machine time and devote it to the other bugs that are > on my plate. Thanks for your understanding. > > Ok to install the first 2 if they pass regstrap? Opinions on the other 2? > > > 1st: fix duplicate LOCAL_DECLs for inlined result decls Ok (I think the assert is superfluous, a lot of things would already be broken if that didn't hold ...). A further patch might simply pass a struct function to add_local_decl instead. > 2nd: fix bug and regression: remove unused vars after inlining and add > newly-created variables to referenced_vars Err, you add remove-unused-vars in function versioning. Which means if that is useful we copied a function with unused locals - that's the place this should (eventually) be fixed, not this one. The 2nd part of the patch is ok (the gimple-low.c change). > > > 3rd: introduce abstract interface to clear and test the used flag +/* Clear VAR's used flag, so that it may be discarded during rtl + expansion. */ + +static inline void +clear_is_used (tree var) I think this should just say "Clear VAR's used flag.". If it is just used during RTL expansion that pass should simply compute it and keep a local array/bitmap, not using var_ann (which really should go away). That said, the patch is ok. > > > 4th: check that used flag is never accessed before initialization, and > that it's uninitialized after inlining and cleared after clearing it for > all referenced_vars This is not ok (at this stage anyway). I think we should compute "used" where we need it (like remove-unused-vars does). If expand needs it we should compute it there (I think there are no other users). Richard.
On Jan 31, 2011, Richard Guenther <richard.guenther@gmail.com> wrote: >> Second, it showed that we completely failed to compute used flags after >> inlining. At first, I split the computation of used flags out of >> remove_unused_locals() and refrain from actually removing variables, but >> then I decided calling removed_unused_locals() after inlining might get >> us smaller functions and more accurate computations for additional >> inlining. > Err, you add remove-unused-vars in function versioning. Which means > if that is useful we copied a function with unused locals - that's the place > this should (eventually) be fixed, not this one. No, it just means we need part of the computation fo remove_unused_vars(). Maybe my assumption above, that it would make a difference in terms of function size and accuracy, is wrong, but we still need to compute what is used and what isn't. Simply setting newly-copied variables as used doesn't do it: if the original function has unused user variables preserved for debug info, we want them preserved as unused, rather than having them set to used as in the previous version of the patch did. I suppose we could simply copy the state of the used flag when duplicating a variable. I haven't tried that, but it's probabl worth a shot, to avoid having to compute used/unused variables after versioning. >> 4th: check that used flag is never accessed before initialization, and >> that it's uninitialized after inlining and cleared after clearing it for >> all referenced_vars > This is not ok (at this stage anyway) It wasn't meant to be installed anyway, it was just for the record. Like the 3rd patch. Now, if you find this useful for the next stage, you think it should be enabled given some (new?) checking option? > I think we should compute "used" where we need it (like > remove-unused-vars does). If expand needs it we should compute it > there (I think there are no other users). One of the problems is that we only compute used for referenced vars, but I found we sometimes used it for LOCAL_DECLs or block-scope decls that were not in referenced vars, so their flags weren't cleared or recomputed.
On Tue, Feb 1, 2011 at 11:25 PM, Alexandre Oliva <aoliva@redhat.com> wrote: > On Jan 31, 2011, Richard Guenther <richard.guenther@gmail.com> wrote: > >>> Second, it showed that we completely failed to compute used flags after >>> inlining. At first, I split the computation of used flags out of >>> remove_unused_locals() and refrain from actually removing variables, but >>> then I decided calling removed_unused_locals() after inlining might get >>> us smaller functions and more accurate computations for additional >>> inlining. > >> Err, you add remove-unused-vars in function versioning. Which means >> if that is useful we copied a function with unused locals - that's the place >> this should (eventually) be fixed, not this one. > > No, it just means we need part of the computation fo > remove_unused_vars(). Maybe my assumption above, that it would make a > difference in terms of function size and accuracy, is wrong, but we > still need to compute what is used and what isn't. > > Simply setting newly-copied variables as used doesn't do it: if the > original function has unused user variables preserved for debug info, we > want them preserved as unused, rather than having them set to used as in > the previous version of the patch did. > > I suppose we could simply copy the state of the used flag when > duplicating a variable. I haven't tried that, but it's probabl worth a > shot, to avoid having to compute used/unused variables after versioning. > >>> 4th: check that used flag is never accessed before initialization, and >>> that it's uninitialized after inlining and cleared after clearing it for >>> all referenced_vars > >> This is not ok (at this stage anyway) > > It wasn't meant to be installed anyway, it was just for the record. > Like the 3rd patch. Now, if you find this useful for the next stage, > you think it should be enabled given some (new?) checking option? > >> I think we should compute "used" where we need it (like >> remove-unused-vars does). If expand needs it we should compute it >> there (I think there are no other users). > > One of the problems is that we only compute used for referenced vars, > but I found we sometimes used it for LOCAL_DECLs or block-scope decls > that were not in referenced vars, so their flags weren't cleared or > recomputed. Did you spot any other users than remove-unused-locals and expand? What I said above is that expand should compute a used flag for the variables it cares about (not necessarily restricted to referenced-vars), and _not_ use var_ann ()->used for this (but for example a bitmap of UIDs). var_ann () should go away. eventually. Richard. > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! -- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist Red Hat Brazil Compiler Engineer >
On Feb 1, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
> Did you spot any other users than remove-unused-locals and expand?
Yeah, the just-introduced use that uses the used flag to estimate stack
sizes for inlining.
Do you suggest we should compute the global usedness of variables of
each function every time we're about to decide on inlining? This sounds
very expensive to me. Perhaps we could compute usedness for all
variables before going through global inlining/versioning decisions, but
then we might have to update it locally, for the inlined-into function
after each inlining, and for the new function version after it's
created, no?
On Wed, Feb 2, 2011 at 1:42 AM, Alexandre Oliva <aoliva@redhat.com> wrote: > On Feb 1, 2011, Richard Guenther <richard.guenther@gmail.com> wrote: > >> Did you spot any other users than remove-unused-locals and expand? > > Yeah, the just-introduced use that uses the used flag to estimate stack > sizes for inlining. > > Do you suggest we should compute the global usedness of variables of > each function every time we're about to decide on inlining? This sounds > very expensive to me. Perhaps we could compute usedness for all > variables before going through global inlining/versioning decisions, but > then we might have to update it locally, for the inlined-into function > after each inlining, and for the new function version after it's > created, no? No. It's an estimate only and thus can even use a simpler implementation compared to that in cfgexpand. Also for inline parameter computation we already look at all statements, so the cost of determining used variables looks cheap if it is done at the same time. Richard.
On Feb 2, 2011, Richard Guenther <richard.guenther@gmail.com> wrote: > No. It's an estimate only and thus can even use a simpler implementation > compared to that in cfgexpand. Ok, but it can't be so simple that it gives different results depending on whether or not unused variables were retained to debug information, and this is what's hitting us ATM. Perhaps relying on the used flag and going over all scope blocks wasn't such a good idea, after all; I suppose we could get something more reliable using referenced_vars only. Although I'm finding it difficult to figure out whether presence in referenced_vars should ever be different from having the used flag marked, it appears that presence in referenced_vars is maintained more accurately, even during inlining. My earlier patch, that calls remove_unused_locals after versioning, and an earlier version, that split out the part of remove_unused_locals that only computed used and called just that, both work, even when checking for uninitialized uses of used, but my attempts to copy the used flag from the inlined function are proving to be too much of a headache. I think I learned to not try to fit a round block in a square hole when I was very, very young ;-) I'm going to trying to fix the problem of stack size estimation based on referenced_vars only.
Index: gcc/tree-flow-inline.h =================================================================== --- gcc/tree-flow-inline.h.orig 2011-01-29 23:00:20.738072144 -0200 +++ gcc/tree-flow-inline.h 2011-01-29 23:00:21.006072156 -0200 @@ -569,7 +569,7 @@ static inline void set_is_used (tree var) { var_ann_t ann = get_var_ann (var); - ann->used = true; + ann->used3 = TRI_TRUE; } /* Clear VAR's used flag, so that it may be discarded during rtl @@ -579,7 +579,7 @@ static inline void clear_is_used (tree var) { var_ann_t ann = var_ann (var); - ann->used = false; + ann->used3 = TRI_FALSE; } /* Return true if VAR is marked as used. */ @@ -588,7 +588,8 @@ static inline bool is_used_p (tree var) { var_ann_t ann = var_ann (var); - return ann->used; + gcc_checking_assert (ann->used3 != TRI_UNINIT); + return ann->used3 == TRI_TRUE; } /* Return true if T (assumed to be a DECL) is a global variable. Index: gcc/tree-flow.h =================================================================== --- gcc/tree-flow.h.orig 2011-01-29 22:17:07.349555724 -0200 +++ gcc/tree-flow.h 2011-01-29 23:00:21.023072157 -0200 @@ -159,13 +159,15 @@ enum need_phi_state { }; +enum tristate { TRI_FALSE = -1, TRI_UNINIT = 0, TRI_TRUE = 1 }; + struct GTY(()) var_ann_d { /* Used when building base variable structures in a var_map. */ unsigned base_var_processed : 1; /* Nonzero if this variable was used after SSA optimizations were applied. We set this when translating out of SSA form. */ - unsigned used : 1; + ENUM_BITFIELD (tristate) used3 : 2; /* This field indicates whether or not the variable may need PHI nodes. See the enum's definition for more detailed information about the Index: gcc/tree-ssa-live.c =================================================================== --- gcc/tree-ssa-live.c.orig 2011-01-29 23:00:20.780072146 -0200 +++ gcc/tree-ssa-live.c 2011-01-29 23:08:00.188089207 -0200 @@ -636,8 +636,8 @@ dump_scope_block (FILE *file, int indent var_ann_t ann; if ((ann = var_ann (var)) - && ann->used) - used = true; + && ann->used3) + used = ann->used3 == TRI_TRUE; fprintf (file, "%*s",indent, ""); print_generic_decl (file, var, flags); @@ -709,6 +709,14 @@ remove_unused_locals (void) /* Assume all locals are unused. */ FOR_EACH_REFERENCED_VAR (t, rvi) clear_is_used (t); + FOR_EACH_LOCAL_DECL (cfun, num, t) + { + var_ann_t ann = var_ann (t); + if (!ann || is_global_var (t)) + continue; + + gcc_checking_assert (ann->used3 == TRI_FALSE); + } /* Walk the CFG marking all referenced symbols. */ FOR_EACH_BB (bb) Index: gcc/tree-inline.c =================================================================== --- gcc/tree-inline.c.orig 2011-01-29 23:00:20.493072133 -0200 +++ gcc/tree-inline.c 2011-01-29 23:00:21.091072160 -0200 @@ -5251,6 +5251,17 @@ tree_function_versioning (tree old_decl, gcc_assert (!id.debug_stmts); VEC_free (gimple, heap, init_stmts); +#ifdef ENABLE_CHECKING + FOR_EACH_LOCAL_DECL (cfun, i, p) + { + var_ann_t ann = var_ann (p); + if (!ann || is_global_var (p)) + continue; + + gcc_checking_assert (ann->used3 == TRI_UNINIT); + } +#endif + remove_unused_locals (); pop_cfun ();