Message ID | or1su83ah3.fsf@lxoliva.fsfla.org |
---|---|
State | New |
Headers | show |
On Wed, Mar 8, 2017 at 12:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote: > Don't let pointer randomization change the order in which we process > store chains. This may cause SSA_NAMEs to be released in different > order, and if they're reused later, they may cause differences in SSA > partitioning, leading to differences in expand, and ultimately to > different code. > > bootstrap-debug-lean (-fcompare-debug) on i686-linux-gnu has failed in > haifa-sched.c since r245196 exposed the latent ordering problem in > store merging. In this case, the IR differences (different SSA names > selected for copies in out-of-SSA, resulting in some off-by-one > differences in pseudos) were not significant enough to be visible in > the compiler output. > > Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? Any reason you chose to maintain a new hash-map instead of at interesting places gather to-be-processed chains into a vector and sort that after IDs? Traversing the hash-map still doesn't get you a reliable order but only one depending on the chain creation and thus stmt walk order - plus the hash function and size - which may be good enough to fix the issue at hand of course. But it will for example still make testcase reduction fragile if shrinking the hash-map by removing unrelated code ends up processing things in different order. So - if we fix this can we fix it by properly sorting things? Thanks, Richard. > for gcc/ChangeLog > > * gimple-ssa-store-merging.c (INCLUDE_MAP): Define. > (struct imm_store_chain_info): Add seqno field and ctor parm. > (class pass_store_merging): Add m_store_seq field. > (pass_store_merging::terminate_and_process_all_chains): > Iterate over m_store_seq. > (pass_store_merging::terminate_all_aliasing_chains): > Likewise. > (pass_store_merging::terminate_and_release_chain): > Erase the m_store_seq entry. > (pass_store_merging::execute): Check for debug stmts first. > Compute seqno, add the m_store_seq entry. > --- > gcc/gimple-ssa-store-merging.c | 58 +++++++++++++++++++++++++++++----------- > 1 file changed, 42 insertions(+), 16 deletions(-) > > diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c > index 17ac94a..0bf7d89 100644 > --- a/gcc/gimple-ssa-store-merging.c > +++ b/gcc/gimple-ssa-store-merging.c > @@ -103,6 +103,7 @@ > [p + 4B] (16-bit) := 0xabcd; // val & 0x00000000ffff; */ > > #include "config.h" > +#define INCLUDE_MAP > #include "system.h" > #include "coretypes.h" > #include "backend.h" > @@ -675,11 +676,12 @@ merged_store_group::apply_stores () > > struct imm_store_chain_info > { > + unsigned seqno; > tree base_addr; > auto_vec<struct store_immediate_info *> m_store_info; > auto_vec<merged_store_group *> m_merged_store_groups; > > - imm_store_chain_info (tree b_a) : base_addr (b_a) {} > + imm_store_chain_info (unsigned seq, tree b_a) : seqno (seq), base_addr (b_a) {} > bool terminate_and_process_chain (); > bool coalesce_immediate_stores (); > bool output_merged_store (merged_store_group *); > @@ -718,6 +720,18 @@ public: > private: > hash_map<tree_operand_hash, struct imm_store_chain_info *> m_stores; > > + /* Use m_store_seq to order elements in m_stores, and iterate over > + them in a predictable way. It maps sequence numbers to the base > + addresses stored as keys in m_stores. Using this order avoids > + extraneous differences in the compiler output just because of > + tree pointer variations (e.g. different chains end up in > + different positions of m_stores, so they are handled in different > + orders, so they allocate or release SSA names in different > + orders, and when they get reused, subsequent passes end up > + getting different SSA names, which may ultimately change > + decisions when going out of SSA). */ > + std::map<unsigned, tree> m_store_seq; > + > bool terminate_and_process_all_chains (); > bool terminate_all_aliasing_chains (imm_store_chain_info **, > bool, gimple *); > @@ -730,11 +744,16 @@ private: > bool > pass_store_merging::terminate_and_process_all_chains () > { > - hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter > - = m_stores.begin (); > bool ret = false; > - for (; iter != m_stores.end (); ++iter) > - ret |= terminate_and_release_chain ((*iter).second); > + for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (), > + iter = next; iter != m_store_seq.end (); iter = next) > + { > + next++; > + tree base_addr = (*iter).second; > + ret |= terminate_and_release_chain (*m_stores.get (base_addr)); > + } > + gcc_assert (m_stores.elements () == 0); > + gcc_assert (m_store_seq.empty ()); > > return ret; > } > @@ -799,15 +818,17 @@ pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info > } > } > > - hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter > - = m_stores.begin (); > - > /* Check for aliasing with all other store chains. */ > - for (; iter != m_stores.end (); ++iter) > + for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (), > + iter = next; iter != m_store_seq.end (); iter = next) > { > + next++; > + unsigned seqno = (*iter).first; > + tree base_addr = (*iter).second; > + > /* We already checked all the stores in chain_info and terminated the > chain if necessary. Skip it here. */ > - if (chain_info && (*chain_info) == (*iter).second) > + if (chain_info && (*chain_info)->seqno == seqno) > continue; > > /* We can't use the base object here as that does not reliably exist. > @@ -815,11 +836,11 @@ pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info > minimum and maximum offset and the maximum size we could improve > things here). */ > ao_ref chain_ref; > - ao_ref_init_from_ptr_and_size (&chain_ref, (*iter).first, NULL_TREE); > + ao_ref_init_from_ptr_and_size (&chain_ref, base_addr, NULL_TREE); > if (ref_maybe_used_by_stmt_p (stmt, &chain_ref) > || stmt_may_clobber_ref_p_1 (stmt, &chain_ref)) > { > - terminate_and_release_chain ((*iter).second); > + terminate_and_release_chain (*m_stores.get (base_addr)); > ret = true; > } > } > @@ -835,6 +856,7 @@ bool > pass_store_merging::terminate_and_release_chain (imm_store_chain_info *chain_info) > { > bool ret = chain_info->terminate_and_process_chain (); > + m_store_seq.erase (chain_info->seqno); > m_stores.remove (chain_info->base_addr); > delete chain_info; > return ret; > @@ -1336,6 +1358,9 @@ pass_store_merging::execute (function *fun) > { > gimple *stmt = gsi_stmt (gsi); > > + if (is_gimple_debug (stmt)) > + continue; > + > if (gimple_has_volatile_ops (stmt)) > { > /* Terminate all chains. */ > @@ -1346,9 +1371,6 @@ pass_store_merging::execute (function *fun) > continue; > } > > - if (is_gimple_debug (stmt)) > - continue; > - > if (gimple_assign_single_p (stmt) && gimple_vdef (stmt) > && !stmt_can_throw_internal (stmt) > && lhs_valid_for_store_merging_p (gimple_assign_lhs (stmt))) > @@ -1454,13 +1476,17 @@ pass_store_merging::execute (function *fun) > > /* Store aliases any existing chain? */ > terminate_all_aliasing_chains (chain_info, false, stmt); > + unsigned next_seqno = m_store_seq.empty () ? 0 > + : m_store_seq.rbegin()->first + 1; > /* Start a new chain. */ > struct imm_store_chain_info *new_chain > - = new imm_store_chain_info (base_addr); > + = new imm_store_chain_info (next_seqno, base_addr); > info = new store_immediate_info (bitsize, bitpos, > stmt, 0); > new_chain->m_store_info.safe_push (info); > m_stores.put (base_addr, new_chain); > + m_store_seq.insert (m_store_seq.end (), > + std::make_pair (new_chain->seqno, base_addr)); > if (dump_file && (dump_flags & TDF_DETAILS)) > { > fprintf (dump_file, > > > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! -- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
On Mar 8, 2017, Richard Biener <richard.guenther@gmail.com> wrote: > On Wed, Mar 8, 2017 at 12:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote: >> Don't let pointer randomization change the order in which we process >> store chains. This may cause SSA_NAMEs to be released in different >> order, and if they're reused later, they may cause differences in SSA >> partitioning, leading to differences in expand, and ultimately to >> different code. > Any reason you chose to maintain a new hash-map instead of at > interesting places gather to-be-processed chains into a vector and > sort that after IDs? Where would we get such ids? AFAICT base_addrs don't have to be decls, they can be arbitrary expressions. > Traversing the hash-map still doesn't get you > a reliable order but only one depending on the chain creation and > thus stmt walk order True. That's what all we need in general: output depends on stable inputs only (relative order of stmts within BBs), not on random flukes like pointer values within the compiler process. > But it will for example still make testcase reduction fragile if > shrinking the hash-map by removing unrelated code ends up processing > things in different order. True, if you remove or move the stmts that initiate a chain within a BB, you will change the processing order. There are many passes that will process stmts or insns in the order they appear, so shuffling them will yield different SSA version assignments, pseudo numbers and whatnot. Why should this pass be held to a higher standard? > So - if we fix this can we fix it by properly sorting things? I don't see a way to do better, considering there's no real ID we could use for sorting. Do you? Even if we were to use decl IDs rather than pointers in the tree hashing, that would still make for impredictable ordering, because decl ids are not necessarily stable across e.g. debug and nondebug compilations. Their order is, but if they were to appear within more complex exprs, as we may have in this pass, hash computation would not ensure decl id ordering is preserved.
On Fri, Mar 10, 2017 at 12:02 AM, Alexandre Oliva <aoliva@redhat.com> wrote: > On Mar 8, 2017, Richard Biener <richard.guenther@gmail.com> wrote: > >> On Wed, Mar 8, 2017 at 12:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote: >>> Don't let pointer randomization change the order in which we process >>> store chains. This may cause SSA_NAMEs to be released in different >>> order, and if they're reused later, they may cause differences in SSA >>> partitioning, leading to differences in expand, and ultimately to >>> different code. > >> Any reason you chose to maintain a new hash-map instead of at >> interesting places gather to-be-processed chains into a vector and >> sort that after IDs? > > Where would we get such ids? AFAICT base_addrs don't have to be decls, > they can be arbitrary expressions. > >> Traversing the hash-map still doesn't get you >> a reliable order but only one depending on the chain creation and >> thus stmt walk order > > True. That's what all we need in general: output depends on stable > inputs only (relative order of stmts within BBs), not on random flukes > like pointer values within the compiler process. > >> But it will for example still make testcase reduction fragile if >> shrinking the hash-map by removing unrelated code ends up processing >> things in different order. > > True, if you remove or move the stmts that initiate a chain within a BB, > you will change the processing order. There are many passes that will > process stmts or insns in the order they appear, so shuffling them will > yield different SSA version assignments, pseudo numbers and whatnot. > Why should this pass be held to a higher standard? > >> So - if we fix this can we fix it by properly sorting things? > > I don't see a way to do better, considering there's no real ID we could > use for sorting. Do you? Just use the ID you use but instead of maintaining a hash-map and traverse that collect work items from the other hash-table into a vec and then sort after the ID. I was just thinking that if we're going all the way of having IDs then walking a hash-map of those IDs is not as good as we can do with similar cost? Richard. > > Even if we were to use decl IDs rather than pointers in the tree > hashing, that would still make for impredictable ordering, because decl > ids are not necessarily stable across e.g. debug and nondebug > compilations. Their order is, but if they were to appear within more > complex exprs, as we may have in this pass, hash computation would not > ensure decl id ordering is preserved. > > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! -- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
On Mar 10, 2017, Richard Biener <richard.guenther@gmail.com> wrote: >>> So - if we fix this can we fix it by properly sorting things? >> I don't see a way to do better, considering there's no real ID we could >> use for sorting. Do you? > Just use the ID you use but instead of maintaining a hash-map and traverse > that collect work items from the other hash-table into a vec and then sort > after the ID. Why is that better? You wrote 'hash-map' again, so I won't assume it was a mistake any more. I'm using a map for the seqno-to-chain mapping; that's a lot more predictable performance- and size-wise. Iterating over a hashmap requires one to go over all of the holes, and you get shuffled results that you then have to sort every time you'll iterate over the entries. You may have to do so multiple times, and if the number of chains is large (is that relevant in practice?) you may have to resize it multiple times. Whereas with the map I proposed, we always insert at the end, we iterate efficiently without having to do additional sorting, we most often remove from the head, and in the exceptional aliasing cases in which we remove from the middle, we do so without much regard to the element count. Now, if there's something you dislike about maps, we could make it a doubly-linked list, or maybe even a singly-linked list. I could replace seqno with a pointer to the next entry in the seq list. We'd just have to keep a pointer to the first entry instead of the newly-added map. We don't even need a pointer to the last element: we could make it work like a stack, rather than a queue, since the sequence of elements doesn't matter much, as long as we make it consistent. That would amount to more boilerplace list-maintenance code, but it's certainly doable, it costs just a pointer vs an int in the chains and an additional pointer over your suggestion, but it saves the shuffling and sorting. > I was just thinking that if we're going all the way of having IDs then > walking a hash-map of those IDs is not as good as we can do > with similar cost? You seem to consider walking a hash_map (which we did before) pretty costly, but then I'm not proposin us to do so; the performance features of a map are not the same as those of a hash_map, and I was proposing a map. Please reassess under this light, and considering the revised suggestion above.
diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c index 17ac94a..0bf7d89 100644 --- a/gcc/gimple-ssa-store-merging.c +++ b/gcc/gimple-ssa-store-merging.c @@ -103,6 +103,7 @@ [p + 4B] (16-bit) := 0xabcd; // val & 0x00000000ffff; */ #include "config.h" +#define INCLUDE_MAP #include "system.h" #include "coretypes.h" #include "backend.h" @@ -675,11 +676,12 @@ merged_store_group::apply_stores () struct imm_store_chain_info { + unsigned seqno; tree base_addr; auto_vec<struct store_immediate_info *> m_store_info; auto_vec<merged_store_group *> m_merged_store_groups; - imm_store_chain_info (tree b_a) : base_addr (b_a) {} + imm_store_chain_info (unsigned seq, tree b_a) : seqno (seq), base_addr (b_a) {} bool terminate_and_process_chain (); bool coalesce_immediate_stores (); bool output_merged_store (merged_store_group *); @@ -718,6 +720,18 @@ public: private: hash_map<tree_operand_hash, struct imm_store_chain_info *> m_stores; + /* Use m_store_seq to order elements in m_stores, and iterate over + them in a predictable way. It maps sequence numbers to the base + addresses stored as keys in m_stores. Using this order avoids + extraneous differences in the compiler output just because of + tree pointer variations (e.g. different chains end up in + different positions of m_stores, so they are handled in different + orders, so they allocate or release SSA names in different + orders, and when they get reused, subsequent passes end up + getting different SSA names, which may ultimately change + decisions when going out of SSA). */ + std::map<unsigned, tree> m_store_seq; + bool terminate_and_process_all_chains (); bool terminate_all_aliasing_chains (imm_store_chain_info **, bool, gimple *); @@ -730,11 +744,16 @@ private: bool pass_store_merging::terminate_and_process_all_chains () { - hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter - = m_stores.begin (); bool ret = false; - for (; iter != m_stores.end (); ++iter) - ret |= terminate_and_release_chain ((*iter).second); + for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (), + iter = next; iter != m_store_seq.end (); iter = next) + { + next++; + tree base_addr = (*iter).second; + ret |= terminate_and_release_chain (*m_stores.get (base_addr)); + } + gcc_assert (m_stores.elements () == 0); + gcc_assert (m_store_seq.empty ()); return ret; } @@ -799,15 +818,17 @@ pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info } } - hash_map<tree_operand_hash, struct imm_store_chain_info *>::iterator iter - = m_stores.begin (); - /* Check for aliasing with all other store chains. */ - for (; iter != m_stores.end (); ++iter) + for (std::map<unsigned, tree>::iterator next = m_store_seq.begin (), + iter = next; iter != m_store_seq.end (); iter = next) { + next++; + unsigned seqno = (*iter).first; + tree base_addr = (*iter).second; + /* We already checked all the stores in chain_info and terminated the chain if necessary. Skip it here. */ - if (chain_info && (*chain_info) == (*iter).second) + if (chain_info && (*chain_info)->seqno == seqno) continue; /* We can't use the base object here as that does not reliably exist. @@ -815,11 +836,11 @@ pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info minimum and maximum offset and the maximum size we could improve things here). */ ao_ref chain_ref; - ao_ref_init_from_ptr_and_size (&chain_ref, (*iter).first, NULL_TREE); + ao_ref_init_from_ptr_and_size (&chain_ref, base_addr, NULL_TREE); if (ref_maybe_used_by_stmt_p (stmt, &chain_ref) || stmt_may_clobber_ref_p_1 (stmt, &chain_ref)) { - terminate_and_release_chain ((*iter).second); + terminate_and_release_chain (*m_stores.get (base_addr)); ret = true; } } @@ -835,6 +856,7 @@ bool pass_store_merging::terminate_and_release_chain (imm_store_chain_info *chain_info) { bool ret = chain_info->terminate_and_process_chain (); + m_store_seq.erase (chain_info->seqno); m_stores.remove (chain_info->base_addr); delete chain_info; return ret; @@ -1336,6 +1358,9 @@ pass_store_merging::execute (function *fun) { gimple *stmt = gsi_stmt (gsi); + if (is_gimple_debug (stmt)) + continue; + if (gimple_has_volatile_ops (stmt)) { /* Terminate all chains. */ @@ -1346,9 +1371,6 @@ pass_store_merging::execute (function *fun) continue; } - if (is_gimple_debug (stmt)) - continue; - if (gimple_assign_single_p (stmt) && gimple_vdef (stmt) && !stmt_can_throw_internal (stmt) && lhs_valid_for_store_merging_p (gimple_assign_lhs (stmt))) @@ -1454,13 +1476,17 @@ pass_store_merging::execute (function *fun) /* Store aliases any existing chain? */ terminate_all_aliasing_chains (chain_info, false, stmt); + unsigned next_seqno = m_store_seq.empty () ? 0 + : m_store_seq.rbegin()->first + 1; /* Start a new chain. */ struct imm_store_chain_info *new_chain - = new imm_store_chain_info (base_addr); + = new imm_store_chain_info (next_seqno, base_addr); info = new store_immediate_info (bitsize, bitpos, stmt, 0); new_chain->m_store_info.safe_push (info); m_stores.put (base_addr, new_chain); + m_store_seq.insert (m_store_seq.end (), + std::make_pair (new_chain->seqno, base_addr)); if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file,