Message ID | 20241028205719.685557-2-ak@linux.intel.com |
---|---|
State | New |
Headers | show |
Series | [v2,1/3] Disable -fbit-tests and -fjump-tables at -O0 | expand |
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen <ak@linux.intel.com> wrote: > > From: Andi Kleen <ak@gcc.gnu.org> > > The bit cluster code generation strategy is only beneficial when > multiple case labels point to the same code. Do a quick check if > that is the case before trying to cluster. > > This fixes the switch part of PR117091 where all case labels are unique > however it doesn't address the performance problems for non unique > cases. OK. Thanks, Richard. > gcc/ChangeLog: > > PR middle-end/117091 > * gimple-if-to-switch.cc (if_chain::is_beneficial): Update > find_bit_test call. > * tree-switch-conversion.cc (bit_test_cluster::find_bit_tests): > Get max_c argument and bail out early if all case labels are > unique. > (switch_decision_tree::compute_cases_per_edge): Record number of > targets per label and return. > (switch_decision_tree::analyze_switch_statement): ... pass to > find_bit_tests. > * tree-switch-conversion.h: Update prototypes. > --- > gcc/gimple-if-to-switch.cc | 2 +- > gcc/tree-switch-conversion.cc | 23 ++++++++++++++++------- > gcc/tree-switch-conversion.h | 5 +++-- > 3 files changed, 20 insertions(+), 10 deletions(-) > > diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc > index 96ce1c380a59..4151d1bb520e 100644 > --- a/gcc/gimple-if-to-switch.cc > +++ b/gcc/gimple-if-to-switch.cc > @@ -254,7 +254,7 @@ if_chain::is_beneficial () > else > output.release (); > > - output = bit_test_cluster::find_bit_tests (filtered_clusters); > + output = bit_test_cluster::find_bit_tests (filtered_clusters, 2); > r = output.length () < filtered_clusters.length (); > if (r) > dump_clusters (&output, "BT can be built"); > diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc > index 00426d400006..3436c2a8b98c 100644 > --- a/gcc/tree-switch-conversion.cc > +++ b/gcc/tree-switch-conversion.cc > @@ -1772,12 +1772,13 @@ jump_table_cluster::is_beneficial (const vec<cluster *> &, > } > > /* Find bit tests of given CLUSTERS, where all members of the vector > - are of type simple_cluster. New clusters are returned. */ > + are of type simple_cluster. MAX_C is the approx max number of cases per > + label. New clusters are returned. */ > > vec<cluster *> > -bit_test_cluster::find_bit_tests (vec<cluster *> &clusters) > +bit_test_cluster::find_bit_tests (vec<cluster *> &clusters, int max_c) > { > - if (!is_enabled ()) > + if (!is_enabled () || max_c == 1) > return clusters.copy (); > > unsigned l = clusters.length (); > @@ -2206,18 +2207,26 @@ bit_test_cluster::hoist_edge_and_branch_if_true (gimple_stmt_iterator *gsip, > } > > /* Compute the number of case labels that correspond to each outgoing edge of > - switch statement. Record this information in the aux field of the edge. */ > + switch statement. Record this information in the aux field of the edge. > + Return the approx max number of cases per edge. */ > > -void > +int > switch_decision_tree::compute_cases_per_edge () > { > + int max_c = 0; > reset_out_edges_aux (m_switch); > int ncases = gimple_switch_num_labels (m_switch); > for (int i = ncases - 1; i >= 1; --i) > { > edge case_edge = gimple_switch_edge (cfun, m_switch, i); > case_edge->aux = (void *) ((intptr_t) (case_edge->aux) + 1); > + /* For a range case add one extra. That's enough for the bit > + cluster heuristic. */ > + if ((intptr_t)case_edge->aux > max_c) > + max_c = (intptr_t)case_edge->aux + > + !!CASE_HIGH (gimple_switch_label (m_switch, i)); > } > + return max_c; > } > > /* Analyze switch statement and return true when the statement is expanded > @@ -2235,7 +2244,7 @@ switch_decision_tree::analyze_switch_statement () > m_case_bbs.reserve (l); > m_case_bbs.quick_push (default_bb); > > - compute_cases_per_edge (); > + int max_c = compute_cases_per_edge (); > > for (unsigned i = 1; i < l; i++) > { > @@ -2256,7 +2265,7 @@ switch_decision_tree::analyze_switch_statement () > reset_out_edges_aux (m_switch); > > /* Find bit-test clusters. */ > - vec<cluster *> output = bit_test_cluster::find_bit_tests (clusters); > + vec<cluster *> output = bit_test_cluster::find_bit_tests (clusters, max_c); > > /* Find jump table clusters. */ > vec<cluster *> output2; > diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h > index 6468995eb316..e6a85fa60258 100644 > --- a/gcc/tree-switch-conversion.h > +++ b/gcc/tree-switch-conversion.h > @@ -399,7 +399,7 @@ public: > > /* Find bit tests of given CLUSTERS, where all members of the vector > are of type simple_cluster. New clusters are returned. */ > - static vec<cluster *> find_bit_tests (vec<cluster *> &clusters); > + static vec<cluster *> find_bit_tests (vec<cluster *> &clusters, int max_c); > > /* Return true when RANGE of case values with UNIQ labels > can build a bit test. */ > @@ -576,8 +576,9 @@ public: > bool try_switch_expansion (vec<cluster *> &clusters); > /* Compute the number of case labels that correspond to each outgoing edge of > switch statement. Record this information in the aux field of the edge. > + Returns approx max number of cases per edge. > */ > - void compute_cases_per_edge (); > + int compute_cases_per_edge (); > > /* Before switch transformation, record all SSA_NAMEs defined in switch BB > and used in a label basic block. */ > -- > 2.46.2 >
diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc index 96ce1c380a59..4151d1bb520e 100644 --- a/gcc/gimple-if-to-switch.cc +++ b/gcc/gimple-if-to-switch.cc @@ -254,7 +254,7 @@ if_chain::is_beneficial () else output.release (); - output = bit_test_cluster::find_bit_tests (filtered_clusters); + output = bit_test_cluster::find_bit_tests (filtered_clusters, 2); r = output.length () < filtered_clusters.length (); if (r) dump_clusters (&output, "BT can be built"); diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index 00426d400006..3436c2a8b98c 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -1772,12 +1772,13 @@ jump_table_cluster::is_beneficial (const vec<cluster *> &, } /* Find bit tests of given CLUSTERS, where all members of the vector - are of type simple_cluster. New clusters are returned. */ + are of type simple_cluster. MAX_C is the approx max number of cases per + label. New clusters are returned. */ vec<cluster *> -bit_test_cluster::find_bit_tests (vec<cluster *> &clusters) +bit_test_cluster::find_bit_tests (vec<cluster *> &clusters, int max_c) { - if (!is_enabled ()) + if (!is_enabled () || max_c == 1) return clusters.copy (); unsigned l = clusters.length (); @@ -2206,18 +2207,26 @@ bit_test_cluster::hoist_edge_and_branch_if_true (gimple_stmt_iterator *gsip, } /* Compute the number of case labels that correspond to each outgoing edge of - switch statement. Record this information in the aux field of the edge. */ + switch statement. Record this information in the aux field of the edge. + Return the approx max number of cases per edge. */ -void +int switch_decision_tree::compute_cases_per_edge () { + int max_c = 0; reset_out_edges_aux (m_switch); int ncases = gimple_switch_num_labels (m_switch); for (int i = ncases - 1; i >= 1; --i) { edge case_edge = gimple_switch_edge (cfun, m_switch, i); case_edge->aux = (void *) ((intptr_t) (case_edge->aux) + 1); + /* For a range case add one extra. That's enough for the bit + cluster heuristic. */ + if ((intptr_t)case_edge->aux > max_c) + max_c = (intptr_t)case_edge->aux + + !!CASE_HIGH (gimple_switch_label (m_switch, i)); } + return max_c; } /* Analyze switch statement and return true when the statement is expanded @@ -2235,7 +2244,7 @@ switch_decision_tree::analyze_switch_statement () m_case_bbs.reserve (l); m_case_bbs.quick_push (default_bb); - compute_cases_per_edge (); + int max_c = compute_cases_per_edge (); for (unsigned i = 1; i < l; i++) { @@ -2256,7 +2265,7 @@ switch_decision_tree::analyze_switch_statement () reset_out_edges_aux (m_switch); /* Find bit-test clusters. */ - vec<cluster *> output = bit_test_cluster::find_bit_tests (clusters); + vec<cluster *> output = bit_test_cluster::find_bit_tests (clusters, max_c); /* Find jump table clusters. */ vec<cluster *> output2; diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h index 6468995eb316..e6a85fa60258 100644 --- a/gcc/tree-switch-conversion.h +++ b/gcc/tree-switch-conversion.h @@ -399,7 +399,7 @@ public: /* Find bit tests of given CLUSTERS, where all members of the vector are of type simple_cluster. New clusters are returned. */ - static vec<cluster *> find_bit_tests (vec<cluster *> &clusters); + static vec<cluster *> find_bit_tests (vec<cluster *> &clusters, int max_c); /* Return true when RANGE of case values with UNIQ labels can build a bit test. */ @@ -576,8 +576,9 @@ public: bool try_switch_expansion (vec<cluster *> &clusters); /* Compute the number of case labels that correspond to each outgoing edge of switch statement. Record this information in the aux field of the edge. + Returns approx max number of cases per edge. */ - void compute_cases_per_edge (); + int compute_cases_per_edge (); /* Before switch transformation, record all SSA_NAMEs defined in switch BB and used in a label basic block. */
From: Andi Kleen <ak@gcc.gnu.org> The bit cluster code generation strategy is only beneficial when multiple case labels point to the same code. Do a quick check if that is the case before trying to cluster. This fixes the switch part of PR117091 where all case labels are unique however it doesn't address the performance problems for non unique cases. gcc/ChangeLog: PR middle-end/117091 * gimple-if-to-switch.cc (if_chain::is_beneficial): Update find_bit_test call. * tree-switch-conversion.cc (bit_test_cluster::find_bit_tests): Get max_c argument and bail out early if all case labels are unique. (switch_decision_tree::compute_cases_per_edge): Record number of targets per label and return. (switch_decision_tree::analyze_switch_statement): ... pass to find_bit_tests. * tree-switch-conversion.h: Update prototypes. --- gcc/gimple-if-to-switch.cc | 2 +- gcc/tree-switch-conversion.cc | 23 ++++++++++++++++------- gcc/tree-switch-conversion.h | 5 +++-- 3 files changed, 20 insertions(+), 10 deletions(-)