Message ID | 20241106234219.1503566-2-quic_apinski@quicinc.com |
---|---|
State | New |
Headers | show |
Series | VN predicate improvements | expand |
On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski <quic_apinski@quicinc.com> wrote: > > For `(a | b) == 0`, we can "assert" on the true edge that > both `a == 0` and `b == 0` but nothing on the false edge. > For `(a | b) != 0`, we can "assert" on the false edge that > both `a == 0` and `b == 0` but nothing on the true edge. > This adds that predicate and allows us to optimize f0, f1, > and f2 in fre-predicated-[12].c. > > Changes since v1: > * v2: Use vn_valueize. Also canonicalize the comparison > at the begining of insert_predicates_for_cond for > constants to be on the rhs. Return early for > non-ssa names on the lhs (after canonicalization). > > Bootstrapped and tested on x86_64-linux-gnu. > > PR tree-optimization/117414 > > gcc/ChangeLog: > > * tree-ssa-sccvn.cc (insert_predicates_for_cond): Canonicalize the comparison. > Don't insert anything if lhs is not a SSA_NAME. Handle `(a | b) !=/== 0`. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/fre-predicated-1.c: New test. > * gcc.dg/tree-ssa/fre-predicated-2.c: New test. > > Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> > --- > .../gcc.dg/tree-ssa/fre-predicated-1.c | 53 +++++++++++++++++++ > .../gcc.dg/tree-ssa/fre-predicated-2.c | 27 ++++++++++ > gcc/tree-ssa-sccvn.cc | 36 +++++++++++++ > 3 files changed, 116 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c > new file mode 100644 > index 00000000000..d56952f5f24 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c > @@ -0,0 +1,53 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > + > +/* PR tree-optimization/117414 */ > + > +/* Fre1 should figure out that `*aaa != 0` > + For f0, f1, and f2. */ > + > + > +void foo(); > +int f0(int *aaa, int j, int t) > +{ > + int b = *aaa; > + int c = b != 0; > + int d = t != 0; > + if (d | c) > + return 0; > + for(int i = 0; i < j; i++) > + { > + if (*aaa) foo(); > + } > + return 0; > +} > + > +int f1(int *aaa, int j, int t) > +{ > + int b = *aaa; > + if (b != 0 || t != 0) > + return 0; > + for(int i = 0; i < j; i++) > + { > + if (*aaa) foo(); > + } > + return 0; > +} > + > + > +int f2(int *aaa, int j, int t) > +{ > + int b = *aaa; > + if (b != 0) > + return 0; > + if (t != 0) > + return 0; > + for(int i = 0; i < j; i++) > + { > + if (*aaa) foo(); > + } > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */ > +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c > new file mode 100644 > index 00000000000..0123a5b54f7 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > + > +/* PR tree-optimization/117414 */ > + > +/* Fre1 should figure out that `*aaa != 0` > + For f0, f1, and f2. */ > + > + > +void foo(); > +int f0(int *aaa, int j, int t) > +{ > + int b = *aaa; > + int d = b | t; > + if (d == 0) > + ; > + else > + return 0; > + for(int i = 0; i < j; i++) > + { > + if (*aaa) foo(); > + } > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */ > +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */ > diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc > index a11bf968670..c6dddd0ba6d 100644 > --- a/gcc/tree-ssa-sccvn.cc > +++ b/gcc/tree-ssa-sccvn.cc > @@ -7901,6 +7901,21 @@ static void > insert_predicates_for_cond (tree_code code, tree lhs, tree rhs, > edge true_e, edge false_e) > { > + /* If both edges are null, then there is nothing to be done. */ > + if (!true_e && !false_e) > + return; > + > + /* Canonicalize the comparison so the rhs are constants. */ > + if (CONSTANT_CLASS_P (lhs)) > + { > + std::swap (lhs, rhs); > + code = swap_tree_comparison (code); > + } > + > + /* If the lhs is not a ssa name, don't record anything. */ > + if (TREE_CODE (lhs) != SSA_NAME) > + return; > + > tree_code icode = invert_tree_comparison (code, HONOR_NANS (lhs)); > tree ops[2]; > ops[0] = lhs; > @@ -7929,6 +7944,27 @@ insert_predicates_for_cond (tree_code code, tree lhs, tree rhs, > if (false_e) > insert_related_predicates_on_edge (icode, ops, false_e); > } > + if (integer_zerop (rhs) > + && (code == NE_EXPR || code == EQ_EXPR)) > + { > + gimple *def_stmt = SSA_NAME_DEF_STMT (lhs); > + /* (a | b) == 0 -> > + on true edge assert: a == 0 & b == 0. */ > + /* (a | b) != 0 -> > + on false edge assert: a == 0 & b == 0. */ > + if (is_gimple_assign (def_stmt) > + && gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR) > + { > + edge e = code == EQ_EXPR ? true_e : false_e; > + tree nlhs; > + > + nlhs = vn_valueize (gimple_assign_rhs1 (def_stmt)); Do we need to care for the case (do we want to avoid it) when both operands end up constant but we don't fold because of FP? (I'm not sure if that ever happens - the non-folding, that is) Anyway, the patch looks OK. Thanks, Richard. > + insert_predicates_for_cond (EQ_EXPR, nlhs, rhs, e, nullptr); > + > + nlhs = vn_valueize (gimple_assign_rhs2 (def_stmt)); > + insert_predicates_for_cond (EQ_EXPR, nlhs, rhs, e, nullptr); > + } > + } > } > > /* Main stmt worker for RPO VN, process BB. */ > -- > 2.43.0 >
On Thu, Nov 7, 2024 at 12:48 AM Richard Biener <richard.guenther@gmail.com> wrote: > > On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski <quic_apinski@quicinc.com> wrote: > > > > For `(a | b) == 0`, we can "assert" on the true edge that > > both `a == 0` and `b == 0` but nothing on the false edge. > > For `(a | b) != 0`, we can "assert" on the false edge that > > both `a == 0` and `b == 0` but nothing on the true edge. > > This adds that predicate and allows us to optimize f0, f1, > > and f2 in fre-predicated-[12].c. > > > > Changes since v1: > > * v2: Use vn_valueize. Also canonicalize the comparison > > at the begining of insert_predicates_for_cond for > > constants to be on the rhs. Return early for > > non-ssa names on the lhs (after canonicalization). > > > > Bootstrapped and tested on x86_64-linux-gnu. > > > > PR tree-optimization/117414 > > > > gcc/ChangeLog: > > > > * tree-ssa-sccvn.cc (insert_predicates_for_cond): Canonicalize the comparison. > > Don't insert anything if lhs is not a SSA_NAME. Handle `(a | b) !=/== 0`. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/tree-ssa/fre-predicated-1.c: New test. > > * gcc.dg/tree-ssa/fre-predicated-2.c: New test. > > > > Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> > > --- > > .../gcc.dg/tree-ssa/fre-predicated-1.c | 53 +++++++++++++++++++ > > .../gcc.dg/tree-ssa/fre-predicated-2.c | 27 ++++++++++ > > gcc/tree-ssa-sccvn.cc | 36 +++++++++++++ > > 3 files changed, 116 insertions(+) > > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c > > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c > > new file mode 100644 > > index 00000000000..d56952f5f24 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c > > @@ -0,0 +1,53 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > > + > > +/* PR tree-optimization/117414 */ > > + > > +/* Fre1 should figure out that `*aaa != 0` > > + For f0, f1, and f2. */ > > + > > + > > +void foo(); > > +int f0(int *aaa, int j, int t) > > +{ > > + int b = *aaa; > > + int c = b != 0; > > + int d = t != 0; > > + if (d | c) > > + return 0; > > + for(int i = 0; i < j; i++) > > + { > > + if (*aaa) foo(); > > + } > > + return 0; > > +} > > + > > +int f1(int *aaa, int j, int t) > > +{ > > + int b = *aaa; > > + if (b != 0 || t != 0) > > + return 0; > > + for(int i = 0; i < j; i++) > > + { > > + if (*aaa) foo(); > > + } > > + return 0; > > +} > > + > > + > > +int f2(int *aaa, int j, int t) > > +{ > > + int b = *aaa; > > + if (b != 0) > > + return 0; > > + if (t != 0) > > + return 0; > > + for(int i = 0; i < j; i++) > > + { > > + if (*aaa) foo(); > > + } > > + return 0; > > +} > > + > > +/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */ > > +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */ > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c > > new file mode 100644 > > index 00000000000..0123a5b54f7 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c > > @@ -0,0 +1,27 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > > + > > +/* PR tree-optimization/117414 */ > > + > > +/* Fre1 should figure out that `*aaa != 0` > > + For f0, f1, and f2. */ > > + > > + > > +void foo(); > > +int f0(int *aaa, int j, int t) > > +{ > > + int b = *aaa; > > + int d = b | t; > > + if (d == 0) > > + ; > > + else > > + return 0; > > + for(int i = 0; i < j; i++) > > + { > > + if (*aaa) foo(); > > + } > > + return 0; > > +} > > + > > +/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */ > > +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */ > > diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc > > index a11bf968670..c6dddd0ba6d 100644 > > --- a/gcc/tree-ssa-sccvn.cc > > +++ b/gcc/tree-ssa-sccvn.cc > > @@ -7901,6 +7901,21 @@ static void > > insert_predicates_for_cond (tree_code code, tree lhs, tree rhs, > > edge true_e, edge false_e) > > { > > + /* If both edges are null, then there is nothing to be done. */ > > + if (!true_e && !false_e) > > + return; > > + > > + /* Canonicalize the comparison so the rhs are constants. */ > > + if (CONSTANT_CLASS_P (lhs)) > > + { > > + std::swap (lhs, rhs); > > + code = swap_tree_comparison (code); > > + } > > + > > + /* If the lhs is not a ssa name, don't record anything. */ > > + if (TREE_CODE (lhs) != SSA_NAME) > > + return; > > + > > tree_code icode = invert_tree_comparison (code, HONOR_NANS (lhs)); > > tree ops[2]; > > ops[0] = lhs; > > @@ -7929,6 +7944,27 @@ insert_predicates_for_cond (tree_code code, tree lhs, tree rhs, > > if (false_e) > > insert_related_predicates_on_edge (icode, ops, false_e); > > } > > + if (integer_zerop (rhs) > > + && (code == NE_EXPR || code == EQ_EXPR)) > > + { > > + gimple *def_stmt = SSA_NAME_DEF_STMT (lhs); > > + /* (a | b) == 0 -> > > + on true edge assert: a == 0 & b == 0. */ > > + /* (a | b) != 0 -> > > + on false edge assert: a == 0 & b == 0. */ > > + if (is_gimple_assign (def_stmt) > > + && gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR) > > + { > > + edge e = code == EQ_EXPR ? true_e : false_e; > > + tree nlhs; > > + > > + nlhs = vn_valueize (gimple_assign_rhs1 (def_stmt)); > > Do we need to care for the case (do we want to avoid it) when both > operands end up constant but we don't fold because of FP? (I'm not > sure if that ever happens - the non-folding, that is) I think with this patch alone, we won't get `CONST CMP CONST` but with the patch that adds `(a CMP b) !=/== 0` might in the end get `CONST CMP CONST` (I thought I saw that case previously but I didn't do much debugging and just added the check and moved it to be part of this patch). Thanks, Andrew > > Anyway, the patch looks OK. > > Thanks, > Richard. > > > + insert_predicates_for_cond (EQ_EXPR, nlhs, rhs, e, nullptr); > > + > > + nlhs = vn_valueize (gimple_assign_rhs2 (def_stmt)); > > + insert_predicates_for_cond (EQ_EXPR, nlhs, rhs, e, nullptr); > > + } > > + } > > } > > > > /* Main stmt worker for RPO VN, process BB. */ > > -- > > 2.43.0 > >
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c new file mode 100644 index 00000000000..d56952f5f24 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c @@ -0,0 +1,53 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +/* PR tree-optimization/117414 */ + +/* Fre1 should figure out that `*aaa != 0` + For f0, f1, and f2. */ + + +void foo(); +int f0(int *aaa, int j, int t) +{ + int b = *aaa; + int c = b != 0; + int d = t != 0; + if (d | c) + return 0; + for(int i = 0; i < j; i++) + { + if (*aaa) foo(); + } + return 0; +} + +int f1(int *aaa, int j, int t) +{ + int b = *aaa; + if (b != 0 || t != 0) + return 0; + for(int i = 0; i < j; i++) + { + if (*aaa) foo(); + } + return 0; +} + + +int f2(int *aaa, int j, int t) +{ + int b = *aaa; + if (b != 0) + return 0; + if (t != 0) + return 0; + for(int i = 0; i < j; i++) + { + if (*aaa) foo(); + } + return 0; +} + +/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */ +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c new file mode 100644 index 00000000000..0123a5b54f7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +/* PR tree-optimization/117414 */ + +/* Fre1 should figure out that `*aaa != 0` + For f0, f1, and f2. */ + + +void foo(); +int f0(int *aaa, int j, int t) +{ + int b = *aaa; + int d = b | t; + if (d == 0) + ; + else + return 0; + for(int i = 0; i < j; i++) + { + if (*aaa) foo(); + } + return 0; +} + +/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */ +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */ diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc index a11bf968670..c6dddd0ba6d 100644 --- a/gcc/tree-ssa-sccvn.cc +++ b/gcc/tree-ssa-sccvn.cc @@ -7901,6 +7901,21 @@ static void insert_predicates_for_cond (tree_code code, tree lhs, tree rhs, edge true_e, edge false_e) { + /* If both edges are null, then there is nothing to be done. */ + if (!true_e && !false_e) + return; + + /* Canonicalize the comparison so the rhs are constants. */ + if (CONSTANT_CLASS_P (lhs)) + { + std::swap (lhs, rhs); + code = swap_tree_comparison (code); + } + + /* If the lhs is not a ssa name, don't record anything. */ + if (TREE_CODE (lhs) != SSA_NAME) + return; + tree_code icode = invert_tree_comparison (code, HONOR_NANS (lhs)); tree ops[2]; ops[0] = lhs; @@ -7929,6 +7944,27 @@ insert_predicates_for_cond (tree_code code, tree lhs, tree rhs, if (false_e) insert_related_predicates_on_edge (icode, ops, false_e); } + if (integer_zerop (rhs) + && (code == NE_EXPR || code == EQ_EXPR)) + { + gimple *def_stmt = SSA_NAME_DEF_STMT (lhs); + /* (a | b) == 0 -> + on true edge assert: a == 0 & b == 0. */ + /* (a | b) != 0 -> + on false edge assert: a == 0 & b == 0. */ + if (is_gimple_assign (def_stmt) + && gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR) + { + edge e = code == EQ_EXPR ? true_e : false_e; + tree nlhs; + + nlhs = vn_valueize (gimple_assign_rhs1 (def_stmt)); + insert_predicates_for_cond (EQ_EXPR, nlhs, rhs, e, nullptr); + + nlhs = vn_valueize (gimple_assign_rhs2 (def_stmt)); + insert_predicates_for_cond (EQ_EXPR, nlhs, rhs, e, nullptr); + } + } } /* Main stmt worker for RPO VN, process BB. */
For `(a | b) == 0`, we can "assert" on the true edge that both `a == 0` and `b == 0` but nothing on the false edge. For `(a | b) != 0`, we can "assert" on the false edge that both `a == 0` and `b == 0` but nothing on the true edge. This adds that predicate and allows us to optimize f0, f1, and f2 in fre-predicated-[12].c. Changes since v1: * v2: Use vn_valueize. Also canonicalize the comparison at the begining of insert_predicates_for_cond for constants to be on the rhs. Return early for non-ssa names on the lhs (after canonicalization). Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/117414 gcc/ChangeLog: * tree-ssa-sccvn.cc (insert_predicates_for_cond): Canonicalize the comparison. Don't insert anything if lhs is not a SSA_NAME. Handle `(a | b) !=/== 0`. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/fre-predicated-1.c: New test. * gcc.dg/tree-ssa/fre-predicated-2.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> --- .../gcc.dg/tree-ssa/fre-predicated-1.c | 53 +++++++++++++++++++ .../gcc.dg/tree-ssa/fre-predicated-2.c | 27 ++++++++++ gcc/tree-ssa-sccvn.cc | 36 +++++++++++++ 3 files changed, 116 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-1.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fre-predicated-2.c