Message ID | DB6PR0801MB20531AD31F805B799B74E470834C0@DB6PR0801MB2053.eurprd08.prod.outlook.com |
---|---|
State | New |
Series | Canonicalize constant multiplies in division |
On Tue, Oct 17, 2017 at 6:32 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> This patch implements some of the optimizations discussed in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026.
>
> Canonicalize x / (C1 * y) into (x * C2) / y.
>
> This moves constant multiplies out of the RHS of a division in order
> to allow further simplifications (such as (C1 * x) / (C2 * y) ->
> (C3 * x) / y) and to enable more reciprocal CSEs.
>
> OK for commit?
>
> ChangeLog
> 2017-10-17 Wilco Dijkstra <wdijkstr@arm.com>
>            Jackson Woodruff <jackson.woodruff@arm.com>
>
> gcc/
>     PR 71026/tree-optimization
>     * match.pd: Canonicalize constant multiplies in division.
>
> gcc/testsuite/
>     PR 71026/tree-optimization
>     * gcc.dg/cse_recip.c: New test.
> --
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index ade851f78fb9ac6ce03b752f63e03f3b5a19cda9..532fabf51ce8a45d54147a3ae0b3917e22b1a4d0 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -342,10 +342,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    (negate @0)))
>
>  (if (flag_reciprocal_math)
> - /* Convert (A/B)/C to A/(B*C) */
> + /* Convert (A/B)/C to A/(B*C). */
>   (simplify
>    (rdiv (rdiv:s @0 @1) @2)
> -   (rdiv @0 (mult @1 @2)))
> +  (rdiv @0 (mult @1 @2)))
> +
> + /* Canonicalize x / (C1 * y) to (x * C2) / y. */
> + (if (optimize)

why if (optimize) here? The pattern you removed has no
such check. As discussed this may undo CSE of C1 * y
so please check for a single-use on the mult with :s

> +  (simplify
> +   (rdiv @0 (mult @1 REAL_CST@2))
> +   (if (!real_zerop (@1))

why this check? The pattern below didn't have it.

Richard.

> +    (with
> +     { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @2); }
> +     (if (tem)
> +      (rdiv (mult @0 { tem; } ) @1))))))
>
>  /* Convert A/(B/C) to (A/B)*C */
>  (simplify
> @@ -628,15 +638,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>     (if (tem)
>      (rdiv { tem; } @1)))))
>
> -/* Convert C1/(X*C2) into (C1/C2)/X */
> -(simplify
> - (rdiv REAL_CST@0 (mult @1 REAL_CST@2))
> - (if (flag_reciprocal_math)
> -  (with
> -   { tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
> -   (if (tem)
> -    (rdiv { tem; } @1)))))
> -
>  /* Simplify ~X & X as zero. */
>  (simplify
>   (bit_and:c (convert? @0) (convert? (bit_not @0)))
> diff --git a/gcc/testsuite/gcc.dg/cse_recip.c b/gcc/testsuite/gcc.dg/cse_recip.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..20ed529c33ebecc911fb540a8b2b597bba0023e6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/cse_recip.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -fdump-tree-optimized" } */
> +
> +void
> +cse_recip (float x, float y, float *a)
> +{
> +  a[0] = y / (5 * x);
> +  a[1] = y / (3 * x);
> +  a[2] = y / x;
> +}
> +
> +/* { dg-final { scan-tree-dump-times " / " 1 "optimized" } } */
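The single-use concern raised in this review can be illustrated with a small C sketch. The function names `shared_mult_before` and `shared_mult_after` and the helper `absf` are hypothetical, and the second function is a hand-written rendering of what the rewrite would produce when the product has another use, not actual compiler output.

```c
#include <assert.h>

/* Hypothetical helper: absolute value without needing libm. */
static float absf(float v) { return v < 0.0f ? -v : v; }

/* Source form: the product 5 * x has two uses, the division and the store. */
void shared_mult_before(float x, float y, float *a)
{
    float t = 5.0f * x;     /* one multiply, reused below */
    a[0] = y / t;           /* one divide */
    a[1] = t;
}

/* Hand-applied x / (C1 * y) -> (x * C2) / y with C2 = 1 / C1.
   The product 5 * x is still needed for a[1], so the rewrite adds
   a multiply instead of removing one. */
void shared_mult_after(float x, float y, float *a)
{
    float t = 5.0f * x;     /* still required by the store */
    a[0] = (y * 0.2f) / x;  /* extra multiply, still one divide */
    a[1] = t;
}

/* Returns 1 when both forms agree up to rounding for the given inputs. */
int shared_forms_agree(float x, float y)
{
    float p[2], q[2];
    shared_mult_before(x, y, p);
    shared_mult_after(x, y, q);
    return absf(p[0] - q[0]) <= 1e-5f * absf(p[0]) && p[1] == q[1];
}
```

In this shape the rewritten function performs two multiplies and one divide where the original needed one of each, which appears to be the case that a single-use check on the multiply is meant to exclude.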
Richard Biener wrote:
> On Tue, Oct 17, 2017 at 6:32 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:

>>  (if (flag_reciprocal_math)
>> - /* Convert (A/B)/C to A/(B*C) */
>> + /* Convert (A/B)/C to A/(B*C). */
>>   (simplify
>>    (rdiv (rdiv:s @0 @1) @2)
>> -   (rdiv @0 (mult @1 @2)))
>> +  (rdiv @0 (mult @1 @2)))
>> +
>> + /* Canonicalize x / (C1 * y) to (x * C2) / y. */
>> + (if (optimize)
>
> why if (optimize) here? The pattern you removed has no
> such check. As discussed this may undo CSE of C1 * y
> so please check for a single-use on the mult with :s

I think that came from an earlier version of this patch. I've removed it
and added a single use check.

>> +  (simplify
>> +   (rdiv @0 (mult @1 REAL_CST@2))
>> +   (if (!real_zerop (@1))
>
> why this check? The pattern below didn't have it.

Presumably to avoid the change when dividing by zero. I've removed it, here is
the updated version. This passes bootstrap and regress:

ChangeLog
2017-11-15 Wilco Dijkstra <wdijkstr@arm.com>
           Jackson Woodruff <jackson.woodruff@arm.com>

gcc/
    PR 71026/tree-optimization
    * match.pd: Canonicalize constant multiplies in division.

gcc/testsuite/
    PR 71026/tree-optimization
    * gcc.dg/cse_recip.c: New test.
--

diff --git a/gcc/match.pd b/gcc/match.pd
index b5042b783c0830a2da08c44bed39842a17911844..ea7d90ed977cfff991d74bee54e91ecb209b6030 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -344,10 +344,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (negate @0)))

 (if (flag_reciprocal_math)
- /* Convert (A/B)/C to A/(B*C) */
+ /* Convert (A/B)/C to A/(B*C). */
  (simplify
   (rdiv (rdiv:s @0 @1) @2)
-   (rdiv @0 (mult @1 @2)))
+  (rdiv @0 (mult @1 @2)))
+
+ /* Canonicalize x / (C1 * y) to (x * C2) / y. */
+ (simplify
+  (rdiv @0 (mult:s @1 REAL_CST@2))
+  (with
+   { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @2); }
+   (if (tem)
+    (rdiv (mult @0 { tem; } ) @1))))

 /* Convert A/(B/C) to (A/B)*C */
 (simplify
@@ -646,15 +654,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (if (tem)
     (rdiv { tem; } @1)))))

-/* Convert C1/(X*C2) into (C1/C2)/X */
-(simplify
- (rdiv REAL_CST@0 (mult @1 REAL_CST@2))
- (if (flag_reciprocal_math)
-  (with
-   { tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
-   (if (tem)
-    (rdiv { tem; } @1)))))
-
 /* Simplify ~X & X as zero. */
 (simplify
  (bit_and:c (convert? @0) (convert? (bit_not @0)))
diff --git a/gcc/testsuite/gcc.dg/cse_recip.c b/gcc/testsuite/gcc.dg/cse_recip.c
new file mode 100644
index 0000000000000000000000000000000000000000..88cba9930c0eb1fdee22a797eff110cd9a14fcda
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cse_recip.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized-raw" } */
+
+void
+cse_recip (float x, float y, float *a)
+{
+  a[0] = y / (5 * x);
+  a[1] = y / (3 * x);
+  a[2] = y / x;
+}
+
+/* { dg-final { scan-tree-dump-times "rdiv_expr" 1 "optimized" } } */
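To make the goal of the new test concrete, here is a hand-written C sketch of the shape the canonicalization is meant to enable: once every divisor has been reduced to `x`, a single reciprocal can serve all three results. The names `cse_recip_by_hand`, `forms_agree` and `absf` are hypothetical, and the second function is an assumption about the intended outcome, not a dump of what GCC emits.

```c
#include <assert.h>

/* Hypothetical helper: absolute value without needing libm. */
static float absf(float v) { return v < 0.0f ? -v : v; }

/* Source form, as in gcc.dg/cse_recip.c: three divisions. */
void cse_recip(float x, float y, float *a)
{
    a[0] = y / (5 * x);
    a[1] = y / (3 * x);
    a[2] = y / x;
}

/* Hand-optimized sketch: the constant factors move to the numerator
   as reciprocals, leaving x as the only divisor. */
void cse_recip_by_hand(float x, float y, float *a)
{
    float r = 1.0f / x;               /* the single remaining division */
    a[0] = (y * (1.0f / 5.0f)) * r;   /* 1/5 is a constant expression */
    a[1] = (y * (1.0f / 3.0f)) * r;   /* 1/3 is a constant expression */
    a[2] = y * r;
}

/* The two forms agree only up to rounding, which is why the pattern
   sits under flag_reciprocal_math. Returns 1 when they agree. */
int forms_agree(float x, float y)
{
    float p[3], q[3];
    cse_recip(x, y, p);
    cse_recip_by_hand(x, y, q);
    for (int i = 0; i < 3; i++)
        if (absf(p[i] - q[i]) > 1e-5f * absf(p[i]))
            return 0;
    return 1;
}
```

The results are compared with a relative tolerance rather than for equality because `1/3` and `1/5` are not exactly representable in binary floating point.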
On Wed, Nov 15, 2017 at 3:39 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Richard Biener wrote:
>> On Tue, Oct 17, 2017 at 6:32 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
>>>  (if (flag_reciprocal_math)
>>> - /* Convert (A/B)/C to A/(B*C) */
>>> + /* Convert (A/B)/C to A/(B*C). */
>>>   (simplify
>>>    (rdiv (rdiv:s @0 @1) @2)
>>> -   (rdiv @0 (mult @1 @2)))
>>> +  (rdiv @0 (mult @1 @2)))
>>> +
>>> + /* Canonicalize x / (C1 * y) to (x * C2) / y. */
>>> + (if (optimize)
>>
>> why if (optimize) here? The pattern you removed has no
>> such check. As discussed this may undo CSE of C1 * y
>> so please check for a single-use on the mult with :s
>
> I think that came from an earlier version of this patch. I've removed it
> and added a single use check.
>
>>> +  (simplify
>>> +   (rdiv @0 (mult @1 REAL_CST@2))
>>> +   (if (!real_zerop (@1))
>>
>> why this check? The pattern below didn't have it.
>
> Presumably to avoid the change when dividing by zero. I've removed it, here is
> the updated version. This passes bootstrap and regress:

Ok.

Richard.

>
> ChangeLog
> 2017-11-15 Wilco Dijkstra <wdijkstr@arm.com>
>            Jackson Woodruff <jackson.woodruff@arm.com>
>
> gcc/
>     PR 71026/tree-optimization
>     * match.pd: Canonicalize constant multiplies in division.
>
> gcc/testsuite/
>     PR 71026/tree-optimization
>     * gcc.dg/cse_recip.c: New test.
> --
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index b5042b783c0830a2da08c44bed39842a17911844..ea7d90ed977cfff991d74bee54e91ecb209b6030 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -344,10 +344,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    (negate @0)))
>
>  (if (flag_reciprocal_math)
> - /* Convert (A/B)/C to A/(B*C) */
> + /* Convert (A/B)/C to A/(B*C). */
>   (simplify
>    (rdiv (rdiv:s @0 @1) @2)
> -   (rdiv @0 (mult @1 @2)))
> +  (rdiv @0 (mult @1 @2)))
> +
> + /* Canonicalize x / (C1 * y) to (x * C2) / y. */
> + (simplify
> +  (rdiv @0 (mult:s @1 REAL_CST@2))
> +  (with
> +   { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @2); }
> +   (if (tem)
> +    (rdiv (mult @0 { tem; } ) @1))))
>
>  /* Convert A/(B/C) to (A/B)*C */
>  (simplify
> @@ -646,15 +654,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>     (if (tem)
>      (rdiv { tem; } @1)))))
>
> -/* Convert C1/(X*C2) into (C1/C2)/X */
> -(simplify
> - (rdiv REAL_CST@0 (mult @1 REAL_CST@2))
> - (if (flag_reciprocal_math)
> -  (with
> -   { tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
> -   (if (tem)
> -    (rdiv { tem; } @1)))))
> -
>  /* Simplify ~X & X as zero. */
>  (simplify
>   (bit_and:c (convert? @0) (convert? (bit_not @0)))
> diff --git a/gcc/testsuite/gcc.dg/cse_recip.c b/gcc/testsuite/gcc.dg/cse_recip.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..88cba9930c0eb1fdee22a797eff110cd9a14fcda
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/cse_recip.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -fdump-tree-optimized-raw" } */
> +
> +void
> +cse_recip (float x, float y, float *a)
> +{
> +  a[0] = y / (5 * x);
> +  a[1] = y / (3 * x);
> +  a[2] = y / x;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "rdiv_expr" 1 "optimized" } } */
diff --git a/gcc/match.pd b/gcc/match.pd
index ade851f78fb9ac6ce03b752f63e03f3b5a19cda9..532fabf51ce8a45d54147a3ae0b3917e22b1a4d0 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -342,10 +342,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (negate @0)))

 (if (flag_reciprocal_math)
- /* Convert (A/B)/C to A/(B*C) */
+ /* Convert (A/B)/C to A/(B*C). */
  (simplify
   (rdiv (rdiv:s @0 @1) @2)
-   (rdiv @0 (mult @1 @2)))
+  (rdiv @0 (mult @1 @2)))
+
+ /* Canonicalize x / (C1 * y) to (x * C2) / y. */
+ (if (optimize)
+  (simplify
+   (rdiv @0 (mult @1 REAL_CST@2))
+   (if (!real_zerop (@1))
+    (with
+     { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @2); }
+     (if (tem)
+      (rdiv (mult @0 { tem; } ) @1))))))

 /* Convert A/(B/C) to (A/B)*C */
 (simplify
@@ -628,15 +638,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (if (tem)
     (rdiv { tem; } @1)))))

-/* Convert C1/(X*C2) into (C1/C2)/X */
-(simplify
- (rdiv REAL_CST@0 (mult @1 REAL_CST@2))
- (if (flag_reciprocal_math)
-  (with
-   { tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
-   (if (tem)
-    (rdiv { tem; } @1)))))
-
 /* Simplify ~X & X as zero. */
 (simplify
  (bit_and:c (convert? @0) (convert? (bit_not @0)))
diff --git a/gcc/testsuite/gcc.dg/cse_recip.c b/gcc/testsuite/gcc.dg/cse_recip.c
new file mode 100644
index 0000000000000000000000000000000000000000..20ed529c33ebecc911fb540a8b2b597bba0023e6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cse_recip.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized" } */
+
+void
+cse_recip (float x, float y, float *a)
+{
+  a[0] = y / (5 * x);
+  a[1] = y / (3 * x);
+  a[2] = y / x;
+}
+
+/* { dg-final { scan-tree-dump-times " / " 1 "optimized" } } */
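The patch deletes the older `C1/(X*C2)` pattern. A plausible reading, offered here as an assumption and not something stated in the thread, is that the new canonicalization covers that case once the two constants in the numerator are folded together. The sketch below compares the constant each route would produce; `fold_old`, `fold_new`, `folds_agree` and `absf` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper: absolute value without needing libm. */
static float absf(float v) { return v < 0.0f ? -v : v; }

/* Removed pattern: C1 / (X * C2) -> (C1 / C2) / X.
   The numerator constant is a single rounded quotient. */
float fold_old(float c1, float c2) { return c1 / c2; }

/* New pattern on the same input: (C1 * (1 / C2)) / X.
   The numerator constant is C1 times a rounded reciprocal, so it
   can differ from fold_old in the last bits. */
float fold_new(float c1, float c2) { return c1 * (1.0f / c2); }

/* Returns 1 when both routes give the same constant up to rounding. */
int folds_agree(float c1, float c2)
{
    float a = fold_old(c1, c2);
    float b = fold_new(c1, c2);
    return absf(a - b) <= 1e-6f * absf(a);
}
```

When `C2` is a power of two the reciprocal is exact and both routes give the same constant; for other values they may differ by rounding, which is within what reciprocal math already permits.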