Message ID | alpine.DEB.2.02.1604220507430.2722@laptop-mg.saclay.inria.fr |
---|---|
State | New |
Headers | show |
On Fri, Apr 22, 2016 at 5:29 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > Hello, > > this optimizes a common pattern for unsigned overflow detection, when one of > the arguments turns out to be a constant. There are more ways this could > look like, (a + 42 <= 41) in particular, but that'll be for another patch. This case is also covered by fold_comparison which should be re-written to match.pd patterns (and removed from fold-const.c). fold_binary also as a few interesting/similar equality compare cases like X +- Y CMP X to Y CMP 0 which look related. Also your case is in fold_binary for the case of undefined overflow: /* Transform comparisons of the form X +- C CMP X. */ if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR) && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0) && ((TREE_CODE (TREE_OPERAND (arg0, 1)) == REAL_CST && !HONOR_SNANS (arg0)) || (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg1))))) { ... +/* When one argument is a constant, overflow detection can be simplified. + Currently restricted to single use so as not to interfere too much with + ADD_OVERFLOW detection in tree-ssa-math-opts.c. */ +(for cmp (lt le ge gt) + out (gt gt le le) + (simplify + (cmp (plus@2 @0 integer_nonzerop@1) @0) + (if (TYPE_UNSIGNED (TREE_TYPE (@0)) + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) + && TYPE_MAX_VALUE (TREE_TYPE (@0)) + && single_use (@2)) + (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) +(for cmp (gt ge le lt) + out (gt gt le le) + (simplify + (cmp @0 (plus@2 @0 integer_nonzerop@1)) + (if (TYPE_UNSIGNED (TREE_TYPE (@0)) + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) + && TYPE_MAX_VALUE (TREE_TYPE (@0)) + && single_use (@2)) + (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) please add a comment with the actual transform - A + CST CMP A -> A CMP' CST' As we are relying on twos-complement wrapping you shouldn't need TYPE_MAX_VALUE here but you can use wi::max_value (precision, sign). I'm not sure we have sensible TYPE_MAX_VALUE for vector or complex types - the accessor uses NUMERICAL_TYPE_CKECK and TYPE_OVERFLOW_WRAPS checks for ANY_INTEGRAL_TYPE. Thus I wonder if we should restrict this to INTEGRAL_TYPE_P (making the wi::max_value route valid). Thanks, Richard. > Bootstrap+regtest on powerpc64le-unknown-linux-gnu. > 2016-04-22 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * match.pd (X + CST CMP X): New transformation. > > gcc/testsuite/ > * gcc.dg/tree-ssa/overflow-1.c: New testcase. > > -- > Marc Glisse
On Fri, 22 Apr 2016, Richard Biener wrote: > On Fri, Apr 22, 2016 at 5:29 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >> Hello, >> >> this optimizes a common pattern for unsigned overflow detection, when one of >> the arguments turns out to be a constant. There are more ways this could >> look like, (a + 42 <= 41) in particular, but that'll be for another patch. > > This case is also covered by fold_comparison which should be re-written > to match.pd patterns (and removed from fold-const.c). > > fold_binary also as a few interesting/similar equality compare cases > like X +- Y CMP X to Y CMP 0 which look related. > > Also your case is in fold_binary for the case of undefined overflow: As far as I can tell, fold-const.c handles this kind of transformation strictly in the case of undefined overflow (or floats), while this is strictly in the case of unsigned with wrapping overflow. I thought it would be more readable to take advantage of the genmatch machinery and group the wrapping transforms in one place, and the undefined overflow ones in another place (they don't group the same way by operator, etc). If you prefer to group by pattern shape and port the related fold-const.c bit at the same time, I could try that... > +/* When one argument is a constant, overflow detection can be simplified. > + Currently restricted to single use so as not to interfere too much with > + ADD_OVERFLOW detection in tree-ssa-math-opts.c. */ > +(for cmp (lt le ge gt) > + out (gt gt le le) > + (simplify > + (cmp (plus@2 @0 integer_nonzerop@1) @0) > + (if (TYPE_UNSIGNED (TREE_TYPE (@0)) > + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) > + && TYPE_MAX_VALUE (TREE_TYPE (@0)) > + && single_use (@2)) > + (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) > +(for cmp (gt ge le lt) > + out (gt gt le le) > + (simplify > + (cmp @0 (plus@2 @0 integer_nonzerop@1)) > + (if (TYPE_UNSIGNED (TREE_TYPE (@0)) > + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) > + && TYPE_MAX_VALUE (TREE_TYPE (@0)) > + && single_use (@2)) > + (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) > > please add a comment with the actual transform - A + CST CMP A -> A CMP' CST' > > As we are relying on twos-complement wrapping you shouldn't need TYPE_MAX_VALUE > here but you can use wi::max_value (precision, sign). I'm not sure we > have sensible > TYPE_MAX_VALUE for vector or complex types - the accessor uses > NUMERICAL_TYPE_CKECK > and TYPE_OVERFLOW_WRAPS checks for ANY_INTEGRAL_TYPE. Thus I wonder > if we should restrict this to INTEGRAL_TYPE_P (making the > wi::max_value route valid). integer_nonzerop currently already restricts to INTEGER_CST or COMPLEX_CST, and I don't think complex can appear in a comparison. I'll go back to writing the more explicit INTEGER_CST in the pattern and I'll use wide_int. Thanks,
Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 235350) +++ gcc/match.pd (working copy) @@ -3071,10 +3071,32 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (simplify /* signbit(x) -> 0 if x is nonnegative. */ (SIGNBIT tree_expr_nonnegative_p@0) { integer_zero_node; }) (simplify /* signbit(x) -> x<0 if x doesn't have signed zeros. */ (SIGNBIT @0) (if (!HONOR_SIGNED_ZEROS (@0)) (convert (lt @0 { build_real (TREE_TYPE (@0), dconst0); })))) + +/* When one argument is a constant, overflow detection can be simplified. + Currently restricted to single use so as not to interfere too much with + ADD_OVERFLOW detection in tree-ssa-math-opts.c. */ +(for cmp (lt le ge gt) + out (gt gt le le) + (simplify + (cmp (plus@2 @0 integer_nonzerop@1) @0) + (if (TYPE_UNSIGNED (TREE_TYPE (@0)) + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) + && TYPE_MAX_VALUE (TREE_TYPE (@0)) + && single_use (@2)) + (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) +(for cmp (gt ge le lt) + out (gt gt le le) + (simplify + (cmp @0 (plus@2 @0 integer_nonzerop@1)) + (if (TYPE_UNSIGNED (TREE_TYPE (@0)) + && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) + && TYPE_MAX_VALUE (TREE_TYPE (@0)) + && single_use (@2)) + (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) Index: gcc/testsuite/gcc.dg/tree-ssa/overflow-1.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/overflow-1.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/overflow-1.c (working copy) @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-optimized" } */ + +int f(unsigned a){ + unsigned b=5; + unsigned c=a-b; + return c>a; +} +int g(unsigned a){ + unsigned b=32; + unsigned c=a+b; + return c<a; +} + +/* { dg-final { scan-tree-dump "a_\[0-9\]+.D. <= 4;" "optimized" } } */ +/* { dg-final { scan-tree-dump "a_\[0-9\]+.D. > 4294967263;" "optimized" } } */