Message ID | CA+=Sn1mdyqOy+6UuuTXJH6UvtdNT2AS_VLWCHU_kK3kXg+apBQ@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Sun, Jun 25, 2017 at 11:28 PM, Andrew Pinski <pinskia@gmail.com> wrote: > On Sun, Jun 25, 2017 at 11:18 AM, Andrew Pinski <pinskia@gmail.com> wrote: >> On Sun, Jun 25, 2017 at 1:28 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> +(for cmp (gt ge lt le) >>> + outp (convert convert negate negate) >>> + outn (negate negate convert convert) >>> + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >>> + /* Transform (X >= 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >>> + /* Transform (X < 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >>> + /* Transform (X <= 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >>> + (simplify >>> + (cond (cmp @0 real_zerop) real_onep real_minus_onep) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) >>> + && types_match (type, TREE_TYPE (@0))) >>> + (switch >>> + (if (types_match (type, float_type_node)) >>> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (outp @0))) >>> + (if (types_match (type, double_type_node)) >>> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outp @0))) >>> + (if (types_match (type, long_double_type_node)) >>> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outp @0)))))) >>> >>> There is already a 1.0 of the right type in the input, it would be easier to >>> reuse it in the output than build a new one. >> >> Right. Fixed. >> >>> >>> Non-generic builtins like copysign are such a pain... We also end up missing >>> the 128-bit case that way (pre-existing problem, not your patch). We seem to >>> have a corresponding internal function, but apparently it is not used until >>> expansion (well, maybe during vectorization). >> >> Yes I noticed that while working on a different patch related to >> copysign; The generic version of a*copysign(1.0, b) [see the other >> thread where the ARM folks started a patch for it; yes it was by pure >> accident that I was working on this and really did not notice that >> thread until yesterday]. >> I was looking into a nice way of creating copysign without having to >> do the switch but I could not find one. In the end I copied was done >> already in a different location in match.pd; this is also the reason >> why I had the build_one_cst there. >> >>> >>> + /* Transform (X > 0.0 ? -1.0 : 1.0) into copysign(1,-X). */ >>> + /* Transform (X >= 0.0 ? -1.0 : 1.0) into copysign(1,-X). */ >>> + /* Transform (X < 0.0 ? -1.0 : 1.0) into copysign(1,X). */ >>> + /* Transform (X <= 0.0 ? -1.0 : 1.0) into copysign(1,X). */ >>> + (simplify >>> + (cond (cmp @0 real_zerop) real_minus_onep real_onep) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) >>> + && types_match (type, TREE_TYPE (@0))) >>> + (switch >>> + (if (types_match (type, float_type_node)) >>> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (outn @0))) >>> + (if (types_match (type, double_type_node)) >>> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outn @0))) >>> + (if (types_match (type, long_double_type_node)) >>> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outn @0))))))) >>> + >>> +/* Transform X * copysign (1.0, X) into abs(X). */ >>> +(simplify >>> + (mult:c @0 (COPYSIGN real_onep @0)) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type)) >>> + (abs @0))) >>> >>> I would have expected it do to the right thing for signed zero and qNaN. Can >>> you describe a case where it would give the wrong result, or are the >>> conditions just conservative? >> >> I was just being conservative; maybe too conservative but I was a bit >> worried I could get it incorrect. >> >>> >>> +/* Transform X * copysign (1.0, -X) into -abs(X). */ >>> +(simplify >>> + (mult:c @0 (COPYSIGN real_onep (negate @0))) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type)) >>> + (negate (abs @0)))) >>> + >>> +/* Transform copysign (-1.0, X) into copysign (1.0, X). */ >>> +(simplify >>> + (COPYSIGN real_minus_onep @0) >>> + (COPYSIGN { build_one_cst (type); } @0)) >>> >>> (simplify >>> (COPYSIGN REAL_CST@0 @1) >>> (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@0))) >>> (COPYSIGN (negate @0) @1))) >>> ? Or does that create trouble with sNaN and only the 1.0 case is worth >>> the trouble? >> >> No that is the correct way; I Noticed the other thread about copysign >> had something similar as what should be done too. >> >> I will send out a new patch after testing soon. > > New patch. > OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Ok. We can see if the discussion around using internal functions can end up improving things here later. We've several cases in match.pd dealing with generating of {,f,l} variants. Richard. > Thanks, > Andrew Pinski > > ChangeLog: > * match.pd (X >/>=/</<= 0 ? 1.0 : -1.0): New patterns. > (X * copysign (1.0, X)): New pattern. > (X * copysign (1.0, -X)): New pattern. > (copysign (-1.0, CST)): New pattern. > > testsuite/ChangeLog: > * gcc.dg/tree-ssa/copy-sign-1.c: New testcase. > * gcc.dg/tree-ssa/copy-sign-2.c: New testcase. > * gcc.dg/tree-ssa/mult-abs-2.c: New testcase. > >> >> Thanks, >> Andrew >> >>> >>> -- >>> Marc Glisse
> >> +(for cmp (gt ge lt le) > >> + outp (convert convert negate negate) > >> + outn (negate negate convert convert) > >> + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */ > >> + /* Transform (X >= 0.0 ? 1.0 : -1.0) into copysign(1, X). */ > >> + /* Transform (X < 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ > >> + /* Transform (X <= 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ > >> +(simplify > >> + (cond (cmp @0 real_zerop) real_onep real_minus_onep) > >> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) > >> + && types_match (type, TREE_TYPE (@0))) > >> + (switch > >> + (if (types_match (type, float_type_node)) > >> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (outp @0))) > >> + (if (types_match (type, double_type_node)) > >> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outp @0))) > >> + (if (types_match (type, long_double_type_node)) > >> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outp @0)))))) > >> Hi, Out of curiosity is there any reason why this transformation can't be more general? e.g. Transform (X > 0.0 ? CST : -CST) into copysign(CST, X). we would at the very least avoid a csel or a branch then. Regards, Tamar
On June 27, 2017 4:52:28 PM GMT+02:00, Tamar Christina <Tamar.Christina@arm.com> wrote: >> >> +(for cmp (gt ge lt le) >> >> + outp (convert convert negate negate) >> >> + outn (negate negate convert convert) >> >> + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >> >> + /* Transform (X >= 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >> >> + /* Transform (X < 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >> >> + /* Transform (X <= 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >> >> +(simplify >> >> + (cond (cmp @0 real_zerop) real_onep real_minus_onep) >> >> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) >> >> + && types_match (type, TREE_TYPE (@0))) >> >> + (switch >> >> + (if (types_match (type, float_type_node)) >> >> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (outp @0))) >> >> + (if (types_match (type, double_type_node)) >> >> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outp @0))) >> >> + (if (types_match (type, long_double_type_node)) >> >> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outp @0)))))) >> >> > >Hi, > >Out of curiosity is there any reason why this transformation can't be >more general? > >e.g. Transform (X > 0.0 ? CST : -CST) into copysign(CST, X). That's also possible, yes. >we would at the very least avoid a csel or a branch then. > >Regards, >Tamar
On Tue, Jun 27, 2017 at 7:56 AM, Richard Biener <richard.guenther@gmail.com> wrote: > On June 27, 2017 4:52:28 PM GMT+02:00, Tamar Christina <Tamar.Christina@arm.com> wrote: >>> >> +(for cmp (gt ge lt le) >>> >> + outp (convert convert negate negate) >>> >> + outn (negate negate convert convert) >>> >> + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >>> >> + /* Transform (X >= 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >>> >> + /* Transform (X < 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >>> >> + /* Transform (X <= 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >>> >> +(simplify >>> >> + (cond (cmp @0 real_zerop) real_onep real_minus_onep) >>> >> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) >>> >> + && types_match (type, TREE_TYPE (@0))) >>> >> + (switch >>> >> + (if (types_match (type, float_type_node)) >>> >> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (outp @0))) >>> >> + (if (types_match (type, double_type_node)) >>> >> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outp @0))) >>> >> + (if (types_match (type, long_double_type_node)) >>> >> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outp @0)))))) >>> >> >> >>Hi, >> >>Out of curiosity is there any reason why this transformation can't be >>more general? >> >>e.g. Transform (X > 0.0 ? CST : -CST) into copysign(CST, X). > > That's also possible, yes. I will be implementing that latter today. Thanks, Andrew Pinski > >>we would at the very least avoid a csel or a branch then. >> >>Regards, >>Tamar >
On Sun, Jun 25, 2017 at 2:28 PM, Andrew Pinski <pinskia@gmail.com> wrote: > On Sun, Jun 25, 2017 at 11:18 AM, Andrew Pinski <pinskia@gmail.com> wrote: >> On Sun, Jun 25, 2017 at 1:28 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> +(for cmp (gt ge lt le) >>> + outp (convert convert negate negate) >>> + outn (negate negate convert convert) >>> + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >>> + /* Transform (X >= 0.0 ? 1.0 : -1.0) into copysign(1, X). */ >>> + /* Transform (X < 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >>> + /* Transform (X <= 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ >>> + (simplify >>> + (cond (cmp @0 real_zerop) real_onep real_minus_onep) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) >>> + && types_match (type, TREE_TYPE (@0))) >>> + (switch >>> + (if (types_match (type, float_type_node)) >>> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (outp @0))) >>> + (if (types_match (type, double_type_node)) >>> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outp @0))) >>> + (if (types_match (type, long_double_type_node)) >>> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outp @0)))))) >>> >>> There is already a 1.0 of the right type in the input, it would be easier to >>> reuse it in the output than build a new one. >> >> Right. Fixed. >> >>> >>> Non-generic builtins like copysign are such a pain... We also end up missing >>> the 128-bit case that way (pre-existing problem, not your patch). We seem to >>> have a corresponding internal function, but apparently it is not used until >>> expansion (well, maybe during vectorization). >> >> Yes I noticed that while working on a different patch related to >> copysign; The generic version of a*copysign(1.0, b) [see the other >> thread where the ARM folks started a patch for it; yes it was by pure >> accident that I was working on this and really did not notice that >> thread until yesterday]. >> I was looking into a nice way of creating copysign without having to >> do the switch but I could not find one. In the end I copied was done >> already in a different location in match.pd; this is also the reason >> why I had the build_one_cst there. >> >>> >>> + /* Transform (X > 0.0 ? -1.0 : 1.0) into copysign(1,-X). */ >>> + /* Transform (X >= 0.0 ? -1.0 : 1.0) into copysign(1,-X). */ >>> + /* Transform (X < 0.0 ? -1.0 : 1.0) into copysign(1,X). */ >>> + /* Transform (X <= 0.0 ? -1.0 : 1.0) into copysign(1,X). */ >>> + (simplify >>> + (cond (cmp @0 real_zerop) real_minus_onep real_onep) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) >>> + && types_match (type, TREE_TYPE (@0))) >>> + (switch >>> + (if (types_match (type, float_type_node)) >>> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (outn @0))) >>> + (if (types_match (type, double_type_node)) >>> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outn @0))) >>> + (if (types_match (type, long_double_type_node)) >>> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outn @0))))))) >>> + >>> +/* Transform X * copysign (1.0, X) into abs(X). */ >>> +(simplify >>> + (mult:c @0 (COPYSIGN real_onep @0)) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type)) >>> + (abs @0))) >>> >>> I would have expected it do to the right thing for signed zero and qNaN. Can >>> you describe a case where it would give the wrong result, or are the >>> conditions just conservative? >> >> I was just being conservative; maybe too conservative but I was a bit >> worried I could get it incorrect. >> >>> >>> +/* Transform X * copysign (1.0, -X) into -abs(X). */ >>> +(simplify >>> + (mult:c @0 (COPYSIGN real_onep (negate @0))) >>> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type)) >>> + (negate (abs @0)))) >>> + >>> +/* Transform copysign (-1.0, X) into copysign (1.0, X). */ >>> +(simplify >>> + (COPYSIGN real_minus_onep @0) >>> + (COPYSIGN { build_one_cst (type); } @0)) >>> >>> (simplify >>> (COPYSIGN REAL_CST@0 @1) >>> (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@0))) >>> (COPYSIGN (negate @0) @1))) >>> ? Or does that create trouble with sNaN and only the 1.0 case is worth >>> the trouble? >> >> No that is the correct way; I Noticed the other thread about copysign >> had something similar as what should be done too. >> >> I will send out a new patch after testing soon. > > New patch. > OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. > > Thanks, > Andrew Pinski > > ChangeLog: > * match.pd (X >/>=/</<= 0 ? 1.0 : -1.0): New patterns. > (X * copysign (1.0, X)): New pattern. > (X * copysign (1.0, -X)): New pattern. > (copysign (-1.0, CST)): New pattern. > > testsuite/ChangeLog: > * gcc.dg/tree-ssa/copy-sign-1.c: New testcase. > * gcc.dg/tree-ssa/copy-sign-2.c: New testcase. > * gcc.dg/tree-ssa/mult-abs-2.c: New testcase. > This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81255
Index: match.pd =================================================================== --- match.pd (revision 249626) +++ match.pd (working copy) @@ -155,6 +155,58 @@ || !COMPLEX_FLOAT_TYPE_P (type))) (negate @0))) +(for cmp (gt ge lt le) + outp (convert convert negate negate) + outn (negate negate convert convert) + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */ + /* Transform (X >= 0.0 ? 1.0 : -1.0) into copysign(1, X). */ + /* Transform (X < 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ + /* Transform (X <= 0.0 ? 1.0 : -1.0) into copysign(1,-X). */ + (simplify + (cond (cmp @0 real_zerop) real_onep@1 real_minus_onep) + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) + && types_match (type, TREE_TYPE (@0))) + (switch + (if (types_match (type, float_type_node)) + (BUILT_IN_COPYSIGNF @1 (outp @0))) + (if (types_match (type, double_type_node)) + (BUILT_IN_COPYSIGN @1 (outp @0))) + (if (types_match (type, long_double_type_node)) + (BUILT_IN_COPYSIGNL @1 (outp @0)))))) + /* Transform (X > 0.0 ? -1.0 : 1.0) into copysign(1,-X). */ + /* Transform (X >= 0.0 ? -1.0 : 1.0) into copysign(1,-X). */ + /* Transform (X < 0.0 ? -1.0 : 1.0) into copysign(1,X). */ + /* Transform (X <= 0.0 ? -1.0 : 1.0) into copysign(1,X). */ + (simplify + (cond (cmp @0 real_zerop) real_minus_onep real_onep@1) + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type) + && types_match (type, TREE_TYPE (@0))) + (switch + (if (types_match (type, float_type_node)) + (BUILT_IN_COPYSIGNF @1 (outn @0))) + (if (types_match (type, double_type_node)) + (BUILT_IN_COPYSIGN @1 (outn @0))) + (if (types_match (type, long_double_type_node)) + (BUILT_IN_COPYSIGNL @1 (outn @0))))))) + +/* Transform X * copysign (1.0, X) into abs(X). */ +(simplify + (mult:c @0 (COPYSIGN real_onep @0)) + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type)) + (abs @0))) + +/* Transform X * copysign (1.0, -X) into -abs(X). */ +(simplify + (mult:c @0 (COPYSIGN real_onep (negate @0))) + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type)) + (negate (abs @0)))) + +/* Transform copysign (CST, X) into copysign (ABS(CST), X). */ +(simplify + (COPYSIGN REAL_CST@0 @1) + (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@0))) + (COPYSIGN (negate @0) @1))) + /* X * 1, X / 1 -> X. */ (for op (mult trunc_div ceil_div floor_div round_div exact_div) (simplify Index: testsuite/gcc.dg/tree-ssa/copy-sign-1.c =================================================================== --- testsuite/gcc.dg/tree-ssa/copy-sign-1.c (nonexistent) +++ testsuite/gcc.dg/tree-ssa/copy-sign-1.c (working copy) @@ -0,0 +1,35 @@ +/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */ +/* { dg-do compile } */ +float f(float x) +{ + return (x > 0.f ? -1.f : 1.f); +} +float f1(float x) +{ + return (x > 0.f ? 1.f : -1.f); +} +float g(float x) +{ + return (x >= 0.f ? -1.f : 1.f); +} +float g1(float x) +{ + return (x >= 0.f ? 1.f : -1.f); +} +float h(float x) +{ + return (x < 0.f ? -1.f : 1.f); +} +float h1(float x) +{ + return (x < 0.f ? 1.f : -1.f); +} +float i(float x) +{ + return (x <= 0.f ? -1.f : 1.f); +} +float i1(float x) +{ + return (x <= 0.f ? 1.f : -1.f); +} +/* { dg-final { scan-tree-dump-times "copysign" 8 "gimple"} } */ Index: testsuite/gcc.dg/tree-ssa/copy-sign-2.c =================================================================== --- testsuite/gcc.dg/tree-ssa/copy-sign-2.c (nonexistent) +++ testsuite/gcc.dg/tree-ssa/copy-sign-2.c (working copy) @@ -0,0 +1,13 @@ +/* { dg-options "-O2 -ffast-math -fdump-tree-optimized" } */ +/* { dg-do compile } */ +float f(float x) +{ + float t = __builtin_copysignf (1.0f, x); + return x * t; +} +float f1(float x) +{ + float t = __builtin_copysignf (1.0f, -x); + return x * t; +} +/* { dg-final { scan-tree-dump-times "ABS" 2 "optimized"} } */ Index: testsuite/gcc.dg/tree-ssa/mult-abs-2.c =================================================================== --- testsuite/gcc.dg/tree-ssa/mult-abs-2.c (nonexistent) +++ testsuite/gcc.dg/tree-ssa/mult-abs-2.c (working copy) @@ -0,0 +1,35 @@ +/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */ +/* { dg-do compile } */ +float f(float x) +{ + return x * (x > 0.f ? -1.f : 1.f); +} +float f1(float x) +{ + return x * (x > 0.f ? 1.f : -1.f); +} +float g(float x) +{ + return x * (x >= 0.f ? -1.f : 1.f); +} +float g1(float x) +{ + return x * (x >= 0.f ? 1.f : -1.f); +} +float h(float x) +{ + return x * (x < 0.f ? -1.f : 1.f); +} +float h1(float x) +{ + return x * (x < 0.f ? 1.f : -1.f); +} +float i(float x) +{ + return x * (x <= 0.f ? -1.f : 1.f); +} +float i1(float x) +{ + return x * (x <= 0.f ? 1.f : -1.f); +} +/* { dg-final { scan-tree-dump-times "ABS" 8 "gimple"} } */