Message ID: DB6PR0801MB2053D71B2E2778D10399091E83830@DB6PR0801MB2053.eurprd08.prod.outlook.com
On Thu, 17 Aug 2017, Wilco Dijkstra wrote:
> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
return 1.0 in that case, but x * C1 == NaN). There's another existing
transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
not a new problem.
The whole set of these match.pd transforms is guarded by
flag_unsafe_math_optimizations, which is a bit strange: on the one hand
it does not include -ffinite-math-only, but on the other hand it's
defined broadly enough to imply that.
(to be clear, I'm not objecting to the patch, just pointing out an
existing inconsistency in case someone can offer clarifications)
Alexander
On Thu, Aug 17, 2017 at 5:43 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Thu, 17 Aug 2017, Wilco Dijkstra wrote:
>> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
>
> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
> return 1.0 in that case, but x * C1 == NaN). There's another existing
> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
> not a new problem.
>
> The whole set of these match.pd transforms is guarded by
> flag_unsafe_math_optimizations, which is a bit strange: on the one hand
> it does not include -ffinite-math-only, but on the other hand it's
> defined broadly enough to imply that.

No, flag_unsafe_math_optimizations doesn't imply -ffinite-math-only.
-funsafe-math-optimizations these days should guard transforms that affect
the value of the result (but not its classification).  There are more
specific variants like -fassociative-math which also affect results, but
in a more specific way.

> (to be clear, I'm not objecting to the patch, just pointing out an
> existing inconsistency in case someone can offer clarifications)

We shouldn't add such an issue knowingly; I guess for this case it's easy
to check that C is not +Inf.  For the existing transforms with the same
issue, guarding with -ffinite-math-only is an appropriate fix.

Richard.

> Alexander
On 08/17/2017 07:56 AM, Wilco Dijkstra wrote:
> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
> Do this only for fast-math as accuracy is reduced. This is much faster
> since pow is more complex than exp - with current GLIBC the speedup is
> more than 7 times for this transformation.

Right. exp is painful in glibc, but pow is *dramatically* more painful and
likely always will be.

Siddhesh did some great work in bringing those costs down in glibc, but the
more code we can reasonably shunt into exp instead of pow, the better.

It's likely pow will always be significantly more expensive than exp.  It's
also likely that predicting when these functions are going to go off the
fast paths is painful.

I think Richi already ack'd the V2 patch.  Just wanted to chime in given
I've been fairly deep on this in the past with customers on the glibc side.

jeff
On 08/17/2017 09:43 AM, Alexander Monakov wrote:
> On Thu, 17 Aug 2017, Wilco Dijkstra wrote:
>> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
>
> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
> return 1.0 in that case, but x * C1 == NaN). There's another existing
> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
> not a new problem.

That may be part of the reason why they're guarded with -ffast-math, in
addition to the changes in accuracy.  I haven't followed those
transformations on the GCC side to know with any certainty.

jeff
Jeff Law wrote:
> Right. exp is painful in glibc, but pow is *dramatically* more painful
> and likely always will be.
>
> Siddhesh did some great work in bringing those costs down in glibc but
> the more code we can reasonably shunt into exp instead of pow, the better.
>
> It's likely pow will always be significantly more expensive than exp.
> It's also likely that predicting when these functions are going to go
> off the fast paths is painful.

With a modern implementation there won't be any slow path - it's completely
unnecessary, and you can get a 100x speedup by simply doing things in a sane
way.  Szabolcs's version of powf is almost literally doing exp(log(x) * y),
so exp is about twice as fast as pow.

Wilco
diff --git a/gcc/match.pd b/gcc/match.pd
index 0e36f46b914bc63c257cef47152ab1aa507963e5..a614917c8c9f24bba20c521e9b5f558be4886813 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3622,6 +3622,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (logs (pows @0 @1))
    (mult @1 (logs @0))))

+ /* pow(C,x) -> exp(log(C)*x) if C > 0. */
+ (for pows (POW)
+      exps (EXP)
+      logs (LOG)
+  (simplify
+   (pows REAL_CST@0 @1)
+   (if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0))
+    (exps (mult (logs @0) @1)))))
+
 (for sqrts (SQRT)
      cbrts (CBRT)
      pows (POW)