diff mbox

[v2] Simplify pow with constant

Message ID DB6PR0801MB2053D71B2E2778D10399091E83830@DB6PR0801MB2053.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Wilco Dijkstra Aug. 17, 2017, 1:56 p.m. UTC
This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
Do this only for fast-math as accuracy is reduced.  This is much faster
since pow is more complex than exp - with current GLIBC the speedup is
more than 7 times for this transformation.

The worst-case ULP error of the transformation for powf (10.0, x) in SPEC
was 2.5.  If we allow use of exp10 in match.pd, the ULP error would be lower.

OK for commit?

ChangeLog:
2017-08-17  Wilco Dijkstra  <wdijkstr@arm.com>

	* match.pd: Add pow (C, x) simplification.
--

Comments

Alexander Monakov Aug. 17, 2017, 3:43 p.m. UTC | #1
On Thu, 17 Aug 2017, Wilco Dijkstra wrote:

> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).

Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
return 1.0 in that case, but x * C1 == NaN).  There's another existing
transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
not a new problem.

The whole set of these match.pd transforms is guarded by
flag_unsafe_math_optimizations, which is a bit strange, on the one hand
it does not include -ffinite-math-only, but on the other hand it's
defined broadly enough to imply that.

(to be clear, I'm not objecting to the patch, just pointing out an
existing inconsistency in case someone can offer clarifications)

Alexander
Richard Biener Aug. 18, 2017, 8:02 a.m. UTC | #2
On Thu, Aug 17, 2017 at 5:43 PM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Thu, 17 Aug 2017, Wilco Dijkstra wrote:
>
>> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
>
> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
> return 1.0 in that case, but x * C1 == NaN).  There's another existing
> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
> not a new problem.
>
> The whole set of these match.pd transforms is guarded by
> flag_unsafe_math_optimizations, which is a bit strange, on the one hand
> it does not include -ffinite-math-only, but on the other hand it's
> defined broadly enough to imply that.

No, flag_unsafe_math_optimization doesn't imply -ffinite-math-only.
-funsafe-math-optimization these days should guard transforms that
affect the value of the result (but not its classification).  There are
more specific variants like -fassociative-math which also affects
results but in a more specific way.

> (to be clear, I'm not objecting to the patch, just pointing out an
> existing inconsistency in case someone can offer clarifications)

We shouldn't add such issue knowingly, I guess for this case it's easy
to check that C is not +Inf.

For the existing transforms with the same issue guarding with
-ffinite-math-only is an appropriate fix.

Richard.

> Alexander
Jeff Law Aug. 24, 2017, 10:22 p.m. UTC | #3
On 08/17/2017 07:56 AM, Wilco Dijkstra wrote:
> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
> Do this only for fast-math as accuracy is reduced.  This is much faster
> since pow is more complex than exp - with current GLIBC the speedup is
> more than 7 times for this transformation.
Right.  exp is painful in glibc, but pow is *dramatically* more painful
and likely always will be.

Siddhesh did some great work in bringing those costs down in glibc but
the more code we can reasonably shunt into exp instead of pow, the better.

It's likely pow will always be significantly more expensive than exp.
It's also likely that predicting when these functions are going to go
off the fast paths is painful.

I think Richi already ack'd the V2 patch.  Just wanted to chime in given
I've been fairly deep on this in the past with customers on the glibc side.

jeff
Jeff Law Aug. 24, 2017, 10:22 p.m. UTC | #4
On 08/17/2017 09:43 AM, Alexander Monakov wrote:
> On Thu, 17 Aug 2017, Wilco Dijkstra wrote:
> 
>> This patch simplifies pow (C, x) into exp (x * C1) if C > 0, C1 = log (C).
> 
> Note this changes the outcome for C == +Inf, x == 0 (pow is specified to
> return 1.0 in that case, but x * C1 == NaN).  There's another existing
> transform with the same issue, 'pow(expN(x), y) -> expN(x*y)', so this is
> not a new problem.
That may be part of the reason why they're guarded with -ffast-math.  In
addition to the changes in accuracy.  I haven't followed those
transformations on the GCC side to know with any certainty.

jeff
Wilco Dijkstra Aug. 25, 2017, 1:23 p.m. UTC | #5
Jeff Law wrote:
> Right.  exp is painful in glibc, but pow is *dramatically* more painful
> and likely always will be.
>
> Siddhesh did some great work in bringing those costs down in glibc but
> the more code we can reasonably shunt into exp instead of pow, the better.
>
> It's likely pow will always be significantly more expensive than exp.
> It's also likely that predicting when these functions are going to go
> off the fast paths is painful.

With a modern implementation there won't be any slow path - it's completely
unnecessary, and you can get 100x speedup by simply doing things in a
sane way.

Szabolc's version of powf is almost literally doing exp(log(x) * y), so exp is
about twice as fast as pow.

Wilco
diff mbox

Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index 0e36f46b914bc63c257cef47152ab1aa507963e5..a614917c8c9f24bba20c521e9b5f558be4886813 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3622,6 +3622,15 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (logs (pows @0 @1))
    (mult @1 (logs @0))))
 
+ /* pow(C,x) -> exp(log(C)*x) if C > 0.  */
+ (for pows (POW)
+      exps (EXP)
+      logs (LOG)
+  (simplify
+   (pows REAL_CST@0 @1)
+    (if (real_compare (GT_EXPR, TREE_REAL_CST_PTR (@0), &dconst0))
+     (exps (mult (logs @0) @1)))))
+
  (for sqrts (SQRT)
       cbrts (CBRT)
       pows (POW)