Message ID | 56685C62.9070705@arm.com |
---|---|
State | New |
Headers | show |
On 12/09/2015 10:52 AM, Kyrill Tkachov wrote: > Hi Evandro, > > On 08/12/15 21:35, Evandro Menezes wrote: >> Emit square root using the Newton series >> >> 2015-12-03 Evandro Menezes <e.menezes@samsung.com> >> >> gcc/ >> * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt): >> Declare new >> function. >> * config/aarch64/aarch64-simd.md (sqrt<mode>2): New >> expansion and >> insn definitions. >> * config/aarch64/aarch64-tuning-flags.def >> (AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro. >> * config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define >> new function. >> * config/aarch64/aarch64.md (sqrt<mode>2): New expansion >> and insn >> definitions. >> * config/aarch64/aarch64.opt (mlow-precision-recip-sqrt): >> Expand option >> description. >> * doc/invoke.texi (mlow-precision-recip-sqrt): Likewise. >> >> This patch extends the patch that added support for implementing >> x^-1/2 using the Newton series by adding support for x^1/2 as well. >> >> Is it OK at this point of stage 3? >> > > A comment on the patch itself from me... > > diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def > b/gcc/config/aarch64/aarch64-tuning-flags.def > index 6f7dbce..11c6c9a 100644 > --- a/gcc/config/aarch64/aarch64-tuning-flags.def > +++ b/gcc/config/aarch64/aarch64-tuning-flags.def > @@ -30,4 +30,4 @@ > > AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS) > AARCH64_EXTRA_TUNING_OPTION ("recip_sqrt", RECIP_SQRT) > - > +AARCH64_EXTRA_TUNING_OPTION ("fast_sqrt", FAST_SQRT) > > That seems like a misleading name to me. > If we're doing this, that means that the sqrt instruction is not faster > than doing the inverse sqrt estimation followed by a multiply. > I think a name like "synth_sqrt" or "estimate_sqrt" or something along > those lines > is more appropriate. Unfortunately, this is the case on Exynos M1: the series is faster than the instruction. :-( So, other targets when this is also true, using the "fast_sqrt" option might make sense. Thank you,
diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def index 6f7dbce..11c6c9a 100644 --- a/gcc/config/aarch64/aarch64-tuning-flags.def +++ b/gcc/config/aarch64/aarch64-tuning-flags.def @@ -30,4 +30,4 @@ AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS) AARCH64_EXTRA_TUNING_OPTION ("recip_sqrt", RECIP_SQRT) - +AARCH64_EXTRA_TUNING_OPTION ("fast_sqrt", FAST_SQRT)