Message ID | 54B3EDC9.7080004@arm.com |
---|---|
State | New |
Headers | show |
On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: > Hi all, > > As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and > discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a > call to the library sqrt at -O0. To avoid that this patch uses a target > builtin for sqrt on DF mode and uses that to implement the intrinsic. > > With this patch I don't see sqrt calls being created at -O0 on a large > arm_neon.h testcase where they were generated before. > aarch64-none-elf testing and the intrinsics testsuite in particular are > clean. > Ok for trunk? Maybe have a target fold which folds this into sqrt if -fno-math-errno is supplied. This might be useful the -ffast-math case. Maybe also fold it when a constant is supplied too. Thanks, Andrew > > Thanks, > Kyrill > > 2015-01-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. > * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf > instead of __builtin_sqrt.
On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote: > On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: > > Hi all, > > > > As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and > > discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a > > call to the library sqrt at -O0. To avoid that this patch uses a target > > builtin for sqrt on DF mode and uses that to implement the intrinsic. > > > > With this patch I don't see sqrt calls being created at -O0 on a large > > arm_neon.h testcase where they were generated before. > > aarch64-none-elf testing and the intrinsics testsuite in particular are > > clean. > > Ok for trunk? > > Maybe have a target fold which folds this into sqrt if -fno-math-errno > is supplied. This might be useful the -ffast-math case. > Maybe also fold it when a constant is supplied too. Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0 in the way Kyrill proposed than languishing on a TODO list. Though an IOU ticket on bugzilla for the missed optimization seems a good idea to me. Unless Kyrill already has something in the works to address your comment, this looks like the right short-term solution to me (Though Marcus/Richard will have to approve it). Thanks, James > > 2015-01-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > > > * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. > > * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf > > instead of __builtin_sqrt. >
On 19/01/15 15:44, James Greenhalgh wrote: > On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote: >> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >>> Hi all, >>> >>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and >>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a >>> call to the library sqrt at -O0. To avoid that this patch uses a target >>> builtin for sqrt on DF mode and uses that to implement the intrinsic. >>> >>> With this patch I don't see sqrt calls being created at -O0 on a large >>> arm_neon.h testcase where they were generated before. >>> aarch64-none-elf testing and the intrinsics testsuite in particular are >>> clean. >>> Ok for trunk? >> Maybe have a target fold which folds this into sqrt if -fno-math-errno >> is supplied. This might be useful the -ffast-math case. >> Maybe also fold it when a constant is supplied too. > Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0 > in the way Kyrill proposed than languishing on a TODO list. Though an > IOU ticket on bugzilla for the missed optimization seems a good idea > to me. > > Unless Kyrill already has something in the works to address your > comment, this looks like the right short-term solution to me > (Though Marcus/Richard will have to approve it). Sorry, this slipped through the cracks. I agree with James. A missed-optimization issue on bugzilla would be helpful to keep track of this. Kyrill > > Thanks, > James > >>> 2015-01-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>> >>> * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. >>> * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf >>> instead of __builtin_sqrt.
On 19/01/15 15:46, Kyrill Tkachov wrote: > On 19/01/15 15:44, James Greenhalgh wrote: >> On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote: >>> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >>>> Hi all, >>>> >>>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and >>>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a >>>> call to the library sqrt at -O0. To avoid that this patch uses a target >>>> builtin for sqrt on DF mode and uses that to implement the intrinsic. >>>> >>>> With this patch I don't see sqrt calls being created at -O0 on a large >>>> arm_neon.h testcase where they were generated before. >>>> aarch64-none-elf testing and the intrinsics testsuite in particular are >>>> clean. >>>> Ok for trunk? >>> Maybe have a target fold which folds this into sqrt if -fno-math-errno >>> is supplied. This might be useful the -ffast-math case. >>> Maybe also fold it when a constant is supplied too. >> Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0 >> in the way Kyrill proposed than languishing on a TODO list. Though an >> IOU ticket on bugzilla for the missed optimization seems a good idea >> to me. >> >> Unless Kyrill already has something in the works to address your >> comment, this looks like the right short-term solution to me >> (Though Marcus/Richard will have to approve it). > Sorry, this slipped through the cracks. > I agree with James. A missed-optimization issue on bugzilla would be > helpful to keep track of this. I've filed PR 64821 to keep track of this for GCC 6. Can I ping https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html then? It's a regression fix at -O0 so should be appropriate for stage4 Thanks, Kyrill > > Kyrill > >> Thanks, >> James >> >>>> 2015-01-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>> >>>> * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. >>>> * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf >>>> instead of __builtin_sqrt. > >
Ping. https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html Thanks, Kyrill On 27/01/15 09:45, Kyrill Tkachov wrote: > On 19/01/15 15:46, Kyrill Tkachov wrote: >> On 19/01/15 15:44, James Greenhalgh wrote: >>> On Mon, Jan 12, 2015 at 05:30:46PM +0000, Andrew Pinski wrote: >>>> On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >>>>> Hi all, >>>>> >>>>> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and >>>>> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a >>>>> call to the library sqrt at -O0. To avoid that this patch uses a target >>>>> builtin for sqrt on DF mode and uses that to implement the intrinsic. >>>>> >>>>> With this patch I don't see sqrt calls being created at -O0 on a large >>>>> arm_neon.h testcase where they were generated before. >>>>> aarch64-none-elf testing and the intrinsics testsuite in particular are >>>>> clean. >>>>> Ok for trunk? >>>> Maybe have a target fold which folds this into sqrt if -fno-math-errno >>>> is supplied. This might be useful the -ffast-math case. >>>> Maybe also fold it when a constant is supplied too. >>> Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0 >>> in the way Kyrill proposed than languishing on a TODO list. Though an >>> IOU ticket on bugzilla for the missed optimization seems a good idea >>> to me. >>> >>> Unless Kyrill already has something in the works to address your >>> comment, this looks like the right short-term solution to me >>> (Though Marcus/Richard will have to approve it). >> Sorry, this slipped through the cracks. >> I agree with James. A missed-optimization issue on bugzilla would be >> helpful to keep track of this. > I've filed PR 64821 to keep track of this for GCC 6. > Can I ping https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html then? > It's a regression fix at -O0 so should be appropriate for stage4 > > Thanks, > Kyrill > >> Kyrill >> >>> Thanks, >>> James >>> >>>>> 2015-01-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>>> >>>>> * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. >>>>> * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf >>>>> instead of __builtin_sqrt. >> > >
On 12 January 2015 at 15:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: > Hi all, > > As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and > discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a > call to the library sqrt at -O0. To avoid that this patch uses a target > builtin for sqrt on DF mode and uses that to implement the intrinsic. > > With this patch I don't see sqrt calls being created at -O0 on a large > arm_neon.h testcase where they were generated before. > aarch64-none-elf testing and the intrinsics testsuite in particular are > clean. > Ok for trunk? > > Thanks, > Kyrill > > 2015-01-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. > * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf > instead of __builtin_sqrt. OK /Marcus
On 4 February 2015 at 12:38, Marcus Shawcroft <marcus.shawcroft@gmail.com> wrote: > On 12 January 2015 at 15:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >> Hi all, >> >> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and >> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a >> call to the library sqrt at -O0. To avoid that this patch uses a target >> builtin for sqrt on DF mode and uses that to implement the intrinsic. >> >> With this patch I don't see sqrt calls being created at -O0 on a large >> arm_neon.h testcase where they were generated before. >> aarch64-none-elf testing and the intrinsics testsuite in particular are >> clean. >> Ok for trunk? >> >> Thanks, >> Kyrill >> >> 2015-01-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. >> * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf >> instead of __builtin_sqrt. > > OK /Marcus Hi, I have noticed that this patch makes the following test fail: FAIL: gcc.dg/tree-ssa/foldconst-6.c scan-tree-dump-not ccp1 "666" on aarch64 targets. Christophe.
commit 865be1cc8365886904d571e244746815e2317162 Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Fri Jan 9 12:18:59 2015 +0000 [AArch64] Use target builtin for vsqrt_f64 diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index b41d9f6..60cd1d7 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -41,7 +41,7 @@ BUILTIN_VDC (COMBINE, combine, 0) BUILTIN_VB (BINOP, pmul, 0) - BUILTIN_VDQF (UNOP, sqrt, 2) + BUILTIN_VDQF_DF (UNOP, sqrt, 2) BUILTIN_VD_BHSI (BINOP, addp, 0) VAR1 (UNOP, addp, 0, di) BUILTIN_VDQ_BHSI (UNOP, clrsb, 2) diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index c679802..3b151a2 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -22194,7 +22194,7 @@ vsqrtq_f32 (float32x4_t a) __extension__ static __inline float64x1_t __attribute__ ((__always_inline__)) vsqrt_f64 (float64x1_t a) { - return (float64x1_t) { __builtin_sqrt (a[0]) }; + return (float64x1_t) { __builtin_aarch64_sqrtdf (a[0]) }; } __extension__ static __inline float64x2_t __attribute__ ((__always_inline__))