diff mbox

[AArch64] Inline calls to lrint when possible

Message ID VI1PR0801MB20312637867C59D802147A5CFFC80@VI1PR0801MB2031.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Tamar Christina June 7, 2017, 11:38 a.m. UTC
Hi All,

This patch allows the inlining of lrint when -fno-math-errno
assuming that errno does not need to be set when the rounded value
is not representable as a long.

The case

void f(double *a, long *b, double x)
{
	*a = __builtin_rint(x);
	*b = __builtin_lrint(x);
}

now generates with -fno-math-errno:

f:
	frintx	d0, d0
	fcvtzs	x2, d0
	str	d0, [x0]
	str	x2, [x1]
	ret

When the flag is not used the same function call is emitted as before:

f:
	stp	x29, x30, [sp, -32]!
	frintx	d1, d0
	add	x29, sp, 0
	str	x19, [sp, 16]
	mov	x19, x1
	str	d1, [x0]
	bl	lrint
	str	x0, [x19]
	ldr	x19, [sp, 16]
	ldp	x29, x30, [sp], 32
	ret

Bootstrapped and regtested on aarch64-none-linux-gnu and no regressions.
The patch also has no regressions on Spec2006.

Ok for trunk?

gcc/
2017-06-07  Tamar Christina  <tamar.christina@arm.com>

	* config/aarch64/aarch64.md (lrint<GPF:mode><GPI:mode>2): New.

gcc/testsuite/
2017-06-07  Tamar Christina  <tamar.christina@arm.com>

	* gcc.target/aarch64/lrint-matherr.h: New.
	* gcc.target/aarch64/inline-lrint_1.c: New.
	* gcc.target/aarch64/inline-lrint_2.c: New.
	* gcc.target/aarch64/no-inline-lrint_1.c: New.
	* gcc.target/aarch64/no-inline-lrint_2.c: New.

Comments

James Greenhalgh June 8, 2017, 4:50 p.m. UTC | #1
On Wed, Jun 07, 2017 at 12:38:27PM +0100, Tamar Christina wrote:
> Hi All,
> 
> This patch allows the inlining of lrint when -fno-math-errno
> assuming that errno does not need to be set when the rounded value
> is not representable as a long.
> 
> The case
> 
> void f(double *a, long *b, double x)
> {
> 	*a = __builtin_rint(x);
> 	*b = __builtin_lrint(x);
> }
> 
> now generates with -fno-math-errno:
> 
> f:
> 	frintx	d0, d0
> 	fcvtzs	x2, d0
> 	str	d0, [x0]
> 	str	x2, [x1]
> 	ret
> 
> When the flag is not used the same function call is emitted as before:
> 
> f:
> 	stp	x29, x30, [sp, -32]!
> 	frintx	d1, d0
> 	add	x29, sp, 0
> 	str	x19, [sp, 16]
> 	mov	x19, x1
> 	str	d1, [x0]
> 	bl	lrint
> 	str	x0, [x19]
> 	ldr	x19, [sp, 16]
> 	ldp	x29, x30, [sp], 32
> 	ret
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu and no regressions.
> The patch also has no regressions on Spec2006.
> 
> Ok for trunk?

OK.

Thanks,
James

> 
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* config/aarch64/aarch64.md (lrint<GPF:mode><GPI:mode>2): New.
> 
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* gcc.target/aarch64/lrint-matherr.h: New.
> 	* gcc.target/aarch64/inline-lrint_1.c: New.
> 	* gcc.target/aarch64/inline-lrint_2.c: New.
> 	* gcc.target/aarch64/no-inline-lrint_1.c: New.
> 	* gcc.target/aarch64/no-inline-lrint_2.c: New.
Christophe Lyon June 12, 2017, 12:10 p.m. UTC | #2
Hi Tamar,

On 8 June 2017 at 18:50, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> On Wed, Jun 07, 2017 at 12:38:27PM +0100, Tamar Christina wrote:
>> Hi All,
>>
>> This patch allows the inlining of lrint when -fno-math-errno
>> assuming that errno does not need to be set when the rounded value
>> is not representable as a long.
>>
>> The case
>>
>> void f(double *a, long *b, double x)
>> {
>>       *a = __builtin_rint(x);
>>       *b = __builtin_lrint(x);
>> }
>>
>> now generates with -fno-math-errno:
>>
>> f:
>>       frintx  d0, d0
>>       fcvtzs  x2, d0
>>       str     d0, [x0]
>>       str     x2, [x1]
>>       ret
>>
>> When the flag is not used the same function call is emitted as before:
>>
>> f:
>>       stp     x29, x30, [sp, -32]!
>>       frintx  d1, d0
>>       add     x29, sp, 0
>>       str     x19, [sp, 16]
>>       mov     x19, x1
>>       str     d1, [x0]
>>       bl      lrint
>>       str     x0, [x19]
>>       ldr     x19, [sp, 16]
>>       ldp     x29, x30, [sp], 32
>>       ret
>>
>> Bootstrapped and regtested on aarch64-none-linux-gnu and no regressions.
>> The patch also has no regressions on Spec2006.
>>
>> Ok for trunk?
>
> OK.
>
> Thanks,
> James
>
>>
>> gcc/
>> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>>
>>       * config/aarch64/aarch64.md (lrint<GPF:mode><GPI:mode>2): New.
>>
>> gcc/testsuite/
>> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>>
>>       * gcc.target/aarch64/lrint-matherr.h: New.
>>       * gcc.target/aarch64/inline-lrint_1.c: New.
>>       * gcc.target/aarch64/inline-lrint_2.c: New.
>>       * gcc.target/aarch64/no-inline-lrint_1.c: New.
>>       * gcc.target/aarch64/no-inline-lrint_2.c: New.
>
>

The tests
    gcc.target/aarch64/inline-lrint_1.c scan-assembler-times
fcvtzs\tx[0-9]+, d[0-9]+ 3
    gcc.target/aarch64/inline-lrint_1.c scan-assembler-times
fcvtzs\tx[0-9]+, s[0-9]+ 3
    gcc.target/aarch64/inline-lrint_1.c scan-assembler-times
frintx\td[0-9]+, d[0-9]+ 3
    gcc.target/aarch64/inline-lrint_1.c scan-assembler-times
frintx\ts[0-9]+, s[0-9]+ 3
    gcc.target/aarch64/no-inline-lrint_1.c scan-assembler-times
frintx\td[0-9]+, d[0-9]+ 3
    gcc.target/aarch64/no-inline-lrint_1.c scan-assembler-times
frintx\ts[0-9]+, s[0-9]+ 3
fail on aarch64 bare-metal targets (and pass on aarch64-linux-gnu)
Tamar Christina June 12, 2017, 2:29 p.m. UTC | #3
Hi Christophe,

Thanks, I've committed a fix to the testcase.

Tamar
Szabolcs Nagy Aug. 10, 2017, 2:58 p.m. UTC | #4
On 07/06/17 12:38, Tamar Christina wrote:
> Hi All,
> 
> This patch allows the inlining of lrint when -fno-math-errno
> assuming that errno does not need to be set when the rounded value
> is not representable as a long.
> 

turns out emitting frintx+fcvtzs is wrong for ilp32
because spurious inexact may be raised when the final
result is out-of-bound for 32bit long.

i opened
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81800

> The case
> 
> void f(double *a, long *b, double x)
> {
> 	*a = __builtin_rint(x);
> 	*b = __builtin_lrint(x);
> }
> 
> now generates with -fno-math-errno:
> 
> f:
> 	frintx	d0, d0
> 	fcvtzs	x2, d0
> 	str	d0, [x0]
> 	str	x2, [x1]
> 	ret
> 
> When the flag is not used the same function call is emitted as before:
> 
> f:
> 	stp	x29, x30, [sp, -32]!
> 	frintx	d1, d0
> 	add	x29, sp, 0
> 	str	x19, [sp, 16]
> 	mov	x19, x1
> 	str	d1, [x0]
> 	bl	lrint
> 	str	x0, [x19]
> 	ldr	x19, [sp, 16]
> 	ldp	x29, x30, [sp], 32
> 	ret
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu and no regressions.
> The patch also has no regressions on Spec2006.
> 
> Ok for trunk?
> 
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* config/aarch64/aarch64.md (lrint<GPF:mode><GPI:mode>2): New.
> 
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* gcc.target/aarch64/lrint-matherr.h: New.
> 	* gcc.target/aarch64/inline-lrint_1.c: New.
> 	* gcc.target/aarch64/inline-lrint_2.c: New.
> 	* gcc.target/aarch64/no-inline-lrint_1.c: New.
> 	* gcc.target/aarch64/no-inline-lrint_2.c: New.
>
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5adc5edb8dde9c30450b04932a37c41f84cc5ed1..c65159085e342f7611104b2890de99fc02e6fb8e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4997,6 +4997,18 @@ 
   [(set_attr "type" "f_minmax<stype>")]
 )
 
+(define_expand "lrint<GPF:mode><GPI:mode>2"
+  [(match_operand:GPI 0 "register_operand")
+   (match_operand:GPF 1 "register_operand")]
+  "TARGET_FLOAT"
+{
+  rtx cvt = gen_reg_rtx (<GPF:MODE>mode);
+  emit_insn (gen_rint<GPF:mode>2 (cvt, operands[1]));
+  emit_insn (gen_lbtrunc<GPF:mode><GPI:mode>2 (operands[0], cvt));
+  DONE;
+}
+)
+
 ;; For copysign (x, y), we want to generate:
 ;;
 ;;   LDR d2, #(1 << 63)
diff --git a/gcc/testsuite/gcc.target/aarch64/inline-lrint_1.c b/gcc/testsuite/gcc.target/aarch64/inline-lrint_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..876cecd674b7cb35bc18d5cd3aa5587813e53dd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/inline-lrint_1.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O3 -fno-math-errno" } */
+
+#include "lrint-matherr.h"
+
+TEST (dld, double, long, )
+TEST (flf, float , long, )
+
+TEST (did, double, int, )
+TEST (fif, float , int, )
+
+TEST (dlld, double, long long, l)
+TEST (fllf, float , long long, l)
+
+/* { dg-final { scan-assembler-times "frintx\td\[0-9\]+, d\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "frintx\ts\[0-9\]+, s\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\]+, d\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\]+, s\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-not "bl"    } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c b/gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..baa5aee761788e2b83f8f9283bb0aa7d79aad348
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-O3 -fno-math-errno" } */
+
+#include "lrint-matherr.h"
+
+TEST (dld, double, long, )
+TEST (flf, float , long, )
+
+TEST (did, double, int, )
+TEST (fif, float , int, )
+
+TEST (dlld, double, long long, l)
+TEST (fllf, float , long long, l)
+
+/* { dg-final { scan-assembler-times "frintx\td\[0-9\]+, d\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "frintx\ts\[0-9\]+, s\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\]+, d\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\]+, s\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\]+, d\[0-9\]+" 2 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\]+, s\[0-9\]+" 2 } } */
+/* { dg-final { scan-assembler-not "bl"    } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/lrint-matherr.h b/gcc/testsuite/gcc.target/aarch64/lrint-matherr.h
new file mode 100644
index 0000000000000000000000000000000000000000..cc6e3d13f9bd47a316cc56a07917f4b5de185236
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lrint-matherr.h
@@ -0,0 +1,5 @@ 
+#define TEST(name, float_type, int_type, pref) void f_##name (float_type x) \
+{									    \
+  volatile float_type a = __builtin_rint (x);				    \
+  volatile int_type   b = __builtin_l##pref##rint (x);			    \
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/no-inline-lrint_1.c b/gcc/testsuite/gcc.target/aarch64/no-inline-lrint_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..fb7f0655687568e9d6783acf88ef56b54a73c2c5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/no-inline-lrint_1.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O3" } */
+
+#include "lrint-matherr.h"
+
+TEST (dld, double, long, )
+TEST (flf, float , long, )
+
+TEST (did, double, int, )
+TEST (fif, float , int, )
+
+TEST (dlld, double, long long, l)
+TEST (fllf, float , long long, l)
+
+/* { dg-final { scan-assembler-times "frintx\td\[0-9\]+, d\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "frintx\ts\[0-9\]+, s\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "bl\tlrint"  4 } } */
+/* { dg-final { scan-assembler-times "bl\tllrint" 2 } } */
+/* { dg-final { scan-assembler-not "fcvtzs" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/no-inline-lrint_2.c b/gcc/testsuite/gcc.target/aarch64/no-inline-lrint_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..c99843c3a25fbd519f2959d3eb0ce3da3f7f16d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/no-inline-lrint_2.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-O3" } */
+
+#include "lrint-matherr.h"
+
+TEST (dld, double, long, )
+TEST (flf, float , long, )
+
+TEST (did, double, int, )
+TEST (fif, float , int, )
+
+TEST (dlld, double, long long, l)
+TEST (fllf, float , long long, l)
+
+/* { dg-final { scan-assembler-times "frintx\td\[0-9\]+, d\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "frintx\ts\[0-9\]+, s\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "bl\tlrint"  4 } } */
+/* { dg-final { scan-assembler-times "bl\tllrint" 2 } } */
+/* { dg-final { scan-assembler-not "fcvtzs" } } */