[RFC] PR70117, ppc long double isinf

Message ID 20160405083340.GD18129@bubble.grove.modra.org
State New

Commit Message

Alan Modra April 5, 2016, 8:33 a.m. UTC
This patch fixes the incompatibility between GNUlib's 107 bit
precision LDBL_MAX for IBM extended precision and gcc's 106 bit
LDBL_MAX used to test for Inf, by just testing the high double for inf
and nan.  This agrees with the ABI which has stated for many years
that IBM extended precision "does not fully support the IEEE special
numbers NaN and INF.  These values are encoded in the high-order
double value only.  The low-order value is not significant".
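
As a rough illustration of the ABI rule quoted above, here is a minimal
sketch that models an IBM extended precision value as a pair of doubles.
The struct and the ibm_* names are hypothetical stand-ins, not GCC or
libc code; the point is only that the classification never reads the
low double.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for an IBM extended precision value: the sum
   hi + lo of two doubles.  Not GCC or libc code.  */
struct ibm_ldbl { double hi, lo; };

/* Inf and NaN are encoded in the high double only, so the low double
   is never inspected.  */
bool ibm_isinf (struct ibm_ldbl x) { return isinf (x.hi) != 0; }
bool ibm_isnan (struct ibm_ldbl x) { return isnan (x.hi) != 0; }
bool ibm_isfinite (struct ibm_ldbl x) { return isfinite (x.hi) != 0; }

/* Illustrative checks.  The first value has hi == DBL_MAX and a nonzero
   low double; it is an arbitrary example, not gnulib's or gcc's
   LDBL_MAX.  */
void
ibm_selfcheck (void)
{
  struct ibm_ldbl big = { DBL_MAX, 0x1p969 };
  struct ibm_ldbl inf = { INFINITY, 0x1p-1074 };
  struct ibm_ldbl nan_val = { NAN, 0.0 };

  assert (ibm_isfinite (big) && !ibm_isinf (big));
  assert (ibm_isinf (inf) && !ibm_isnan (inf));
  assert (ibm_isnan (nan_val) && !ibm_isfinite (nan_val));
}
```

With this model, whatever sits in the low double cannot turn a finite
high double into an Inf, which is the property the comparison against a
long double LDBL_MAX fails to guarantee.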

I've also changed the test for nan, and both the inf test and the
subnormal test in isnormal, to just use the high double.  Changing the
subnormal test *does* allow a small range of values to be seen as
normal that previously would be rejected in a test of the whole long
double against 2**-969, which is why I'm making this an RFC rather
than a patch submission.

What is "subnormal" for an IBM extended precision number, anyway?  I
think the only definition that makes sense is in terms of precision.
We can't say a long double is subnormal if the low double is
subnormal, because numbers like (1.0 + 0x1p-1074) are representable
with the high double properly rounded and are clearly not close to
zero or losing precision.  So "subnormal" for IBM extended precision
is a number that has less than 106 bits of precision.  That would be
at a magnitude of less than 2**-969.  You can see that
  (0x1p-969 + 0x1p-1074)  = 0x1.000000000000000000000000008p-969
still has 106 bits of precision.  (0x1p-1074 is the smallest double
distinct from zero, and of course is subnormal.)  However,
  (0x1p-969 + -0x1p-1074) = 0x1.ffffffffffffffffffffffffffp-970
has only 105 bits of precision, if I'm counting correctly.

So testing just the high double in isnormal() returns true for a range
of 105 bit precision values, from (0x1p-969 - 0x1p-1023) to 
(0x1p-969 - 0x1p-1074).  The question is whether I should make the
isnormal() code quite nasty in order to give the right answer.
Probably yes, in which case this post becomes an explanation for why
the lower bound test in isnormal() needs to be a long double test.
Or probably better in terms of emitted code, can I get at both of the
component doubles of an IBM long double at the tree level?
VIEW_CONVERT_EXPR to a complex double perhaps?
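
To make the difference concrete, here is a sketch of the two lower
bound tests on a pair-of-doubles model.  The struct and function names
are hypothetical, and this is ordinary C rather than tree-level code, so
it only illustrates which values the two tests disagree on.  It assumes
hi is the properly rounded value of hi + lo.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for an IBM extended precision value.  */
struct ibm_ldbl { double hi, lo; };

/* Smallest magnitude with the full 106 bits of precision.  */
#define IBM_LDBL_MIN_NORMAL 0x1p-969

/* Range test on the high double only.  */
bool
isnormal_high_only (struct ibm_ldbl x)
{
  double a = fabs (x.hi);
  return isgreaterequal (a, IBM_LDBL_MIN_NORMAL) && islessequal (a, DBL_MAX);
}

/* Range test on the whole value |hi + lo| >= 2**-969.  Because hi is
   the rounded sum, lo only matters when |hi| is exactly 2**-969.  */
bool
isnormal_whole_value (struct ibm_ldbl x)
{
  double a = fabs (x.hi);
  if (!islessequal (a, DBL_MAX))
    return false;                       /* Inf or NaN */
  if (a != IBM_LDBL_MIN_NORMAL)
    return a > IBM_LDBL_MIN_NORMAL;
  /* On the boundary: normal unless lo pulls the sum toward zero.  */
  return x.lo == 0.0 || !signbit (x.lo) == !signbit (x.hi);
}

/* The two tests differ only for values just below 2**-969.  */
void
isnormal_boundary_demo (void)
{
  struct ibm_ldbl below = { 0x1p-969, -0x1p-1074 };
  struct ibm_ldbl above = { 0x1p-969, 0x1p-1074 };

  assert (isnormal_high_only (below) && !isnormal_whole_value (below));
  assert (isnormal_high_only (above) && isnormal_whole_value (above));
}
```

The extra branch on the boundary is the part that needs access to the
low double, which is what the question about getting at both component
doubles is about.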

	PR target/70117
	* builtins.c (fold_builtin_classify): For IBM extended precision,
	look at just the high-order double to test for NaN.
	(fold_builtin_interclass_mathfn): Similarly for Inf, and range
	test for IBM extended precision isnormal.

Comments

Richard Biener April 5, 2016, 9:29 a.m. UTC | #1
On Tue, Apr 5, 2016 at 10:33 AM, Alan Modra <amodra@gmail.com> wrote:
> This patch fixes the incompatibility between GNUlib's 107 bit
> precision LDBL_MAX for IBM extended precision and gcc's 106 bit
> LDBL_MAX used to test for Inf, by just testing the high double for inf
> and nan.  This agrees with the ABI which has stated for many years
> that IBM extended precision "does not fully support the IEEE special
> numbers NaN and INF.  These values are encoded in the high-order
> double value only.  The low-order value is not significant".
>
> I've also changed the test for nan, and both the inf test and the
> subnormal test in isnormal, to just use the high double.  Changing the
> subnormal test *does* allow a small range of values to be seen as
> normal that previously would be rejected in a test of the whole long
> double against 2**-969.  Which is why I'm making this an RFC rather
> than a patch submission.
>
> What is "subnormal" for an IBM extended precision number, anyway?  I
> think the only definition that makes sense is in terms of precision.
> We can't say a long double is subnormal if the low double is
> subnormal, because numbers like (1.0 + 0x1p-1074) are representable
> with the high double properly rounded and are clearly not close to
> zero or losing precision.  So "subnormal" for IBM extended precision
> is a number that has less than 106 bits of precision.  That would be
> at a magnitude of less than 2**-969.  You can see that
>   (0x1p-969 + 0x1p-1074)  = 0x1.000000000000000000000000008p-969
> still has 106 bits of precision.  (0x1p-1074 is the smallest double
> distinct from zero, and of course is subnormal.)  However,
>   (0x1p-969 + -0x1p-1074) = 0x1.ffffffffffffffffffffffffffp-970
> has only 105 bits of precision, if I'm counting correctly.
>
> So testing just the high double in isnormal() returns true for a range
> of 105 bit precision values, from (0x1p-969 - 0x1p-1023) to
> (0x1p-969 - 0x1p-1074).  The question is whether I should make the
> isnormal() code quite nasty in order to give the right answer.
> Probably yes, in which case this post becomes an explanation for why
> the lower bound test in isnormal() needs to be a long double test.
> Or probably better in terms of emitted code, can I get at both of the
> component doubles of an IBM long double at the tree level?
> VIEW_CONVERT_EXPR to a complex double perhaps?

Yes, that would work I think, the other variant would be a
BIT_FIELD_REF (but watch out for endianness?).

In general the patch looks like a good approach to me but can we
hide that

> +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
> +  bool is_ibm_extended = fmt->pnan < fmt->p;

in a function somewhere in real.[ch]?

Thanks,
Richard.

>         PR target/70117
>         * builtins.c (fold_builtin_classify): For IBM extended precision,
>         look at just the high-order double to test for NaN.
>         (fold_builtin_interclass_mathfn): Similarly for Inf, and range
>         test for IBM extended precision isnormal.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 9368ed0..ed27d57 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -7529,6 +7529,9 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>
>    mode = TYPE_MODE (TREE_TYPE (arg));
>
> +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
> +  bool is_ibm_extended = fmt->pnan < fmt->p;
> +
>    /* If there is no optab, try generic code.  */
>    switch (DECL_FUNCTION_CODE (fndecl))
>      {
> @@ -7538,10 +7541,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>        {
>         /* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
>         tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
> -       tree const type = TREE_TYPE (arg);
> +       tree type = TREE_TYPE (arg);
>         REAL_VALUE_TYPE r;
>         char buf[128];
>
> +       if (is_ibm_extended)
> +         {
> +           /* NaN and INF are encoded in the high-order double value
> +              only.  The low-order value is not significant.  */
> +           type = double_type_node;
> +           mode = DFmode;
> +           arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> +         }
>         get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
>         real_from_string (&r, buf);
>         result = build_call_expr (isgr_fn, 2,
> @@ -7554,10 +7565,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>        {
>         /* isfinite(x) -> islessequal(fabs(x),DBL_MAX).  */
>         tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
> -       tree const type = TREE_TYPE (arg);
> +       tree type = TREE_TYPE (arg);
>         REAL_VALUE_TYPE r;
>         char buf[128];
>
> +       if (is_ibm_extended)
> +         {
> +           /* NaN and INF are encoded in the high-order double value
> +              only.  The low-order value is not significant.  */
> +           type = double_type_node;
> +           mode = DFmode;
> +           arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> +         }
>         get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
>         real_from_string (&r, buf);
>         result = build_call_expr (isle_fn, 2,
> @@ -7578,15 +7597,28 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>            islessequal(fabs(x),DBL_MAX).  */
>         tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
>         tree const isge_fn = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
> -       tree const type = TREE_TYPE (arg);
> +       tree type = TREE_TYPE (arg);
> +       machine_mode orig_mode = mode;
>         REAL_VALUE_TYPE rmax, rmin;
>         char buf[128];
>
> +       if (is_ibm_extended)
> +         {
> +           /* Use double to test the normal range of IBM extended
> +              precision.  Emin for IBM extended precision is
> +              different to emin for IEEE double, being 53 higher
> +              since the low double exponent is at least 53 lower
> +              than the high double exponent.  */
> +           type = double_type_node;
> +           mode = DFmode;
> +           arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> +         }
> +       arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
> +
>         get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
>         real_from_string (&rmax, buf);
> -       sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
> +       sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1);
>         real_from_string (&rmin, buf);
> -       arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
>         result = build_call_expr (isle_fn, 2, arg,
>                                   build_real (type, rmax));
>         result = fold_build2 (BIT_AND_EXPR, integer_type_node, result,
> @@ -7664,6 +7696,17 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
>        if (!HONOR_NANS (arg))
>         return omit_one_operand_loc (loc, type, integer_zero_node, arg);
>
> +      {
> +       const struct real_format *fmt
> +         = FLOAT_MODE_FORMAT (TYPE_MODE (TREE_TYPE (arg)));
> +       bool is_ibm_extended = fmt->pnan < fmt->p;
> +       if (is_ibm_extended)
> +         {
> +           /* NaN and INF are encoded in the high-order double value
> +              only.  The low-order value is not significant.  */
> +           arg = fold_build1_loc (loc, NOP_EXPR, double_type_node, arg);
> +         }
> +      }
>        arg = builtin_save_expr (arg);
>        return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);
>
>
> --
> Alan Modra
> Australia Development Lab, IBM

Patch

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9368ed0..ed27d57 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7529,6 +7529,9 @@  fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
 
   mode = TYPE_MODE (TREE_TYPE (arg));
 
+  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
+  bool is_ibm_extended = fmt->pnan < fmt->p;
+
   /* If there is no optab, try generic code.  */
   switch (DECL_FUNCTION_CODE (fndecl))
     {
@@ -7538,10 +7541,18 @@  fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
       {
 	/* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
 	tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
 	REAL_VALUE_TYPE r;
 	char buf[128];
 
+	if (is_ibm_extended)
+	  {
+	    /* NaN and INF are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&r, buf);
 	result = build_call_expr (isgr_fn, 2,
@@ -7554,10 +7565,18 @@  fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
       {
 	/* isfinite(x) -> islessequal(fabs(x),DBL_MAX).  */
 	tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
 	REAL_VALUE_TYPE r;
 	char buf[128];
 
+	if (is_ibm_extended)
+	  {
+	    /* NaN and INF are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&r, buf);
 	result = build_call_expr (isle_fn, 2,
@@ -7578,15 +7597,28 @@  fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
 	   islessequal(fabs(x),DBL_MAX).  */
 	tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
 	tree const isge_fn = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
-	tree const type = TREE_TYPE (arg);
+	tree type = TREE_TYPE (arg);
+	machine_mode orig_mode = mode;
 	REAL_VALUE_TYPE rmax, rmin;
 	char buf[128];
 
+	if (is_ibm_extended)
+	  {
+	    /* Use double to test the normal range of IBM extended
+	       precision.  Emin for IBM extended precision is
+	       different to emin for IEEE double, being 53 higher
+	       since the low double exponent is at least 53 lower
+	       than the high double exponent.  */
+	    type = double_type_node;
+	    mode = DFmode;
+	    arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+	  }
+	arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
+
 	get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
 	real_from_string (&rmax, buf);
-	sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
+	sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1);
 	real_from_string (&rmin, buf);
-	arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
 	result = build_call_expr (isle_fn, 2, arg,
 				  build_real (type, rmax));
 	result = fold_build2 (BIT_AND_EXPR, integer_type_node, result,
@@ -7664,6 +7696,17 @@  fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index)
       if (!HONOR_NANS (arg))
 	return omit_one_operand_loc (loc, type, integer_zero_node, arg);
 
+      {
+	const struct real_format *fmt
+	  = FLOAT_MODE_FORMAT (TYPE_MODE (TREE_TYPE (arg)));
+	bool is_ibm_extended = fmt->pnan < fmt->p;
+	if (is_ibm_extended)
+	  {
+	    /* NaN and INF are encoded in the high-order double value
+	       only.  The low-order value is not significant.  */
+	    arg = fold_build1_loc (loc, NOP_EXPR, double_type_node, arg);
+	  }
+      }
       arg = builtin_save_expr (arg);
       return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);