Message ID: CAEwic4YkA+GT-KDjb-65=8whqVXevNPOu+KHo4a-4ze-jasxQA@mail.gmail.com
State: New
On 07/07/2011 06:07 PM, Kai Tietz wrote:
> +  /* We redo folding here one time for allowing to inspect more
> +     complex reductions.  */
> +  substitute_and_fold (op_with_constant_singleton_value_range,
> +                       vrp_fold_stmt, false);
> +  /* We need to mark this second pass to avoid re-entering of same
> +     edges for switch statments.  */
> +  in_second_pass = true;
>    substitute_and_fold (op_with_constant_singleton_value_range,
>                        vrp_fold_stmt, false);
> +  in_second_pass = false;

This needs a much better explanation.

Paolo
2011/7/7 Paolo Bonzini <bonzini@gnu.org>:
> On 07/07/2011 06:07 PM, Kai Tietz wrote:
>>
>> +  /* We redo folding here one time for allowing to inspect more
>> +     complex reductions.  */
>> +  substitute_and_fold (op_with_constant_singleton_value_range,
>> +                       vrp_fold_stmt, false);
>> +  /* We need to mark this second pass to avoid re-entering of same
>> +     edges for switch statments.  */
>> +  in_second_pass = true;
>>    substitute_and_fold (op_with_constant_singleton_value_range,
>>                        vrp_fold_stmt, false);
>> +  in_second_pass = false;
>
> This needs a much better explanation.
>
> Paolo

Well, I can work on a better comment.  The complex reductions I mean
here are cases like

  int x;
  int y;
  _Bool D1;
  _Bool D2;
  _Bool D3;
  int R;

  D1 = x[0..1] != 0;
  D2 = y[0..1] != 0;
  D3 = D1 & D2
  R = (int) D3

(a testcase is already present; see tree-ssa/vrp47.c).

In the first pass VRP produces (and replaces this with):

  D1 = (_Bool) x[0..1];
  D2 = (_Bool) y[0..1];
  D3 = D1 & D2
  R = (int) D3

Only in the second pass can the reduction

  R = x[0..1] & y[0..1]

happen.  In general it is unfortunate that VRP can't insert new
statements during the pass right now; insertion would cause issues in
the range tables, which aren't designed for it.  Otherwise we could
also simplify things like

  D1 = x[0..1] != 0;
  D2 = y[0..1] == 0;
  D3 = D1 & D2
  R = (int) D3

to

  R = x[0..1] & (y[0..1] ^ 1)

Regards,
Kai
On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> Hello,
>
> This patch - third of series - fixes vrp to handle bitwise one-bit
> precision typed operations.
> And it introduces a second - limitted to non-switch-statement range - vrp pass.

Err - please split this patch.  I agree with Paolo, this 2nd
substitute_and_fold call is bogus.  More comments inline.

>
> Bootstrapped and regression tested for all standard-languages (plus
> Ada and Obj-C++) on host x86_64-pc-linux-gnu.
>
> Ok for apply?
>
> Regards,
> Kai
>
> ChangeLog
>
> 2011-07-07  Kai Tietz  <ktietz@redhat.com>
>
> 	* tree-vrp.c (in_second_pass): New static variable.
> 	(extract_range_from_binary_expr): Add handling for
> 	BIT_IOR_EXPR, BIT_AND_EXPR, and BIT_NOT_EXPR.
> 	(register_edge_assert_for_1): Add handling for 1-bit
> 	BIT_IOR_EXPR and BIT_NOT_EXPR.
> 	(register_edge_assert_for): Add handling for 1-bit
> 	BIT_IOR_EXPR.
> 	(ssa_name_get_inner_ssa_name_p): New helper function.
> 	(ssa_name_get_cast_to_p): New helper function.
> 	(simplify_truth_ops_using_ranges): Handle prefixed
> 	cast instruction for result, and add support for one
> 	bit precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR,
> 	and BIT_NOT_EXPR.
> 	(simplify_stmt_using_ranges): Add handling for one bit
> 	precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR,
> 	and BIT_NOT_EXPR.
> 	(vrp_finalize): Do substitute and fold pass a second
> 	time for vrp_stmt and preserve switch-edge simplification
> 	on second run.
> 	(simplify_switch_using_ranges): Preserve rerun of function
> 	in second pass.
>
> Index: gcc-head/gcc/tree-vrp.c
> ===================================================================
> --- gcc-head.orig/gcc/tree-vrp.c
> +++ gcc-head/gcc/tree-vrp.c
> @@ -74,6 +74,9 @@ struct value_range_d
>
>  typedef struct value_range_d value_range_t;
>
> +/* This flag indicates that we are doing a second pass of VRP.
*/ > +static bool in_second_pass = false; > + > /* Set of SSA names found live during the RPO traversal of the function > for still active basic-blocks. */ > static sbitmap *live; > @@ -2232,6 +2235,7 @@ extract_range_from_binary_expr (value_ra > some cases. */ > if (code != BIT_AND_EXPR > && code != TRUTH_AND_EXPR > + && code != BIT_IOR_EXPR Huh? So how would VARYING | x ever produce something better than VARYING? > && code != TRUTH_OR_EXPR > && code != TRUNC_DIV_EXPR > && code != FLOOR_DIV_EXPR > @@ -2291,6 +2295,8 @@ extract_range_from_binary_expr (value_ra > else > set_value_range_to_varying (vr); > } > + else if (code == BIT_IOR_EXPR) > + set_value_range_to_varying (vr); err - BIT_IOR_EXPR on pointers? > else > gcc_unreachable (); > > @@ -2300,11 +2306,13 @@ extract_range_from_binary_expr (value_ra > /* For integer ranges, apply the operation to each end of the > range and see what we end up with. */ > if (code == TRUTH_AND_EXPR > - || code == TRUTH_OR_EXPR) > + || code == TRUTH_OR_EXPR > + || ((code == BIT_AND_EXPR || code == BIT_IOR_EXPR) > + && TYPE_PRECISION (TREE_TYPE (op1)) == 1)) Rather than adding code to handle BIT_*_EXPR this patch should transform the TRUTH_*_EXPR handling to appropriate BIT_*_EXPR handling as we no longer have TRUTH_*_EXPR in our IL. In fact I would say the existing BIT_*_EXPR handling should already cover all the TRUTH_*_CASES, so this patch patches the wrong spot if it is necessary at all. > { > /* If one of the operands is zero, we know that the whole > expression evaluates zero. */ > - if (code == TRUTH_AND_EXPR > + if ((code == TRUTH_AND_EXPR || code == BIT_AND_EXPR) > && ((vr0.type == VR_RANGE > && integer_zerop (vr0.min) > && integer_zerop (vr0.max)) > @@ -2317,7 +2325,7 @@ extract_range_from_binary_expr (value_ra > } > /* If one of the operands is one, we know that the whole > expression evaluates one. 
*/ > - else if (code == TRUTH_OR_EXPR > + else if ((code == TRUTH_OR_EXPR || code == BIT_IOR_EXPR) > && ((vr0.type == VR_RANGE > && integer_onep (vr0.min) > && integer_onep (vr0.max)) > @@ -2809,7 +2817,7 @@ extract_range_from_unary_expr (value_ran > cannot easily determine a resulting range. */ > if (code == FIX_TRUNC_EXPR > || code == FLOAT_EXPR > - || code == BIT_NOT_EXPR > + || (code == BIT_NOT_EXPR && TYPE_PRECISION (type) != 1) > || code == CONJ_EXPR) > { > /* We can still do constant propagation here. */ > @@ -3976,7 +3984,9 @@ build_assert_expr_for (tree cond, tree v > tree a = build2 (ASSERT_EXPR, TREE_TYPE (v), v, cond); > assertion = gimple_build_assign (n, a); > } > - else if (TREE_CODE (cond) == TRUTH_NOT_EXPR) > + else if (TREE_CODE (cond) == TRUTH_NOT_EXPR > + || (TREE_CODE (cond) == BIT_NOT_EXPR > + && TYPE_PRECISION (TREE_TYPE (cond)) == 1)) > { > /* Given !V, build the assignment N = false. */ > tree op0 = TREE_OPERAND (cond, 0); > @@ -4531,7 +4541,9 @@ register_edge_assert_for_1 (tree op, enu > retval |= register_edge_assert_for_1 (gimple_assign_rhs2 (op_def), > code, e, bsi); > } > - else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR) > + else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR > + || (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR > + && TYPE_PRECISION (TREE_TYPE (op)) == 1)) > { > /* Recurse, flipping CODE. */ > code = invert_tree_comparison (code, false); > @@ -4617,6 +4629,9 @@ register_edge_assert_for (tree name, edg > > if (is_gimple_assign (def_stmt) > && (gimple_assign_rhs_code (def_stmt) == TRUTH_OR_EXPR > + || (gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR > + && INTEGRAL_TYPE_P (TREE_TYPE (name)) > + && TYPE_PRECISION (TREE_TYPE (name)) == 1) > /* For BIT_IOR_EXPR only if NAME == 0 both operands have > necessarily zero value. 
*/ > || (comp_code == EQ_EXPR > @@ -6747,19 +6762,96 @@ varying: > return SSA_PROP_VARYING; > } > > +/* Returns operand1 of ssa-name with SSA_NAME as code, Otherwise it > + returns NULL_TREE. */ ? Why would you want to look through a single copy? > +static tree > +ssa_name_get_inner_ssa_name_p (tree op) > +{ > + gimple stmt; > + > + if (TREE_CODE (op) != SSA_NAME > + || !is_gimple_assign (SSA_NAME_DEF_STMT (op))) > + return NULL_TREE; > + stmt = SSA_NAME_DEF_STMT (op); > + if (gimple_assign_rhs_code (stmt) != SSA_NAME) > + return NULL_TREE; > + return gimple_assign_rhs1 (stmt); > +} > + > +/* Returns operand of cast operation, if OP is a type-conversion. Otherwise > + return NULL_TREE. */ > +static tree > +ssa_name_get_cast_to_p (tree op) > +{ > + gimple stmt; > + > + if (TREE_CODE (op) != SSA_NAME > + || !is_gimple_assign (SSA_NAME_DEF_STMT (op))) > + return NULL_TREE; > + stmt = SSA_NAME_DEF_STMT (op); > + if (!CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))) > + return NULL_TREE; > + return gimple_assign_rhs1 (stmt); > +} > + > /* Simplify boolean operations if the source is known > to be already a boolean. */ > static bool > simplify_truth_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt) > { > enum tree_code rhs_code = gimple_assign_rhs_code (stmt); > + gimple stmt2 = stmt; > tree val = NULL; > - tree op0, op1; > + tree op0, op1, cop0, cop1; > value_range_t *vr; > bool sop = false; > bool need_conversion; > + location_t loc = gimple_location (stmt); > > op0 = gimple_assign_rhs1 (stmt); > + op1 = NULL_TREE; > + > + /* Handle cases with prefixed type-cast. */ What cases? This code lacks comments. > + if (CONVERT_EXPR_CODE_P (rhs_code) So this simplifies conversions, not truth ops. 
> + && INTEGRAL_TYPE_P (TREE_TYPE (op0)) > + && TREE_CODE (op0) == SSA_NAME > + && is_gimple_assign (SSA_NAME_DEF_STMT (op0)) > + && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt)))) > + { > + stmt2 = SSA_NAME_DEF_STMT (op0); > + op0 = gimple_assign_rhs1 (stmt2); > + if (!INTEGRAL_TYPE_P (TREE_TYPE (op0))) > + return false; > + rhs_code = gimple_assign_rhs_code (stmt2); > + if (rhs_code != BIT_NOT_EXPR && rhs_code != TRUTH_NOT_EXPR > + && rhs_code != TRUTH_AND_EXPR && rhs_code != BIT_AND_EXPR > + && rhs_code != TRUTH_OR_EXPR && rhs_code != BIT_IOR_EXPR > + && rhs_code != TRUTH_XOR_EXPR && rhs_code != BIT_XOR_EXPR > + && rhs_code != NE_EXPR && rhs_code != EQ_EXPR) > + return false; > + if (rhs_code == BIT_AND_EXPR || rhs_code == BIT_IOR_EXPR > + || rhs_code == BIT_XOR_EXPR || rhs_code == TRUTH_AND_EXPR > + || rhs_code == TRUTH_OR_EXPR || rhs_code == TRUTH_XOR_EXPR > + || rhs_code == NE_EXPR || rhs_code == EQ_EXPR) > + op1 = gimple_assign_rhs2 (stmt2); > + if (gimple_has_location (stmt2)) > + loc = gimple_location (stmt2); > + } > + else if (CONVERT_EXPR_CODE_P (rhs_code)) > + return false; That's funny control flow. > + else if (rhs_code == BIT_AND_EXPR || rhs_code == BIT_IOR_EXPR > + || rhs_code == BIT_XOR_EXPR || rhs_code == TRUTH_AND_EXPR > + || rhs_code == TRUTH_OR_EXPR || rhs_code == TRUTH_XOR_EXPR > + || rhs_code == NE_EXPR || rhs_code == EQ_EXPR) > + op1 = gimple_assign_rhs2 (stmt); > + > + /* ~X is only equivalent of !X, if type-precision is one and X has > + an integral type. 
*/ > + if (rhs_code == BIT_NOT_EXPR > + && (!INTEGRAL_TYPE_P (TREE_TYPE (op0)) > + || TYPE_PRECISION (TREE_TYPE (op0)) != 1)) > + return false; > + > if (TYPE_PRECISION (TREE_TYPE (op0)) != 1) > { > if (TREE_CODE (op0) != SSA_NAME) > @@ -6775,22 +6867,100 @@ simplify_truth_ops_using_ranges (gimple_ > return false; > } > > - if (rhs_code == TRUTH_NOT_EXPR) > + if (op1 && TREE_CODE (op1) != INTEGER_CST > + && TYPE_PRECISION (TREE_TYPE (op1)) != 1) > + { > + vr = get_value_range (op1); > + val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop); > + if (!val || !integer_onep (val)) > + return false; > + > + val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop); > + if (!val || !integer_onep (val)) > + return false; > + } > + > + need_conversion = > + !useless_type_conversion_p (TREE_TYPE (gimple_assign_lhs (stmt)), > + TREE_TYPE (op0)); > + > + /* As comparisons X != 0 getting folded by prior pass to (bool) X, > + but X == 0 might be not folded for none boolean type of X > + to (bool) (X ^ 1). > + So for bitwise-binary operations we have three cases to handle: > + a) ((bool) X) op ((bool) Y) > + b) ((bool) X) op (Y == 0) OR (X == 0) op ((bool) Y) > + c) (X == 0) op (Y == 0) > + The later two cases can't be handled for now, as vr tables > + would need to be adjusted. */ > + if (need_conversion > + && (rhs_code == BIT_XOR_EXPR > + || rhs_code == BIT_AND_EXPR > + || rhs_code == BIT_IOR_EXPR) > + && TREE_CODE (op1) == SSA_NAME && TREE_CODE (op0) == SSA_NAME) > + { > + cop0 = ssa_name_get_cast_to_p (op0); > + cop1 = ssa_name_get_cast_to_p (op1); > + if (!cop0 || !cop1) > + /* We would need an new statment for cases b and c, and we can't > + due vr table, so bail out. 
*/ > + return false; > + > + if (!INTEGRAL_TYPE_P (TREE_TYPE (cop0)) > + || !types_compatible_p (TREE_TYPE (cop0), TREE_TYPE (cop1))) > + return false; > + need_conversion = > + !useless_type_conversion_p (TREE_TYPE (gimple_assign_lhs (stmt)), > + TREE_TYPE (cop0)); > + if (need_conversion) > + return false; > + op0 = cop0; > + op1 = cop1; > + > + /* We need to re-check if value ranges for new operands > + for 1-bit precision/range. */ > + if (TYPE_PRECISION (TREE_TYPE (op0)) != 1) > + { > + if (TREE_CODE (op0) != SSA_NAME) > + return false; > + vr = get_value_range (op0); > + > + val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop); > + if (!val || !integer_onep (val)) > + return false; > + > + val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop); > + if (!val || !integer_onep (val)) > + return false; > + } > + > + if (op1 && TYPE_PRECISION (TREE_TYPE (op1)) != 1) > + { > + vr = get_value_range (op1); > + val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop); > + if (!val || !integer_onep (val)) > + return false; > + > + val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop); > + if (!val || !integer_onep (val)) > + return false; > + } > + } > + else if (rhs_code == TRUTH_NOT_EXPR || rhs_code == BIT_NOT_EXPR) > { > rhs_code = NE_EXPR; > op1 = build_int_cst (TREE_TYPE (op0), 1); > } > else > { > - op1 = gimple_assign_rhs2 (stmt); > - > /* Reduce number of cases to handle. */ > if (is_gimple_min_invariant (op1)) > { > /* Exclude anything that should have been already folded. */ > if (rhs_code != EQ_EXPR > && rhs_code != NE_EXPR > - && rhs_code != TRUTH_XOR_EXPR) > + && rhs_code != TRUTH_XOR_EXPR > + && rhs_code != BIT_XOR_EXPR) > return false; > > if (!integer_zerop (op1) > @@ -6810,18 +6980,6 @@ simplify_truth_ops_using_ranges (gimple_ > /* Punt on A == B as there is no BIT_XNOR_EXPR. 
*/ > if (rhs_code == EQ_EXPR) > return false; > - > - if (TYPE_PRECISION (TREE_TYPE (op1)) != 1) > - { > - vr = get_value_range (op1); > - val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop); > - if (!val || !integer_onep (val)) > - return false; > - > - val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop); > - if (!val || !integer_onep (val)) > - return false; > - } > } > } > > @@ -6838,7 +6996,8 @@ simplify_truth_ops_using_ranges (gimple_ > warning_at (location, OPT_Wstrict_overflow, > _("assuming signed overflow does not occur when " > "simplifying && or || to & or |")); > - else > + else if (rhs_code != BIT_AND_EXPR && rhs_code != BIT_IOR_EXPR > + && rhs_code != BIT_XOR_EXPR) > warning_at (location, OPT_Wstrict_overflow, > _("assuming signed overflow does not occur when " > "simplifying ==, != or ! to identity or ^")); > @@ -6859,16 +7018,21 @@ simplify_truth_ops_using_ranges (gimple_ > case TRUTH_AND_EXPR: > rhs_code = BIT_AND_EXPR; > break; > + case BIT_AND_EXPR: > + break; > case TRUTH_OR_EXPR: > rhs_code = BIT_IOR_EXPR; > + case BIT_IOR_EXPR: > break; > case TRUTH_XOR_EXPR: > + case BIT_XOR_EXPR: > case NE_EXPR: > if (integer_zerop (op1)) > { > gimple_assign_set_rhs_with_ops (gsi, > need_conversion ? NOP_EXPR : SSA_NAME, > op0, NULL); > + gimple_set_location (stmt, loc); > update_stmt (gsi_stmt (*gsi)); > return true; > } > @@ -6879,10 +7043,20 @@ simplify_truth_ops_using_ranges (gimple_ > gcc_unreachable (); > } > > + /* We can't insert here new expression as otherwise > + tracked vr tables getting out of bounds. */ > if (need_conversion) > return false; > > + /* Reduce here SSA_NAME -> SSA_NAME. 
*/ > + while ((cop0 = ssa_name_get_inner_ssa_name_p (op0)) != NULL_TREE) > + op0 = cop0; > + > + while ((cop1 = ssa_name_get_inner_ssa_name_p (op1)) != NULL_TREE) > + op1 = cop1; > + > gimple_assign_set_rhs_with_ops (gsi, rhs_code, op0, op1); > + gimple_set_location (stmt, loc); > update_stmt (gsi_stmt (*gsi)); > return true; > } > @@ -7263,6 +7437,9 @@ simplify_switch_using_ranges (gimple stm > tree vec2; > switch_update su; > > + if (in_second_pass) > + return false; > + > if (TREE_CODE (op) == SSA_NAME) > { > vr = get_value_range (op); > @@ -7390,6 +7567,7 @@ simplify_stmt_using_ranges (gimple_stmt_ > { > case EQ_EXPR: > case NE_EXPR: > + case BIT_NOT_EXPR: > case TRUTH_NOT_EXPR: > case TRUTH_AND_EXPR: > case TRUTH_OR_EXPR: > @@ -7425,13 +7603,21 @@ simplify_stmt_using_ranges (gimple_stmt_ > if all the bits being cleared are already cleared or > all the bits being set are already set. */ > if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))) > - return simplify_bit_ops_using_ranges (gsi, stmt); > + { > + if (simplify_truth_ops_using_ranges (gsi, stmt)) > + return true; > + return simplify_bit_ops_using_ranges (gsi, stmt); > + } > break; > > CASE_CONVERT: > if (TREE_CODE (rhs1) == SSA_NAME > && INTEGRAL_TYPE_P (TREE_TYPE (rhs1))) > - return simplify_conversion_using_ranges (stmt); > + { > + if (simplify_truth_ops_using_ranges (gsi, stmt)) > + return true; > + return simplify_conversion_using_ranges (stmt); > + } > break; > > default: > @@ -7685,8 +7870,16 @@ vrp_finalize (void) > fprintf (dump_file, "\n"); > } > > + /* We redo folding here one time for allowing to inspect more > + complex reductions. */ > + substitute_and_fold (op_with_constant_singleton_value_range, > + vrp_fold_stmt, false); > + /* We need to mark this second pass to avoid re-entering of same > + edges for switch statments. 
 */
> +  in_second_pass = true;
>    substitute_and_fold (op_with_constant_singleton_value_range,
>                        vrp_fold_stmt, false);
> +  in_second_pass = false;

If at all, you only want to re-call vrp_fold_stmt on all stmts in the
function, not do a full-blown substitute_and_fold.

Richard.

>  if (warn_array_bounds)
>    check_all_array_refs ();
>
On Thu, Jul 7, 2011 at 6:28 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/7/7 Paolo Bonzini <bonzini@gnu.org>:
>> On 07/07/2011 06:07 PM, Kai Tietz wrote:
>>>
>>> +  /* We redo folding here one time for allowing to inspect more
>>> +     complex reductions.  */
>>> +  substitute_and_fold (op_with_constant_singleton_value_range,
>>> +                       vrp_fold_stmt, false);
>>> +  /* We need to mark this second pass to avoid re-entering of same
>>> +     edges for switch statments.  */
>>> +  in_second_pass = true;
>>>    substitute_and_fold (op_with_constant_singleton_value_range,
>>>                        vrp_fold_stmt, false);
>>> +  in_second_pass = false;
>>
>> This needs a much better explanation.
>>
>> Paolo
>
> Well, I can work on a better comment.  The complex reduction I mean
> here are cases like
>
> int x;
> int y;
> _Bool D1;
> _Bool D2;
> _Bool D3;
> int R;
>
> D1 = x[0..1] != 0;
> D2 = y[0..1] != 0;
> D3 = D1 & D2
> R = (int) D3
>
> (testcase is already present.  See tree-ssa/vrp47.c).
>
> As VRP in first pass produces (and replaces) to:
>
> D1 = (_Bool) x[0..1];
> D2 = (_Bool) y[0..1];
> D3 = D1 & D2
> R = (int) D3
>
> Just in the second pass the reduction
>
> R = x[0..1] & y[0..1]

So why wouldn't that happen during the first pass?  The first
pass could change the IL to

  D1 = x[0..1] != 0;
  D2 = y[0..1] != 0;
  D3 = D1 & D2;
  R = x & y;

if D3 only has a single use.

> can happen.  In general it is sad that VRP can't insert during pass
> new statements right now.  This would cause issues in range-tables,
> which aren't designed for insertations.  As otherwise, we could do
> also simplify things like
>
> D1 = x[0..1] != 0;
> D2 = y[0..1] == 0;
> D3 = D1 & D2
> R = (int) D3
>
> to
> R = x[0..1] & (y[0..1] ^ 1)

Why that ^ 1?  And why does that confuse the range tables
if you re-use R?

> Regards,
> Kai
>
2011/7/8 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, Jul 7, 2011 at 6:28 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/7/7 Paolo Bonzini <bonzini@gnu.org>:
>>> On 07/07/2011 06:07 PM, Kai Tietz wrote:
>>>>
>>>> +  /* We redo folding here one time for allowing to inspect more
>>>> +     complex reductions.  */
>>>> +  substitute_and_fold (op_with_constant_singleton_value_range,
>>>> +                       vrp_fold_stmt, false);
>>>> +  /* We need to mark this second pass to avoid re-entering of same
>>>> +     edges for switch statments.  */
>>>> +  in_second_pass = true;
>>>>    substitute_and_fold (op_with_constant_singleton_value_range,
>>>>                        vrp_fold_stmt, false);
>>>> +  in_second_pass = false;
>>>
>>> This needs a much better explanation.
>>>
>>> Paolo
>>
>> Well, I can work on a better comment.  The complex reduction I mean
>> here are cases like
>>
>> int x;
>> int y;
>> _Bool D1;
>> _Bool D2;
>> _Bool D3;
>> int R;
>>
>> D1 = x[0..1] != 0;
>> D2 = y[0..1] != 0;
>> D3 = D1 & D2
>> R = (int) D3
>>
>> (testcase is already present.  See tree-ssa/vrp47.c).
>>
>> As VRP in first pass produces (and replaces) to:
>>
>> D1 = (_Bool) x[0..1];
>> D2 = (_Bool) y[0..1];
>> D3 = D1 & D2
>> R = (int) D3
>>
>> Just in the second pass the reduction
>>
>> R = x[0..1] & y[0..1]
>
> So why wouldn't that happen during the first pass?  The first
> pass could change the IL to
>
> D1 = x[0..1] != 0;
> D2 = y[0..1] != 0;
> D3 = D1 & D2;
> R = x & y;
>
> if D3 only has a single use.
>
>> can happen.  In general it is sad that VRP can't insert during pass
>> new statements right now.  This would cause issues in range-tables,
>> which aren't designed for insertations.  As otherwise, we could do
>> also simplify things like
>>
>> D1 = x[0..1] != 0;
>> D2 = y[0..1] == 0;
>> D3 = D1 & D2
>> R = (int) D3
>>
>> to
>> R = x[0..1] & (y[0..1] ^ 1)
>
> Why that ^ 1?  And why does that confuse the range tables
> if you re-use R?

Because (y[0..1] ^ 1) involves a type change.  All the SSA names
present have boolean type, but (y[0..1] ^ 1) is an integer expression;
we only have the cast's definition, which carries the final type.  See
the truth-op simplification code reached from vrp_fold_stmt: for X
with range 0..1 it converts X == 0 into X ^ 1.  But here a type change
is possible, since a comparison result is boolean while X might not be.

Kai

>> Regards,
>> Kai
>>
>
2011/7/8 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, Jul 7, 2011 at 6:28 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/7/7 Paolo Bonzini <bonzini@gnu.org>:
>>> On 07/07/2011 06:07 PM, Kai Tietz wrote:
>>>>
>>>> +  /* We redo folding here one time for allowing to inspect more
>>>> +     complex reductions.  */
>>>> +  substitute_and_fold (op_with_constant_singleton_value_range,
>>>> +                       vrp_fold_stmt, false);
>>>> +  /* We need to mark this second pass to avoid re-entering of same
>>>> +     edges for switch statments.  */
>>>> +  in_second_pass = true;
>>>>    substitute_and_fold (op_with_constant_singleton_value_range,
>>>>                        vrp_fold_stmt, false);
>>>> +  in_second_pass = false;
>>>
>>> This needs a much better explanation.
>>>
>>> Paolo
>>
>> Well, I can work on a better comment.  The complex reduction I mean
>> here are cases like
>>
>> int x;
>> int y;
>> _Bool D1;
>> _Bool D2;
>> _Bool D3;
>> int R;
>>
>> D1 = x[0..1] != 0;
>> D2 = y[0..1] != 0;
>> D3 = D1 & D2
>> R = (int) D3
>>
>> (testcase is already present.  See tree-ssa/vrp47.c).
>>
>> As VRP in first pass produces (and replaces) to:
>>
>> D1 = (_Bool) x[0..1];
>> D2 = (_Bool) y[0..1];
>> D3 = D1 & D2
>> R = (int) D3
>>
>> Just in the second pass the reduction
>>
>> R = x[0..1] & y[0..1]
>
> So why wouldn't that happen during the first pass?  The first
> pass could change the IL to

The issue is that substitute_and_fold walks the statements within each
basic block from last to first, so most simplifications happen too
late to recognize the dependent ones.  Maybe another option here would
be a flag for substitute_and_fold indicating whether the folding pass
shall run first-to-last or last-to-first?

> D1 = x[0..1] != 0;
> D2 = y[0..1] != 0;
> D3 = D1 & D2;
> R = x & y;
>
> if D3 only has a single use.

Well, changing the type of an SSA name when it has a single use might
be another way here.  Having the ability to introduce new temporary
registers would be better and would avoid the single-use dependency,
but well, the range tables don't support that now.
2011/7/8 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, Jul 7, 2011 at 6:28 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/7/7 Paolo Bonzini <bonzini@gnu.org>:
>>> On 07/07/2011 06:07 PM, Kai Tietz wrote:
>>>>
>>>> +  /* We redo folding here one time for allowing to inspect more
>>>> +     complex reductions.  */
>>>> +  substitute_and_fold (op_with_constant_singleton_value_range,
>>>> +                       vrp_fold_stmt, false);
>>>> +  /* We need to mark this second pass to avoid re-entering of same
>>>> +     edges for switch statments.  */
>>>> +  in_second_pass = true;
>>>>    substitute_and_fold (op_with_constant_singleton_value_range,
>>>>                        vrp_fold_stmt, false);
>>>> +  in_second_pass = false;
>>>
>>> This needs a much better explanation.
>>>
>>> Paolo
>>
>> Well, I can work on a better comment.  The complex reduction I mean
>> here are cases like
>>
>> int x;
>> int y;
>> _Bool D1;
>> _Bool D2;
>> _Bool D3;
>> int R;
>>
>> D1 = x[0..1] != 0;
>> D2 = y[0..1] != 0;
>> D3 = D1 & D2
>> R = (int) D3
>>
>> (testcase is already present.  See tree-ssa/vrp47.c).
>>
>> As VRP in first pass produces (and replaces) to:
>>
>> D1 = (_Bool) x[0..1];
>> D2 = (_Bool) y[0..1];
>> D3 = D1 & D2
>> R = (int) D3
>>
>> Just in the second pass the reduction
>>
>> R = x[0..1] & y[0..1]
>
> So why wouldn't that happen during the first pass?  The first
> pass could change the IL to
>
> D1 = x[0..1] != 0;
> D2 = y[0..1] != 0;
> D3 = D1 & D2;
> R = x & y;
>
> if D3 only has a single use.

No: D3 would need a type change, and this isn't possible.  In case it
wasn't absolutely clear, this VRP patch is necessary after patch 2, as
here D1, D2, and D3 have bool type and just R is of type int.

>> can happen.  In general it is sad that VRP can't insert during pass
>> new statements right now.  This would cause issues in range-tables,
>> which aren't designed for insertations.  As otherwise, we could do
>> also simplify things like
>>
>> D1 = x[0..1] != 0;
>> D2 = y[0..1] == 0;
>> D3 = D1 & D2
>> R = (int) D3
>>
>> to
>> R = x[0..1] & (y[0..1] ^ 1)
>
> Why that ^ 1?  And why does that confuse the range tables
> if you re-use R?

Because we would need to insert a new statement, and this isn't
allowed in VRP.  See the comments in VRP and substitute_and_fold: VRP
disallows removing statements or inserting new ones.

>> Regards,
>> Kai
On Fri, Jul 8, 2011 at 4:35 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/7/8 Richard Guenther <richard.guenther@gmail.com>:
>> On Thu, Jul 7, 2011 at 6:28 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>> 2011/7/7 Paolo Bonzini <bonzini@gnu.org>:
>>>> On 07/07/2011 06:07 PM, Kai Tietz wrote:
>>>>>
>>>>> +  /* We redo folding here one time for allowing to inspect more
>>>>> +     complex reductions.  */
>>>>> +  substitute_and_fold (op_with_constant_singleton_value_range,
>>>>> +                       vrp_fold_stmt, false);
>>>>> +  /* We need to mark this second pass to avoid re-entering of same
>>>>> +     edges for switch statments.  */
>>>>> +  in_second_pass = true;
>>>>>    substitute_and_fold (op_with_constant_singleton_value_range,
>>>>>                        vrp_fold_stmt, false);
>>>>> +  in_second_pass = false;
>>>>
>>>> This needs a much better explanation.
>>>>
>>>> Paolo
>>>
>>> Well, I can work on a better comment.  The complex reduction I mean
>>> here are cases like
>>>
>>> int x;
>>> int y;
>>> _Bool D1;
>>> _Bool D2;
>>> _Bool D3;
>>> int R;
>>>
>>> D1 = x[0..1] != 0;
>>> D2 = y[0..1] != 0;
>>> D3 = D1 & D2
>>> R = (int) D3
>>>
>>> (testcase is already present.  See tree-ssa/vrp47.c).
>>>
>>> As VRP in first pass produces (and replaces) to:
>>>
>>> D1 = (_Bool) x[0..1];
>>> D2 = (_Bool) y[0..1];
>>> D3 = D1 & D2
>>> R = (int) D3
>>>
>>> Just in the second pass the reduction
>>>
>>> R = x[0..1] & y[0..1]
>>
>> So why wouldn't that happen during the first pass?  The first
>> pass could change the IL to
>>
>> D1 = x[0..1] != 0;
>> D2 = y[0..1] != 0;
>> D3 = D1 & D2;
>> R = x & y;
>>
>> if D3 only has a single use.
> No, as D3 would need a type change, and this isn't possible.  If it
> wasn't absolutely clear, this patch to VRP is necessary after patch 2,
> as here D1, D2, and D3 have bool-type, and just R is of type int.

In your example x, y and R are int, so it works with re-using R.

>>> can happen.  In general it is sad that VRP can't insert during pass
>>> new statements right now.  This would cause issues in range-tables,
>>> which aren't designed for insertations.  As otherwise, we could do
>>> also simplify things like
>>>
>>> D1 = x[0..1] != 0;
>>> D2 = y[0..1] == 0;
>>> D3 = D1 & D2
>>> R = (int) D3
>>>
>>> to
>>> R = x[0..1] & (y[0..1] ^ 1)
>>
>> Why that ^ 1?  And why does that confuse the range tables
>> if you re-use R?
> Because we would need to insert a new statement and this isn't allowed
> in VRP.  See the comments in VRP and substitute_and_fold.  VRP
> disallows to remove statements or to insert new ones.

That's not a hard limitation.

>>> Regards,
>>> Kai
>
2011/7/8 Richard Guenther <richard.guenther@gmail.com>:
> On Fri, Jul 8, 2011 at 4:35 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/7/8 Richard Guenther <richard.guenther@gmail.com>:
>>> On Thu, Jul 7, 2011 at 6:28 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>> 2011/7/7 Paolo Bonzini <bonzini@gnu.org>:
>>>>> On 07/07/2011 06:07 PM, Kai Tietz wrote:
>>>>>>
>>>>>> +  /* We redo folding here one time for allowing to inspect more
>>>>>> +     complex reductions.  */
>>>>>> +  substitute_and_fold (op_with_constant_singleton_value_range,
>>>>>> +                       vrp_fold_stmt, false);
>>>>>> +  /* We need to mark this second pass to avoid re-entering of same
>>>>>> +     edges for switch statments.  */
>>>>>> +  in_second_pass = true;
>>>>>>    substitute_and_fold (op_with_constant_singleton_value_range,
>>>>>>                        vrp_fold_stmt, false);
>>>>>> +  in_second_pass = false;
>>>>>
>>>>> This needs a much better explanation.
>>>>>
>>>>> Paolo
>>>>
>>>> Well, I can work on a better comment.  The complex reduction I mean
>>>> here are cases like
>>>>
>>>> int x;
>>>> int y;
>>>> _Bool D1;
>>>> _Bool D2;
>>>> _Bool D3;
>>>> int R;
>>>>
>>>> D1 = x[0..1] != 0;
>>>> D2 = y[0..1] != 0;
>>>> D3 = D1 & D2
>>>> R = (int) D3
>>>>
>>>> (testcase is already present.  See tree-ssa/vrp47.c).
>>>>
>>>> As VRP in first pass produces (and replaces) to:
>>>>
>>>> D1 = (_Bool) x[0..1];
>>>> D2 = (_Bool) y[0..1];
>>>> D3 = D1 & D2
>>>> R = (int) D3
>>>>
>>>> Just in the second pass the reduction
>>>>
>>>> R = x[0..1] & y[0..1]
>>>
>>> So why wouldn't that happen during the first pass?  The first
>>> pass could change the IL to
>>>
>>> D1 = x[0..1] != 0;
>>> D2 = y[0..1] != 0;
>>> D3 = D1 & D2;
>>> R = x & y;
>>>
>>> if D3 only has a single use.
>> No, as D3 would need a type change, and this isn't possible.  If it
>> wasn't absolutely clear, this patch to VRP is necessary after patch 2,
>> as here D1, D2, and D3 have bool-type, and just R is of type int.
>
> In your example x, y and R are int, so it works with re-using R.

Well, if we add a pattern match with a prefixed cast, it works.  This
is what my patch actually does: it finds out that D1's and D2's
operands are of type int and match R's type, so it can reuse R to
simplify.  This pattern is better than nothing, but more complex
operations can't be handled without introducing new statements.  E.g.:

  int foo (int a, int b, int c)
  {
    if (a < 0 || a > 1 || b < 0 || b > 1 || c < 0 || c > 1)
      return -1;
    return (a != 0 | b != 0) | c != 0;
  }

Here we get:

  int a;
  int b;
  int c;
  _Bool D1, D2, D3, D4, D5;
  int R;
  ...
  D1 = (bool) a;
  D2 = (bool) b;
  D3 = (bool) c;
  D4 = D1 | D2;
  D5 = D4 | D3;
  R = (int) D5;

This can't be simplified by VRP without inserting a new statement.

>>>> can happen.  In general it is sad that VRP can't insert during pass
>>>> new statements right now.  This would cause issues in range-tables,
>>>> which aren't designed for insertations.  As otherwise, we could do
>>>> also simplify things like
>>>>
>>>> D1 = x[0..1] != 0;
>>>> D2 = y[0..1] == 0;
>>>> D3 = D1 & D2
>>>> R = (int) D3
>>>>
>>>> to
>>>> R = x[0..1] & (y[0..1] ^ 1)
>>>
>>> Why that ^ 1?  And why does that confuse the range tables
>>> if you re-use R?
>> Because we would need to insert a new statement and this isn't allowed
>> in VRP.  See the comments in VRP and substitute_and_fold.  VRP
>> disallows to remove statements or to insert new ones.
>
> That's not a hard limitation.

Hmm, ok.  I played with adding this for a while, but some later passes
like switch-range analysis and jump threading (IIRC) get confused by
it.  AFAIU the major culprits are the inserted ASSERTs, but maybe I am
wrong about this.

Kai
2011/7/8 Richard Guenther <richard.guenther@gmail.com>:
> On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> Index: gcc-head/gcc/tree-vrp.c
>> @@ -2232,6 +2235,7 @@ extract_range_from_binary_expr (value_ra
>>      some cases.  */
>>    if (code != BIT_AND_EXPR
>>        && code != TRUTH_AND_EXPR
>> +      && code != BIT_IOR_EXPR
>
> Huh?  So how would VARYING | x ever produce something better
> than VARYING?

Because BIT_IOR_EXPR might be a 1-bit precision operation and so
equivalent to TRUTH_OR_EXPR.  It might be worth adding BIT_XOR_EXPR
here too, as for a one-bit precision typed expression it is
equivalent to TRUTH_XOR_EXPR.

Kai
Index: gcc-head/gcc/tree-vrp.c
===================================================================
--- gcc-head.orig/gcc/tree-vrp.c
+++ gcc-head/gcc/tree-vrp.c
@@ -74,6 +74,9 @@ struct value_range_d
 
 typedef struct value_range_d value_range_t;
 
+/* This flag indicates that we are doing a second pass of VRP.  */
+static bool in_second_pass = false;
+
 /* Set of SSA names found live during the RPO traversal of the function
    for still active basic-blocks.  */
 static sbitmap *live;
@@ -2232,6 +2235,7 @@ extract_range_from_binary_expr (value_ra
      some cases.  */
   if (code != BIT_AND_EXPR
       && code != TRUTH_AND_EXPR
+      && code != BIT_IOR_EXPR
       && code != TRUTH_OR_EXPR
       && code != TRUNC_DIV_EXPR
       && code != FLOOR_DIV_EXPR
@@ -2291,6 +2295,8 @@ extract_range_from_binary_expr (value_ra
       else
         set_value_range_to_varying (vr);
     }
+  else if (code == BIT_IOR_EXPR)
+    set_value_range_to_varying (vr);
   else
     gcc_unreachable ();
 
@@ -2300,11 +2306,13 @@ extract_range_from_binary_expr (value_ra
   /* For integer ranges, apply the operation to each end of the
      range and see what we end up with.  */
   if (code == TRUTH_AND_EXPR
-      || code == TRUTH_OR_EXPR)
+      || code == TRUTH_OR_EXPR
+      || ((code == BIT_AND_EXPR || code == BIT_IOR_EXPR)
+          && TYPE_PRECISION (TREE_TYPE (op1)) == 1))
     {
       /* If one of the operands is zero, we know that the whole
          expression evaluates zero.  */
-      if (code == TRUTH_AND_EXPR
+      if ((code == TRUTH_AND_EXPR || code == BIT_AND_EXPR)
           && ((vr0.type == VR_RANGE
                && integer_zerop (vr0.min)
                && integer_zerop (vr0.max))
@@ -2317,7 +2325,7 @@ extract_range_from_binary_expr (value_ra
         }
       /* If one of the operands is one, we know that the whole
          expression evaluates one.  */
-      else if (code == TRUTH_OR_EXPR
+      else if ((code == TRUTH_OR_EXPR || code == BIT_IOR_EXPR)
                && ((vr0.type == VR_RANGE
                     && integer_onep (vr0.min)
                     && integer_onep (vr0.max))
@@ -2809,7 +2817,7 @@ extract_range_from_unary_expr (value_ran
      cannot easily determine a resulting range.
      */
   if (code == FIX_TRUNC_EXPR
       || code == FLOAT_EXPR
-      || code == BIT_NOT_EXPR
+      || (code == BIT_NOT_EXPR && TYPE_PRECISION (type) != 1)
       || code == CONJ_EXPR)
     {
       /* We can still do constant propagation here.  */
@@ -3976,7 +3984,9 @@ build_assert_expr_for (tree cond, tree v
       tree a = build2 (ASSERT_EXPR, TREE_TYPE (v), v, cond);
       assertion = gimple_build_assign (n, a);
     }
-  else if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+  else if (TREE_CODE (cond) == TRUTH_NOT_EXPR
+           || (TREE_CODE (cond) == BIT_NOT_EXPR
+               && TYPE_PRECISION (TREE_TYPE (cond)) == 1))
     {
       /* Given !V, build the assignment N = false.  */
       tree op0 = TREE_OPERAND (cond, 0);
@@ -4531,7 +4541,9 @@ register_edge_assert_for_1 (tree op, enu
         retval |= register_edge_assert_for_1 (gimple_assign_rhs2 (op_def),
                                               code, e, bsi);
     }
-  else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR)
+  else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR
+           || (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR
+               && TYPE_PRECISION (TREE_TYPE (op)) == 1))
     {
       /* Recurse, flipping CODE.  */
       code = invert_tree_comparison (code, false);
@@ -4617,6 +4629,9 @@ register_edge_assert_for (tree name, edg
       if (is_gimple_assign (def_stmt)
           && (gimple_assign_rhs_code (def_stmt) == TRUTH_OR_EXPR
+              || (gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR
+                  && INTEGRAL_TYPE_P (TREE_TYPE (name))
+                  && TYPE_PRECISION (TREE_TYPE (name)) == 1)
               /* For BIT_IOR_EXPR only if NAME == 0 both operands have
                  necessarily zero value.  */
               || (comp_code == EQ_EXPR
@@ -6747,19 +6762,96 @@ varying:
   return SSA_PROP_VARYING;
 }
 
+/* Returns operand1 of the ssa-name if its defining statement has
+   SSA_NAME as its rhs code; otherwise returns NULL_TREE.
+   */
+static tree
+ssa_name_get_inner_ssa_name_p (tree op)
+{
+  gimple stmt;
+
+  if (TREE_CODE (op) != SSA_NAME
+      || !is_gimple_assign (SSA_NAME_DEF_STMT (op)))
+    return NULL_TREE;
+  stmt = SSA_NAME_DEF_STMT (op);
+  if (gimple_assign_rhs_code (stmt) != SSA_NAME)
+    return NULL_TREE;
+  return gimple_assign_rhs1 (stmt);
+}
+
+/* Returns the operand of the cast operation, if OP is a type-conversion.
+   Otherwise return NULL_TREE.  */
+static tree
+ssa_name_get_cast_to_p (tree op)
+{
+  gimple stmt;
+
+  if (TREE_CODE (op) != SSA_NAME
+      || !is_gimple_assign (SSA_NAME_DEF_STMT (op)))
+    return NULL_TREE;
+  stmt = SSA_NAME_DEF_STMT (op);
+  if (!CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt)))
+    return NULL_TREE;
+  return gimple_assign_rhs1 (stmt);
+}
+
 /* Simplify boolean operations if the source is known
    to be already a boolean.  */
 
 static bool
 simplify_truth_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
 {
   enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
+  gimple stmt2 = stmt;
   tree val = NULL;
-  tree op0, op1;
+  tree op0, op1, cop0, cop1;
   value_range_t *vr;
   bool sop = false;
   bool need_conversion;
+  location_t loc = gimple_location (stmt);
 
   op0 = gimple_assign_rhs1 (stmt);
+  op1 = NULL_TREE;
+
+  /* Handle cases with prefixed type-cast.
+     */
+  if (CONVERT_EXPR_CODE_P (rhs_code)
+      && INTEGRAL_TYPE_P (TREE_TYPE (op0))
+      && TREE_CODE (op0) == SSA_NAME
+      && is_gimple_assign (SSA_NAME_DEF_STMT (op0))
+      && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt))))
+    {
+      stmt2 = SSA_NAME_DEF_STMT (op0);
+      op0 = gimple_assign_rhs1 (stmt2);
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (op0)))
+        return false;
+      rhs_code = gimple_assign_rhs_code (stmt2);
+      if (rhs_code != BIT_NOT_EXPR && rhs_code != TRUTH_NOT_EXPR
+          && rhs_code != TRUTH_AND_EXPR && rhs_code != BIT_AND_EXPR
+          && rhs_code != TRUTH_OR_EXPR && rhs_code != BIT_IOR_EXPR
+          && rhs_code != TRUTH_XOR_EXPR && rhs_code != BIT_XOR_EXPR
+          && rhs_code != NE_EXPR && rhs_code != EQ_EXPR)
+        return false;
+      if (rhs_code == BIT_AND_EXPR || rhs_code == BIT_IOR_EXPR
+          || rhs_code == BIT_XOR_EXPR || rhs_code == TRUTH_AND_EXPR
+          || rhs_code == TRUTH_OR_EXPR || rhs_code == TRUTH_XOR_EXPR
+          || rhs_code == NE_EXPR || rhs_code == EQ_EXPR)
+        op1 = gimple_assign_rhs2 (stmt2);
+      if (gimple_has_location (stmt2))
+        loc = gimple_location (stmt2);
+    }
+  else if (CONVERT_EXPR_CODE_P (rhs_code))
+    return false;
+  else if (rhs_code == BIT_AND_EXPR || rhs_code == BIT_IOR_EXPR
+           || rhs_code == BIT_XOR_EXPR || rhs_code == TRUTH_AND_EXPR
+           || rhs_code == TRUTH_OR_EXPR || rhs_code == TRUTH_XOR_EXPR
+           || rhs_code == NE_EXPR || rhs_code == EQ_EXPR)
+    op1 = gimple_assign_rhs2 (stmt);
+
+  /* ~X is only equivalent to !X if the type precision is one and X
+     has an integral type.
+     */
+  if (rhs_code == BIT_NOT_EXPR
+      && (!INTEGRAL_TYPE_P (TREE_TYPE (op0))
+          || TYPE_PRECISION (TREE_TYPE (op0)) != 1))
+    return false;
+
   if (TYPE_PRECISION (TREE_TYPE (op0)) != 1)
     {
       if (TREE_CODE (op0) != SSA_NAME)
@@ -6775,22 +6867,100 @@ simplify_truth_ops_using_ranges (gimple_
       return false;
     }
 
-  if (rhs_code == TRUTH_NOT_EXPR)
+  if (op1 && TREE_CODE (op1) != INTEGER_CST
+      && TYPE_PRECISION (TREE_TYPE (op1)) != 1)
+    {
+      vr = get_value_range (op1);
+      val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
+      if (!val || !integer_onep (val))
+        return false;
+
+      val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
+      if (!val || !integer_onep (val))
+        return false;
+    }
+
+  need_conversion =
+    !useless_type_conversion_p (TREE_TYPE (gimple_assign_lhs (stmt)),
+                                TREE_TYPE (op0));
+
+  /* Comparisons X != 0 get folded by a prior pass to (bool) X, but
+     X == 0 might not get folded for non-boolean X to (bool) (X ^ 1).
+     So for bitwise-binary operations we have three cases to handle:
+     a) ((bool) X) op ((bool) Y)
+     b) ((bool) X) op (Y == 0) OR (X == 0) op ((bool) Y)
+     c) (X == 0) op (Y == 0)
+     The latter two cases can't be handled for now, as the vr tables
+     would need to be adjusted.  */
+  if (need_conversion
+      && (rhs_code == BIT_XOR_EXPR
+          || rhs_code == BIT_AND_EXPR
+          || rhs_code == BIT_IOR_EXPR)
+      && TREE_CODE (op1) == SSA_NAME && TREE_CODE (op0) == SSA_NAME)
+    {
+      cop0 = ssa_name_get_cast_to_p (op0);
+      cop1 = ssa_name_get_cast_to_p (op1);
+      if (!cop0 || !cop1)
+        /* We would need a new statement for cases b and c, and we
+           can't do that due to the vr table, so bail out.
+           */
+        return false;
+
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (cop0))
+          || !types_compatible_p (TREE_TYPE (cop0), TREE_TYPE (cop1)))
+        return false;
+      need_conversion =
+        !useless_type_conversion_p (TREE_TYPE (gimple_assign_lhs (stmt)),
+                                    TREE_TYPE (cop0));
+      if (need_conversion)
+        return false;
+      op0 = cop0;
+      op1 = cop1;
+
+      /* We need to re-check the value ranges of the new operands
+         for 1-bit precision/range.  */
+      if (TYPE_PRECISION (TREE_TYPE (op0)) != 1)
+        {
+          if (TREE_CODE (op0) != SSA_NAME)
+            return false;
+          vr = get_value_range (op0);
+
+          val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
+          if (!val || !integer_onep (val))
+            return false;
+
+          val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
+          if (!val || !integer_onep (val))
+            return false;
+        }
+
+      if (op1 && TYPE_PRECISION (TREE_TYPE (op1)) != 1)
+        {
+          vr = get_value_range (op1);
+          val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
+          if (!val || !integer_onep (val))
+            return false;
+
+          val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
+          if (!val || !integer_onep (val))
+            return false;
+        }
+    }
+  else if (rhs_code == TRUTH_NOT_EXPR || rhs_code == BIT_NOT_EXPR)
     {
       rhs_code = NE_EXPR;
       op1 = build_int_cst (TREE_TYPE (op0), 1);
     }
   else
     {
-      op1 = gimple_assign_rhs2 (stmt);
-
       /* Reduce number of cases to handle.  */
       if (is_gimple_min_invariant (op1))
         {
           /* Exclude anything that should have been already folded.  */
           if (rhs_code != EQ_EXPR
               && rhs_code != NE_EXPR
-              && rhs_code != TRUTH_XOR_EXPR)
+              && rhs_code != TRUTH_XOR_EXPR
+              && rhs_code != BIT_XOR_EXPR)
             return false;
 
           if (!integer_zerop (op1)
@@ -6810,18 +6980,6 @@ simplify_truth_ops_using_ranges (gimple_
           /* Punt on A == B as there is no BIT_XNOR_EXPR.
             */
           if (rhs_code == EQ_EXPR)
             return false;
-
-          if (TYPE_PRECISION (TREE_TYPE (op1)) != 1)
-            {
-              vr = get_value_range (op1);
-              val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
-              if (!val || !integer_onep (val))
-                return false;
-
-              val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
-              if (!val || !integer_onep (val))
-                return false;
-            }
         }
     }
 
@@ -6838,7 +6996,8 @@ simplify_truth_ops_using_ranges (gimple_
         warning_at (location, OPT_Wstrict_overflow,
                     _("assuming signed overflow does not occur when "
                       "simplifying && or || to & or |"));
-      else
+      else if (rhs_code != BIT_AND_EXPR && rhs_code != BIT_IOR_EXPR
+               && rhs_code != BIT_XOR_EXPR)
         warning_at (location, OPT_Wstrict_overflow,
                     _("assuming signed overflow does not occur when "
                       "simplifying ==, != or ! to identity or ^"));
@@ -6859,16 +7018,21 @@ simplify_truth_ops_using_ranges (gimple_
     case TRUTH_AND_EXPR:
       rhs_code = BIT_AND_EXPR;
       break;
+    case BIT_AND_EXPR:
+      break;
     case TRUTH_OR_EXPR:
       rhs_code = BIT_IOR_EXPR;
+    case BIT_IOR_EXPR:
       break;
     case TRUTH_XOR_EXPR:
+    case BIT_XOR_EXPR:
     case NE_EXPR:
       if (integer_zerop (op1))
         {
           gimple_assign_set_rhs_with_ops (gsi,
                                           need_conversion ? NOP_EXPR : SSA_NAME,
                                           op0, NULL);
+          gimple_set_location (stmt, loc);
           update_stmt (gsi_stmt (*gsi));
           return true;
         }
@@ -6879,10 +7043,20 @@ simplify_truth_ops_using_ranges (gimple_
       gcc_unreachable ();
     }
 
+  /* We can't insert new expressions here, as otherwise the tracked
+     vr tables would get out of bounds.  */
   if (need_conversion)
     return false;
 
+  /* Reduce here SSA_NAME -> SSA_NAME.
+     */
+  while ((cop0 = ssa_name_get_inner_ssa_name_p (op0)) != NULL_TREE)
+    op0 = cop0;
+
+  while ((cop1 = ssa_name_get_inner_ssa_name_p (op1)) != NULL_TREE)
+    op1 = cop1;
+
   gimple_assign_set_rhs_with_ops (gsi, rhs_code, op0, op1);
+  gimple_set_location (stmt, loc);
   update_stmt (gsi_stmt (*gsi));
 
   return true;
 }
@@ -7263,6 +7437,9 @@ simplify_switch_using_ranges (gimple stm
   tree vec2;
   switch_update su;
 
+  if (in_second_pass)
+    return false;
+
   if (TREE_CODE (op) == SSA_NAME)
     {
       vr = get_value_range (op);
@@ -7390,6 +7567,7 @@ simplify_stmt_using_ranges (gimple_stmt_
         {
         case EQ_EXPR:
         case NE_EXPR:
+        case BIT_NOT_EXPR:
         case TRUTH_NOT_EXPR:
         case TRUTH_AND_EXPR:
         case TRUTH_OR_EXPR:
@@ -7425,13 +7603,21 @@ simplify_stmt_using_ranges (gimple_stmt_
              if all the bits being cleared are already cleared or
              all the bits being set are already set.  */
           if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
-            return simplify_bit_ops_using_ranges (gsi, stmt);
+            {
+              if (simplify_truth_ops_using_ranges (gsi, stmt))
+                return true;
+              return simplify_bit_ops_using_ranges (gsi, stmt);
+            }
           break;
 
         CASE_CONVERT:
           if (TREE_CODE (rhs1) == SSA_NAME
               && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
-            return simplify_conversion_using_ranges (stmt);
+            {
+              if (simplify_truth_ops_using_ranges (gsi, stmt))
+                return true;
+              return simplify_conversion_using_ranges (stmt);
+            }
           break;
 
         default:
@@ -7685,8 +7870,16 @@ vrp_finalize (void)
       fprintf (dump_file, "\n");
     }
 
+  /* We redo folding here one time to allow inspecting more
+     complex reductions.  */
+  substitute_and_fold (op_with_constant_singleton_value_range,
+                       vrp_fold_stmt, false);
+  /* We need to mark this second pass to avoid re-entering the same
+     edges for switch statements.  */
+  in_second_pass = true;
   substitute_and_fold (op_with_constant_singleton_value_range,
                        vrp_fold_stmt, false);
+  in_second_pass = false;
 
   if (warn_array_bounds)
     check_all_array_refs ();