[tree-optimization] : Improve handling of conditional-branches on targets with high branch costs

2011/10/10 Richard Guenther <richard.guenther@gmail.com>:
> On Mon, Oct 10, 2011 at 5:07 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/10/10 Richard Guenther <richard.guenther@gmail.com>:
>>> On Mon, Oct 10, 2011 at 4:06 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>> 2011/10/10 Richard Guenther <richard.guenther@gmail.com>:
>>>>> On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>>>>> Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF.  It
>>>>>> has to be checked that the LHS code is same as outer CODE, as
>>>>>> otherwise we wouldn't apply different TRUTH-IF only on inner RHS of
>>>>>> LHS, which is of course wrong.
>>>>>>
>>>>>> Index: gcc/gcc/fold-const.c
>>>>>> ===================================================================
>>>>>> --- gcc.orig/gcc/fold-const.c
>>>>>> +++ gcc/gcc/fold-const.c
>>>>>> @@ -111,14 +111,13 @@ static tree decode_field_reference (loca
>>>>>>                                    tree *, tree *);
>>>>>>  static int all_ones_mask_p (const_tree, int);
>>>>>>  static tree sign_bit_p (tree, const_tree);
>>>>>> -static int simple_operand_p (const_tree);
>>>>>> +static int simple_operand_p (tree);
>>>>>>  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
>>>>>>  static tree range_predecessor (tree);
>>>>>>  static tree range_successor (tree);
>>>>>>  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
>>>>>>  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
>>>>>> tree, tree);
>>>>>>  static tree unextend (tree, int, int, tree);
>>>>>> -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
>>>>>>  static tree optimize_minmax_comparison (location_t, enum tree_code,
>>>>>>                                        tree, tree, tree);
>>>>>>  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
>>>>>> @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
>>>>>>   return lhs;
>>>>>>  }
>>>>>>
>>>>>> -/* Subroutine for fold_truthop: decode a field reference.
>>>>>> +/* Subroutine for fold_truth_andor_1: decode a field reference.
>>>>>>
>>>>>>    If EXP is a comparison reference, we return the innermost reference.
>>>>>>
>>>>>> @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val)
>>>>>>   return NULL_TREE;
>>>>>>  }
>>>>>>
>>>>>> -/* Subroutine for fold_truthop: determine if an operand is simple enough
>>>>>> +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
>>>>>>    to be evaluated unconditionally.  */
>>>>>>
>>>>>>  static int
>>>>>> -simple_operand_p (const_tree exp)
>>>>>> +simple_operand_p (tree exp)
>>>>>>  {
>>>>>> +  enum tree_code code;
>>>>>>   /* Strip any conversions that don't change the machine mode.  */
>>>>>>   STRIP_NOPS (exp);
>>>>>>
>>>>>> +  code = TREE_CODE (exp);
>>>>>> +
>>>>>> +  /* Handle some trivials   */
>>>>>> +  if (TREE_CODE_CLASS (code) == tcc_comparison)
>>>>>> +    return (tree_could_trap_p (exp)
>>>>>> +           && simple_operand_p (TREE_OPERAND (exp, 0))
>>>>>> +           && simple_operand_p (TREE_OPERAND (exp, 1)));
>>>>>
>>>>> And that's still wrong.
>>>>>
>>>>> Stopped reading here.
>>>>>
>>>>> Richard.
>>>>
>>>> Oh, there is a "not" missing.  I didn't spot that, sorry.
>>>>
>>>> To the point why we need to handle comparisons within simple_operand_p.
>>>>
>>>> If we reject comparisons and logical not here, we won't have any
>>>> branching optimization anymore, as the patch moves this into
>>>> fold_truth_andor.
>>>>
>>>> With compares and logic-not rejected in simple_operand_p, the result
>>>> for the following example is:
>>>
>>> But you change what simple_operand_p accepts and thus change what
>>> fold_truthop accepts as operands to its simplifications.
>>>
>>> Richard.
>>
>> Well, not really.  I assume you mean fold_truth_andor_1 (aka fold_truthop).
>>
>> It checks for
>> ...
>>  if (TREE_CODE_CLASS (lcode) != tcc_comparison
>>      || TREE_CODE_CLASS (rcode) != tcc_comparison)
>>    return 0;
>> ...
>> before checking for simple_operand_p.  So there is actually no change.
>> It might be of some interest to add support for logic-not here, but
>> that would be material for a different patch.
>> So, as before, it won't operate on anything other than comparisons.
>
> Sure, because simple_operand_p is checked on the comparison
> operands, not the comparison itself.
>
> Richard.

Right, with this we would allow things like (a != b) < (c != d), etc.
It seems that only some cases of a comparison within a comparison are
folded here (e.g. (a == b) == (c == d) -> (a == b) ^ (c == d)).  Well,
that is material for a different patch.

To ensure that simple_operand_p behaves as before in all cases except
branching AND/OR chains, I added an additional argument to this
function that enables looking into comparisons.
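
As a rough illustration of what the new argument does, here is a minimal
sketch on a toy expression type.  Everything in it (struct toy_expr, the
TOY_* codes, toy_simple_operand_p) is hypothetical and is not GCC's tree
API; it also leaves out the trapping, volatile and addressable checks
that the real function performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature expression node; this is not GCC's `tree'.  */
enum toy_code { TOY_CONST, TOY_VAR, TOY_CALL, TOY_CMP, TOY_NOT };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;	/* First operand, or NULL.  */
  const struct toy_expr *op1;	/* Second operand, or NULL.  */
};

/* Return true if EXP is cheap and safe to evaluate unconditionally.
   With IN_COMPARES false, only leaves qualify (the old behaviour).
   With IN_COMPARES true, comparisons and logical-not also qualify,
   provided all of their operands do.  */
static bool
toy_simple_operand_p (const struct toy_expr *exp, bool in_compares)
{
  if (in_compares)
    switch (exp->code)
      {
      case TOY_CMP:
	return (toy_simple_operand_p (exp->op0, true)
		&& toy_simple_operand_p (exp->op1, true));
      case TOY_NOT:
	return toy_simple_operand_p (exp->op0, true);
      default:
	break;
      }

  /* Leaves only: constants and plain variables.  */
  return exp->code == TOY_CONST || exp->code == TOY_VAR;
}
```

With in_compares false, a comparison node is rejected as before; with
in_compares true, it is accepted when both of its operands are simple.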

Regards,
Kai

 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
@@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
 }
 
-/* Subroutine for fold_truthop: decode a field reference.
+/* Subroutine for fold_truth_andor_1: decode a field reference.

    If EXP is a comparison reference, we return the innermost reference.

@@ -3668,17 +3667,48 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
 }

-/* Subroutine for fold_truthop: determine if an operand is simple enough
-   to be evaluated unconditionally.  */
+/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
+   to be evaluated unconditionally.
+   If IN_COMPARES is TRUE, then we assume comparisons and logic-not
+   operations are simple, if their operands are simple.  */

-static int
-simple_operand_p (const_tree exp)
+static bool
+simple_operand_p (tree exp, bool in_compares)
 {
+  enum tree_code code;
   /* Strip any conversions that don't change the machine mode.  */
   STRIP_NOPS (exp);

+  code = TREE_CODE (exp);
+
+  if (in_compares)
+    {
+      if (TREE_CODE_CLASS (code) == tcc_comparison)
+	return (!tree_could_trap_p (exp)
+		&& simple_operand_p (TREE_OPERAND (exp, 0), true)
+		&& simple_operand_p (TREE_OPERAND (exp, 1), true));
+
+      if (FLOAT_TYPE_P (TREE_TYPE (exp))
+	  && tree_could_trap_p (exp))
+	return false;
+
+      switch (code)
+	{
+	case SSA_NAME:
+	  return true;
+	case TRUTH_NOT_EXPR:
+	  return simple_operand_p (TREE_OPERAND (exp, 0), true);
+	case BIT_NOT_EXPR:
+	  if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
+	    return false;
+	  return simple_operand_p (TREE_OPERAND (exp, 0), true);
+	default:
+	  break;
+	}
+    }
+
   return (CONSTANT_CLASS_P (exp)
-	  || TREE_CODE (exp) == SSA_NAME
+  	  || code == SSA_NAME
 	  || (DECL_P (exp)
 	      && ! TREE_ADDRESSABLE (exp)
 	      && ! TREE_THIS_VOLATILE (exp)
@@ -4858,7 +4888,7 @@ fold_range_test (location_t loc, enum tr
       /* If simple enough, just rewrite.  Otherwise, make a SAVE_EXPR
 	 unless we are at top level or LHS contains a PLACEHOLDER_EXPR, in
 	 which cases we can't do this.  */
-      if (simple_operand_p (lhs))
+      if (simple_operand_p (lhs, false))
 	return build2_loc (loc, code == TRUTH_ANDIF_EXPR
 			   ? TRUTH_AND_EXPR : TRUTH_OR_EXPR,
 			   type, op0, op1);
@@ -4888,7 +4918,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
 }
 
-/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
+/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
    bit value.  Arrange things so the extra bits will be set to zero if and
    only if C is signed-extended to its full width.  If MASK is nonzero,
    it is an INTEGER_CST that should be AND'ed with the extra bits.  */
@@ -5025,8 +5055,8 @@ merge_truthop_with_opposite_arm (locatio
    We return the simplified tree or 0 if no optimization is possible.  */

 static tree
-fold_truthop (location_t loc, enum tree_code code, tree truth_type,
-	      tree lhs, tree rhs)
+fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
+		    tree lhs, tree rhs)
 {
   /* If this is the "or" of two comparisons, we can do something if
      the comparisons are NE_EXPR.  If this is the "and", we can do something
@@ -5054,8 +5084,6 @@ fold_truthop (location_t loc, enum tree_
   tree lntype, rntype, result;
   HOST_WIDE_INT first_bit, end_bit;
   int volatilep;
-  tree orig_lhs = lhs, orig_rhs = rhs;
-  enum tree_code orig_code = code;

   /* Start by getting the comparison codes.  Fail if anything is volatile.
      If one operand is a BIT_AND_EXPR with the constant one, treat it as if
@@ -5091,8 +5119,8 @@ fold_truthop (location_t loc, enum tree_
   rr_arg = TREE_OPERAND (rhs, 1);

   /* Simplify (x<y) && (x==y) into (x<=y) and related optimizations.  */
-  if (simple_operand_p (ll_arg)
-      && simple_operand_p (lr_arg))
+  if (simple_operand_p (ll_arg, false)
+      && simple_operand_p (lr_arg, false))
     {
       if (operand_equal_p (ll_arg, rl_arg, 0)
           && operand_equal_p (lr_arg, rr_arg, 0))
@@ -5119,14 +5147,13 @@ fold_truthop (location_t loc, enum tree_
   /* If the RHS can be evaluated unconditionally and its operands are
      simple, it wins to evaluate the RHS unconditionally on machines
      with expensive branches.  In this case, this isn't a comparison
-     that can be merged.  Avoid doing this if the RHS is a floating-point
-     comparison since those can trap.  */
+     that can be merged.  */

   if (BRANCH_COST (optimize_function_for_speed_p (cfun),
 		   false) >= 2
       && ! FLOAT_TYPE_P (TREE_TYPE (rl_arg))
-      && simple_operand_p (rl_arg)
-      && simple_operand_p (rr_arg))
+      && simple_operand_p (rl_arg, false)
+      && simple_operand_p (rr_arg, false))
     {
       /* Convert (a != 0) || (b != 0) into (a | b) != 0.  */
       if (code == TRUTH_OR_EXPR
@@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_
 			   build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
 				   ll_arg, rl_arg),
 			   build_int_cst (TREE_TYPE (ll_arg), 0));
-
-      if (LOGICAL_OP_NON_SHORT_CIRCUIT)
-	{
-	  if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs)
-	    return build2_loc (loc, code, truth_type, lhs, rhs);
-	  return NULL_TREE;
-	}
     }

   /* See if the comparisons can be merged.  Then get all the parameters for
@@ -8380,13 +8400,46 @@ fold_truth_andor (location_t loc, enum t
      lhs is another similar operation, try to merge its rhs with our
      rhs.  Then try to merge our lhs and rhs.  */
   if (TREE_CODE (arg0) == code
-      && 0 != (tem = fold_truthop (loc, code, type,
-				   TREE_OPERAND (arg0, 1), arg1)))
+      && 0 != (tem = fold_truth_andor_1 (loc, code, type,
+					 TREE_OPERAND (arg0, 1), arg1)))
     return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem);

-  if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0)
+  if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0)
     return tem;

+  if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR)
+      && (BRANCH_COST (optimize_function_for_speed_p (cfun),
+		       false) >= 2)
+      && !TREE_SIDE_EFFECTS (arg1)
+      && LOGICAL_OP_NON_SHORT_CIRCUIT
+      && simple_operand_p (arg1, true))
+    {
+      enum tree_code ncode = (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR
+						       : TRUTH_OR_EXPR);
+
+      /* We don't want to pack more than two leaves into a non-IF AND/OR.
+         If the tree code of the left-hand operand isn't an AND/OR-IF code
+         equal to CODE, then we don't want to add the right-hand operand.
+         If the inner right-hand side of the left-hand operand has
+         side effects or isn't simple, then we can't add to it, as
+         otherwise we might destroy the if-sequence.  */
+      if (TREE_CODE (arg0) == code
+      	  /* Needed for sequence points to handle trappings, and
+      	     side-effects.  */
+      	  && !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1))
+      	  && simple_operand_p (TREE_OPERAND (arg0, 1), true))
+       {
+         tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1),
+				arg1);
+         return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
+				 tem);
+       }
+     /* Needed for sequence points to handle trappings, and side-effects.  */
+     else if (!TREE_SIDE_EFFECTS (arg0)
+	      && simple_operand_p (arg0, true))
+       return fold_build2_loc (loc, ncode, type, arg0, arg1);
+    }
+
   return NULL_TREE;
 }

