From patchwork Thu Aug 4 10:45:50 2011
X-Patchwork-Submitter: Kai Tietz
X-Patchwork-Id: 108412
Date: Thu, 4 Aug 2011 12:45:50 +0200
From: Kai Tietz
To: Richard Guenther
Cc: Michael Matz, GCC Patches
Subject: Re: [patch tree-optimization]: Improve reassociation pass for bitwise-operations

2011/8/3 Richard Guenther :
> On Wed, Aug 3, 2011 at 3:32 PM, Kai Tietz wrote:
>> 2011/8/3 Michael Matz :
>>> Hi,
>>>
>>> On Tue, 2 Aug 2011, Kai Tietz wrote:
>>>
>>>> this patch improves the ability of the reassociation pass to simplify more
>>>> complex bitwise-binary operations with comparisons.  For this patch we
>>>> break up statements like (X | Y) != 0 to X != 0 | Y != 0, and
>>>> (X | Y) == 0 to the expanded X == 0 & Y == 0.
>>>> Additionally we expand logical-not expressions like ~(A & B) -> ~A | ~B,
>>>> ~(A | B) -> ~A & ~B, and ~(A ^ B) -> A ^ ~B.  These expansions are just
>>>> temporary for this pass and are later reversed by fold back to their
>>>> original form.
>>>
>>> Implement all of this in the normal reassoc machinery that already exists.
>>> Don't implement your own walker (which btw is superlinear because you
>>> recurse into both operands).  If no simplifications were possible you have
>>> to fold back the NOTs into the shorter sequences again, which conveniently
>>> reassoc already can do for negates and PLUS/MINUS chains.
>>>
>>> Hence extend the existing support for arithmetic operations to logical
>>> operations.
>>>
>>> Ciao,
>>> Michael.
>>
>> What do you mean by the existing machinery for negate expressions here?
>> This machinery doesn't work in this case and additionally doesn't provide
>> the opportunity to do a complete reassociation rewrite of
>> bitwise-expression chains.
>>
>> E.g. the case (~(a | c) & (b & ~d)) would be expanded (by the code in
>> the patch) to ~a & ~c & b & ~d.
>> This intermediate result is good for inspecting duplicates or inverted
>> operands in optimizations.
>> On rebuilding of the tree the result gets transformed (or should) to
>> ~(a | c | d) & b.
>
> It depends on the context whether a conjunctive or a disjunctive normal
> form is what you want.  As you are mixing two operation kinds, reassoc
> isn't the perfect place to deal with this (yet).
>
> You don't seem to stop when single-use chains end (which is where
> reassoc will give up) and even visit stmts multiple times that way.
> You need to at least do this "unfolding" in a lot more controlled manner.
>
> Richard.
Hello,

This revised patch takes care that we don't recursively visit statements
which aren't modified.  Additionally it takes care that we only normalize
statements with a single use.

ChangeLog

2011-08-04  Kai Tietz

	* tree-ssa-reassoc.c (build_and_add_sum): Add forward
	declaration and add support for unary expressions.
	(remove_visited_stmt_chain): Add forward declaration.
	(make_new_tmp_statement): New helper function.
	(expand_not_bitwise_binary): Likewise.
	(break_up_bitwise_combined_stmt): Likewise.
	(break_up_subtract_bb): Add call to
	break_up_bitwise_combined_stmt.

ChangeLog

2011-08-03  Kai Tietz

	* gcc.dg/tree-ssa/reassoc-24.c: New test.
	* gcc.dg/tree-ssa/reassoc-25.c: New test.
	* gcc.dg/tree-ssa/reassoc-26.c: New test.
	* gcc.dg/tree-ssa/reassoc-27.c: New test.
	* gcc.dg/tree-ssa/reassoc-28.c: New test.
	* gcc.dg/tree-ssa/reassoc-29.c: New test.

Bootstrapped and regression-tested for all languages (including Ada and
Obj-C++) on host x86_64-pc-linux-gnu.  OK to apply?

Regards,
Kai

Index: gcc/gcc/tree-ssa-reassoc.c
===================================================================
--- gcc.orig/gcc/tree-ssa-reassoc.c
+++ gcc/gcc/tree-ssa-reassoc.c
@@ -41,6 +41,10 @@ along with GCC; see the file COPYING3.
 #include "cfgloop.h"
 #include "flags.h"
 
+/* Forwarders.  */
+static gimple build_and_add_sum (tree, tree, tree, enum tree_code);
+static void remove_visited_stmt_chain (tree);
+
 /* This is a simple global reassociation pass.  It is, in part, based
    on the LLVM pass of the same name (They do some things more/less
    than we do, in different orders, etc).
@@ -48,7 +52,9 @@ along with GCC; see the file COPYING3.
    It consists of five steps:
 
    1. Breaking up subtract operations into addition + negate, where
-      it would promote the reassociation of adds.
+      it would promote the reassociation of adds.  Additionally breaking
+      up combined expressions made out of boolean-typed bitwise expressions
+      to improve simplification.
 
    2.
      Left linearization of the expression trees, so that (A+B)+(C+D)
      becomes (((A+B)+C)+D), which is easier for us to rewrite later.
@@ -554,6 +560,232 @@ get_unary_op (tree name, enum tree_code
     return NULL_TREE;
 }
 
+/* Create a temporary register expression with type TYPE, tree code CODE, and
+   operands OP1 and OP2.  If REF_DEF is a valid gimple statement, we use its
+   location for the newly generated temporary.
+   Returns the left-hand side of the newly generated temporary register.  */
+
+static tree
+make_new_tmp_statement (tree type, enum tree_code code, tree op1, tree op2,
+                        gimple ref_def)
+{
+  gimple sum;
+  tree tmpvar = create_tmp_reg (type, NULL);
+  add_referenced_var (tmpvar);
+  sum = build_and_add_sum (tmpvar, op1, op2, code);
+  if (ref_def)
+    gimple_set_location (sum, gimple_location (ref_def));
+  return gimple_get_lhs (sum);
+}
+
+/* Perform a logical-not operation on tree LHS with optional definition
+   statement EXPR.  TYPE is a boolean kind with 1-bit precision.  */
+
+static tree
+expand_not_bitwise_binary (tree type, tree lhs, gimple expr)
+{
+  enum tree_code code = ERROR_MARK;
+  tree op1 = NULL, op2 = NULL;
+  gimple s1 = NULL, s2 = NULL;
+  gimple_stmt_iterator gsi;
+
+  if (TREE_CODE (lhs) == INTEGER_CST)
+    return fold_build1 (BIT_NOT_EXPR, type, lhs);
+
+  if (expr && is_gimple_assign (expr))
+    code = gimple_assign_rhs_code (expr);
+
+  /* If the statement's lhs isn't single-use, we don't want to modify it.
+     So we can only do the default-case operation for it.  */
+  if (code != ERROR_MARK && !has_single_use (lhs))
+    code = ERROR_MARK;
+
+  if (TREE_CODE_CLASS (code) == tcc_comparison
+      || code == BIT_XOR_EXPR
+      || code == BIT_AND_EXPR
+      || code == BIT_IOR_EXPR)
+    {
+      op1 = gimple_assign_rhs1 (expr);
+      op2 = gimple_assign_rhs2 (expr);
+    }
+  else if (code == BIT_NOT_EXPR)
+    op1 = gimple_assign_rhs1 (expr);
+  else
+    return make_new_tmp_statement (type, BIT_NOT_EXPR, lhs, NULL_TREE, expr);
+
+  /* ~(~X) -> X.
     */
+  if (code == BIT_NOT_EXPR)
+    return op1;
+
+  /* Invert the comparison if possible, otherwise fall through to the
+     default case.  */
+  else if (TREE_CODE_CLASS (code) == tcc_comparison)
+    {
+      enum tree_code ncode;
+      ncode = invert_tree_comparison (code,
+                                      HONOR_NANS (TYPE_MODE (type)));
+      if (ncode != ERROR_MARK)
+        return make_new_tmp_statement (type, ncode, op1, op2, expr);
+    }
+  /* ~(A & B) -> ~A | ~B.  */
+  else if (code == BIT_AND_EXPR)
+    {
+      if (TREE_CODE (op1) == SSA_NAME)
+        s1 = SSA_NAME_DEF_STMT (op1);
+      if (TREE_CODE (op2) == SSA_NAME)
+        s2 = SSA_NAME_DEF_STMT (op2);
+      /* We have to do the inversion in one step to avoid looping.  */
+      op1 = expand_not_bitwise_binary (type, op1, s1);
+      op2 = expand_not_bitwise_binary (type, op2, s2);
+      if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2))
+        return op1;
+      else if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2))
+        return op2;
+      return make_new_tmp_statement (type, BIT_IOR_EXPR, op1, op2, expr);
+    }
+  /* ~(A | B) -> ~A & ~B.  */
+  else if (code == BIT_IOR_EXPR)
+    {
+      if (TREE_CODE (op1) == SSA_NAME)
+        s1 = SSA_NAME_DEF_STMT (op1);
+      if (TREE_CODE (op2) == SSA_NAME)
+        s2 = SSA_NAME_DEF_STMT (op2);
+      op1 = expand_not_bitwise_binary (type, op1, s1);
+      op2 = expand_not_bitwise_binary (type, op2, s2);
+      if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2))
+        return op2;
+      else if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2))
+        return op1;
+      return make_new_tmp_statement (type, BIT_AND_EXPR, op1, op2, expr);
+    }
+  /* ~(A ^ B) -> ~A ^ B.  Handle special cases here for B being
+     an integer constant or a logical-not.
     */
+  else if (code == BIT_XOR_EXPR)
+    {
+      if (TREE_CODE (op1) == SSA_NAME)
+        s1 = SSA_NAME_DEF_STMT (op1);
+      if (TREE_CODE (op2) == SSA_NAME)
+        s2 = SSA_NAME_DEF_STMT (op2);
+      if (TREE_CODE (op2) != INTEGER_CST
+          && (!s2 || !is_gimple_assign (s2)
+              || gimple_assign_rhs_code (s2) != BIT_NOT_EXPR))
+        op1 = expand_not_bitwise_binary (type, op1, s1);
+      else
+        op2 = expand_not_bitwise_binary (type, op2, s2);
+
+      if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2))
+        return op1;
+
+      gsi = gsi_for_stmt (expr);
+      if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2))
+        return make_new_tmp_statement (type, BIT_NOT_EXPR, op1, NULL, expr);
+      return make_new_tmp_statement (type, BIT_XOR_EXPR, op1, op2, expr);
+    }
+
+  /* Default case: lhs -> ~lhs.  */
+  return make_new_tmp_statement (type, BIT_NOT_EXPR, lhs, NULL_TREE, expr);
+}
+
+/* Break up statement STMT if it is a combined expression made out of
+   bitwise operations.  Handles expansion of (A | B) !=/== 0 and
+   ~(A op B).  */
+
+static bool
+break_up_bitwise_combined_stmt (gimple stmt)
+{
+  tree op1, op2;
+  gimple op1_def, op2_def;
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  gimple_stmt_iterator gsi;
+
+  /* Don't do anything for non-integral types.  */
+  if (!INTEGRAL_TYPE_P (type))
+    return false;
+
+  op1 = gimple_assign_rhs1 (stmt);
+  if (TREE_CODE (op1) != SSA_NAME)
+    return false;
+  op1_def = SSA_NAME_DEF_STMT (op1);
+  if (!op1_def
+      || !is_gimple_assign (op1_def)
+      || !has_single_use (op1))
+    return false;
+
+  if (code == NE_EXPR || code == EQ_EXPR)
+    {
+      tree zero = gimple_assign_rhs2 (stmt);
+      tree old_op = op1;
+
+      /* Check that the right-hand operand has an integral type, is
+         not a boolean, and is constant zero.  */
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (zero))
+          || TREE_CODE (TREE_TYPE (zero)) == BOOLEAN_TYPE
+          || !integer_zerop (zero))
+        return false;
+      /* Is the left-hand operand a bitwise-or expression?
         */
+      if (gimple_assign_rhs_code (op1_def) != BIT_IOR_EXPR)
+        return false;
+      op1 = make_new_tmp_statement (type, code,
+                                    gimple_assign_rhs1 (op1_def), zero,
+                                    stmt);
+      op2 = make_new_tmp_statement (type, code,
+                                    gimple_assign_rhs2 (op1_def), zero,
+                                    stmt);
+
+      gsi = gsi_for_stmt (stmt);
+      gimple_assign_set_rhs_with_ops (&gsi, (code == NE_EXPR ? BIT_IOR_EXPR
+                                             : BIT_AND_EXPR),
+                                      op1, op2);
+      update_stmt (gsi_stmt (gsi));
+      remove_visited_stmt_chain (old_op);
+      return true;
+    }
+  /* Handle expansion of ~X.  */
+  else if (code == BIT_NOT_EXPR)
+    {
+      tree old_op;
+      enum tree_code inner_code = gimple_assign_rhs_code (op1_def);
+      if (inner_code != BIT_AND_EXPR && inner_code != BIT_IOR_EXPR
+          && inner_code != BIT_XOR_EXPR)
+        return false;
+      old_op = op1;
+      op1 = gimple_assign_rhs1 (op1_def);
+      op2 = gimple_assign_rhs2 (op1_def);
+      op1_def = op2_def = NULL;
+      if (TREE_CODE (op1) != SSA_NAME
+          || (op1_def = SSA_NAME_DEF_STMT (op1)) == NULL
+          || !is_gimple_assign (op1_def))
+        op1_def = NULL;
+      if (TREE_CODE (op2) != SSA_NAME
+          || (op2_def = SSA_NAME_DEF_STMT (op2)) == NULL
+          || !is_gimple_assign (op2_def))
+        op2_def = NULL;
+      if (inner_code == BIT_XOR_EXPR)
+        {
+          if (TREE_CODE (op2) != INTEGER_CST
+              && (!op2_def || !is_gimple_assign (op2_def)
+                  || !has_single_use (op2)
+                  || gimple_assign_rhs_code (op2_def) != BIT_NOT_EXPR))
+            op1 = expand_not_bitwise_binary (type, op1, op1_def);
+          else
+            op2 = expand_not_bitwise_binary (type, op2, op2_def);
+        }
+      else
+        {
+          op1 = expand_not_bitwise_binary (type, op1, op1_def);
+          op2 = expand_not_bitwise_binary (type, op2, op2_def);
+          inner_code = (inner_code == BIT_AND_EXPR ?
BIT_IOR_EXPR
+                        : BIT_AND_EXPR);
+        }
+      gsi = gsi_for_stmt (stmt);
+      gimple_assign_set_rhs_with_ops (&gsi, inner_code, op1, op2);
+      update_stmt (gsi_stmt (gsi));
+      remove_visited_stmt_chain (old_op);
+      return true;
+    }
+  return false;
+}
+
 /* If CURR and LAST are a pair of ops that OPCODE allows us to eliminate
    through equivalences, do so, remove them from OPS, and return true.
    Otherwise, return false.  */
@@ -1015,8 +1247,8 @@ zero_one_operation (tree *def, enum tree
   while (1);
 }
 
-/* Builds one statement performing OP1 OPCODE OP2 using TMPVAR for
-   the result.  Places the statement after the definition of either
+/* Builds one statement performing OP1 OPCODE OP2, or OPCODE OP1, using
+   TMPVAR for the result.  Places the statement after the definition of either
    OP1 or OP2.  Returns the new statement.  */
 
 static gimple
@@ -1035,7 +1267,7 @@ build_and_add_sum (tree tmpvar, tree op1
   /* Find an insertion place and insert.  */
   if (TREE_CODE (op1) == SSA_NAME)
     op1def = SSA_NAME_DEF_STMT (op1);
-  if (TREE_CODE (op2) == SSA_NAME)
+  if (op2 && TREE_CODE (op2) == SSA_NAME)
     op2def = SSA_NAME_DEF_STMT (op2);
   if ((!op1def || gimple_nop_p (op1def))
       && (!op2def || gimple_nop_p (op2def)))
@@ -2133,6 +2365,17 @@ can_reassociate_p (tree op)
    we want to break up k = t - q, but we won't until we've transformed q
    = b - r, which won't be broken up until we transform b = c - d.
 
+   Break up comparison !=/== 0 operations of bitwise-or operations to be
+   able to optimize within combined conditions.
+   (A | B) != 0 -> (A != 0) || (B != 0)
+   (A | B) == 0 -> (A == 0) && (B == 0)
+
+   Break up logical-not expressions of bitwise boolean-typed and/or/xor
+   operations to be able to optimize within combined conditions.
+   ~(A | B) -> ~A & ~B
+   ~(A & B) -> ~A | ~B
+   ~(A ^ B) -> A ^ ~B (special case if B is a constant)
+
    En passant, clear the GIMPLE visited flag on every statement.
 */
 
static void
@@ -2141,21 +2384,32 @@ break_up_subtract_bb (basic_block bb)
   gimple_stmt_iterator gsi;
   basic_block son;
 
-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
     {
       gimple stmt = gsi_stmt (gsi);
       gimple_set_visited (stmt, false);
-
-      if (!is_gimple_assign (stmt)
-          || !can_reassociate_p (gimple_assign_lhs (stmt)))
+      if (!is_gimple_assign (stmt))
+        {
+          gsi_next (&gsi);
+          continue;
+        }
+      if (break_up_bitwise_combined_stmt (stmt))
         continue;
+      if (!can_reassociate_p (gimple_assign_lhs (stmt)))
+        {
+          gsi_next (&gsi);
+          continue;
+        }
 
       /* Look for simple gimple subtract operations.  */
       if (gimple_assign_rhs_code (stmt) == MINUS_EXPR)
        {
          if (!can_reassociate_p (gimple_assign_rhs1 (stmt))
              || !can_reassociate_p (gimple_assign_rhs2 (stmt)))
-           continue;
+           {
+             gsi_next (&gsi);
+             continue;
+           }
 
          /* Check for a subtract used only in an addition.  If this
             is the case, transform it into add of a negate for better
@@ -2167,6 +2421,7 @@ break_up_subtract_bb (basic_block bb)
       else if (gimple_assign_rhs_code (stmt) == NEGATE_EXPR
               && can_reassociate_p (gimple_assign_rhs1 (stmt)))
        VEC_safe_push (tree, heap, plus_negates, gimple_assign_lhs (stmt));
+      gsi_next (&gsi);
     }
 
   for (son = first_dom_son (CDI_DOMINATORS, bb);
       son;
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a != 0 & c != 0 & b != 0;
+  int r2 = a == 0 | b != 0 | d == 0;
+  return (r1 != 0 & r2 == 0);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c
===================================================================
--- /dev/null
+++
gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a != 0 & c != 0 & b != 0;
+  int r2 = a == 0 | b != 0 | d == 0;
+  return (r1 == 0 | r2 != 0);
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = (a | b | c) == 0;
+  int r2 = (a | d | c) != 0 | b == 0;
+  return (r1 == 0 | r2 != 0);
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-27.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-27.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a & ~c & b;
+  int r2 = ~a | b | ~d;
+  return (r1 & ~r2);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-28.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-28.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a & ~c & b;
+  int r2 = ~a | b | ~d;
+  return (~r1 | r2);
+}
+
+/* { dg-final { scan-tree-dump-times "return -1" 1 "optimized"} } */
+/* {
dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-29.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-29.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = ~(a | b | c);
+  int r2 = (a | d | c) | ~b;
+  return (~r1 | r2);
+}
+
+/* { dg-final { scan-tree-dump-times "return -1" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */