Message ID | orcyi0iqbz.fsf_-_@lxoliva.fsfla.org |
---|---|
State | New |
Series | [v4] fold fold_truth_andor field merging into ifcombine |
On Mon, Dec 9, 2024 at 8:55 PM Alexandre Oliva <oliva@adacore.com> wrote:
>
> This patch introduces various improvements to the logic that merges
> field compares, while moving it into ifcombine.
>
> Before the patch, we could merge:
>
> (a.x1 EQNE b.x1) ANDOR (a.y1 EQNE b.y1)
>
> into something like:
>
> (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)
>
> if both of A's fields live within the same alignment boundaries, and
> so do B's, at the same relative positions. Constants may be used
> instead of the object B.
>
> The initial goal of this patch was to enable such combinations when a
> field crossed alignment boundaries, e.g. for packed types. We can't
> generally access such fields with a single memory access, so when we
> come across such a compare, we will attempt to combine each access
> separately.
>
> Some merging opportunities were missed because of right-shifts,
> compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
> narrowing conversions, especially after earlier merges. This patch
> introduces handlers for several cases involving these.
>
> The merging of multiple field accesses into wider bitfield-like
> accesses is undesirable to do too early in compilation, so we move it
> from folding to ifcombine, and guard its warnings with
> -Wtautological-compare, turned into a common flag.
>
> When the second of a noncontiguous pair of compares is the first that
> accesses a word, we may merge the first compare with part of the
> second compare that refers to the same word, keeping the compare of
> the remaining bits at the spot where the second compare used to be.
>
> Handling compares with non-constant fields was somewhat generalized
> from what fold used to do, now handling non-adjacent fields, even if a
> field of one object crosses an alignment boundary but the other
> doesn't.
>
>
> Regstrapped on x86_64-linux-gnu. The aarch64 CI had reported a
> regression with v3, so I'm also running a regstrap on aarch64-linux-gnu,
> but it will still be a while. Ok to install if it passes?

OK.

Thanks again for doing this,
Richard.

> Changed from v3:
>
> - renamed new predicates within gimple-fold.cc, split out check for
> integral constants, drop unneeded canonicalization
>
> - disabled ifcombine for gcc.target/aarch64/long_branch_1.c. I posted
> the incremental patch too soon and it had a thinko, fixed herein,
> sorry.
>
> Changes from v2:
>
> - fixed ICE when xor appeared only in the right-hand operand of a
> compare
>
> - simplified the interface of decode_field_reference. pand_mask can now
> be NULL, meaning mask operations are to be rejected, but this ended up
> unused. It no longer rejects plain SSA names as the expr to be
> combined.
>
> - drop unused variables from fold_truth_andor_for_ifcombine, and ensure
> right-hand masks are applied even on constants
>
> - normalize width of masks and constants represented as wide_int, fixing
> regressions introduced by the transition to wide_int. Don't bother to
> sign extend them, treat them as unsigned since it's all bitwise
> equality tests.
>
> - more tests; fixed thinko in v2-added test
>
> - guard warnings with -Wtautological-compare; made the option available
> in all languages
>
> - backpedal from match.pd matchers to open-coded matchers
>
> - improve vuse setting to avoid unwarranted assumptions about
> gimplification
>
> - simplify selection of types for combined operands
>
> - drop side effects tests not sensible in gimple
>
> - explain apparent presence of TRUTH_*IF_EXPR in gimple
>
> Changes from v1:
>
> - Noncontiguous ifcombine split out and already installed.
>
> - Do not attempt to place new loads next to the latest of the original
> loads. Only check that loads have the same vuses, and place loads
> next to either one, even if that's suboptimal.
>
> - Use gimple pattern matching. It turned out not to be very useful, but
> it enabled the elimination of redundant auxiliary functions.
>
> - Rewrote constants and masks using wide_int.
>
> - Work harder to gather and reuse location_t.
>
> - Rework BIT_XOR handling to avoid having to match patterns again.
>
> - Distinguish the cases of contiguous and noncontiguous conditions.
>
> - Comments, lots of comments.
>
> - More tests.
>
> - Dropped the warnings that were hitting i386 and rs6000 bootstraps.
> The new possibilities with ifcombine, noncontiguous at that, on top of
> more flexible field merging, made room for too many false positives.
>
> Requested but unchanged from v1:
>
> - fold_truth_andor_for_ifcombine (renamed from ...andor_maybe_separate)
> still builds and returns generic expressions. Since other
> ifcombine_ifandif combinations build generic exprs and have to go through
> regimplification anyway, I failed to see the point. Implementing that
> change would require some more work in tree-ssa-ifcombine.cc to support.
> I'd rather do that as a follow up if desired, please let me know.
>
> - it also remains a single bulky function. There's so much state that
> breaking it up doesn't seem to be helpful. Hopefully the added comments
> will help despite making it even bigger.
>
> - TBAA situation is unchanged, same as what's carried over from fold.
> I'm not sure what the concerns in your mind are, but if there are actual
> problems, they have long been there, and we'd better address them in
> both fold and in this bit now moved to ifcombine, ideally in a separate
> backportable patch.
>
>
> for gcc/ChangeLog
>
> * fold-const.cc (make_bit_field_ref): Export.
> (unextend, all_ones_mask_p): Drop.
> (decode_field_reference, fold_truth_andor_1): Move
> field compare merging logic...
> * gimple-fold.cc (fold_truth_andor_for_ifcombine): ... here,
> with -Wtautological-compare warning guards, and...
> (decode_field_reference): ... here. Rework for gimple.
> (gimple_convert_def_p, gimple_binop_def_p): New.
> (compute_split_boundary_from_align): New.
> (make_bit_field_load, build_split_load): New.
> (reuse_split_load): New.
> * fold-const.h (make_bit_field_ref): Declare.
> (fold_truth_andor_for_ifcombine): Declare.
> * tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
> fold_truth_andor_for_ifcombine.
> * common.opt (Wtautological-compare): Move here.
>
> for gcc/c-family/ChangeLog
>
> * c.opt (Wtautological-compare): Move to ../common.opt.
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/field-merge-1.c: New.
> * gcc.dg/field-merge-2.c: New.
> * gcc.dg/field-merge-3.c: New.
> * gcc.dg/field-merge-4.c: New.
> * gcc.dg/field-merge-5.c: New.
> * gcc.dg/field-merge-6.c: New.
> * gcc.dg/field-merge-7.c: New.
> * gcc.dg/field-merge-8.c: New.
> * gcc.dg/field-merge-9.c: New.
> * gcc.dg/field-merge-10.c: New.
> * gcc.dg/field-merge-11.c: New.
> * gcc.dg/field-merge-12.c: New.
> * gcc.target/aarch64/long_branch_1.c: Disable ifcombine.
> ---
> gcc/c-family/c.opt | 4
> gcc/common.opt | 4
> gcc/fold-const.cc | 512 ----------
> gcc/fold-const.h | 10
> gcc/gimple-fold.cc | 1148 ++++++++++++++++++++++
> gcc/testsuite/gcc.dg/field-merge-1.c | 64 +
> gcc/testsuite/gcc.dg/field-merge-10.c | 36 +
> gcc/testsuite/gcc.dg/field-merge-11.c | 32 +
> gcc/testsuite/gcc.dg/field-merge-12.c | 33 +
> gcc/testsuite/gcc.dg/field-merge-2.c | 31 +
> gcc/testsuite/gcc.dg/field-merge-3.c | 36 +
> gcc/testsuite/gcc.dg/field-merge-4.c | 40 +
> gcc/testsuite/gcc.dg/field-merge-5.c | 40 +
> gcc/testsuite/gcc.dg/field-merge-6.c | 26
> gcc/testsuite/gcc.dg/field-merge-7.c | 23
> gcc/testsuite/gcc.dg/field-merge-8.c | 25
> gcc/testsuite/gcc.dg/field-merge-9.c | 38 +
> gcc/testsuite/gcc.target/aarch64/long_branch_1.c | 2
> gcc/tree-ssa-ifcombine.cc | 14
> 19 files changed, 1604 insertions(+), 514 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-1.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-10.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-11.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-12.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-2.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-3.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-4.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-5.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-6.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-7.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-8.c
> create mode 100644 gcc/testsuite/gcc.dg/field-merge-9.c
>
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 0d255561b1bdc..be7916e957d66 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1477,10 +1477,6 @@ Wtemplates
>  C++ ObjC++ Var(warn_templates) Warning
>  Warn on primary template declaration.
>
> -Wtautological-compare
> -C ObjC C++ ObjC++ Var(warn_tautological_compare) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
> -Warn if a comparison always evaluates to true or false.
> -
>  Wtemplate-body
>  C++ ObjC++ Var(warn_template_body) Warning Init(1)
>  Diagnose errors when parsing a template.
> diff --git a/gcc/common.opt b/gcc/common.opt
> index bb226ac61e6a1..915ce5bffb4e0 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -812,6 +812,10 @@ Wsystem-headers
>  Common Var(warn_system_headers) Warning
>  Do not suppress warnings from system headers.
>
> +Wtautological-compare
> +Common Var(warn_tautological_compare) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
> +Warn if a comparison always evaluates to true or false.
> +
>  Wtrampolines
>  Common Var(warn_trampolines) Warning
>  Warn whenever a trampoline is generated.
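To make the transformation described in the cover letter concrete before the main code changes below, here is a minimal C sketch of the kind of source the merged compares target, loosely in the spirit of the new gcc.dg/field-merge-*.c tests. The struct, field, and function names are illustrative rather than taken from the actual testcases, and the comments describe the intended transformation, not guaranteed code generation on any particular target.

```c
/* Illustrative sketch only; names are made up, not from the new tests.  */

struct s
{
  unsigned char a;
  unsigned char b;
};

/* Both fields of X, and both of Y, sit within one naturally aligned word,
   so the two equality compares can be merged into a single masked word
   compare, conceptually ((word at X & MASK) == (word at Y & MASK)).  */
int
all_equal (struct s *x, struct s *y)
{
  return x->a == y->a && x->b == y->b;
}

struct __attribute__ ((packed)) p
{
  unsigned char tag;
  unsigned int val;  /* Straddles a 4-byte boundary within the object.  */
};

/* VAL cannot generally be loaded with a single aligned access, so rather
   than giving up, the pass aims to split the combined compare at the
   alignment boundary and merge whatever falls on each side of it.  */
int
tag_and_val_equal (struct p *x, struct p *y)
{
  return x->tag == y->tag && x->val == y->val;
}
```

With optimization enabled, conditions of this shape are what the new fold_truth_andor_for_ifcombine attempts to recombine from the ifcombine pass, rather than in early folding as fold_truth_andor_1 used to do.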
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index 1e8ae1ab493ba..6449664598641 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -137,7 +137,6 @@ static tree range_successor (tree); > static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); > static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code, > tree, tree, tree, tree); > -static tree unextend (tree, int, int, tree); > static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); > static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *); > static tree fold_binary_op_with_conditional_arg (location_t, > @@ -4711,7 +4710,7 @@ invert_truthvalue_loc (location_t loc, tree arg) > is the original memory reference used to preserve the alias set of > the access. */ > > -static tree > +tree > make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type, > HOST_WIDE_INT bitsize, poly_int64 bitpos, > int unsignedp, int reversep) > @@ -4961,136 +4960,6 @@ optimize_bit_field_compare (location_t loc, enum tree_code code, > return lhs; > } > > -/* Subroutine for fold_truth_andor_1: decode a field reference. > - > - If EXP is a comparison reference, we return the innermost reference. > - > - *PBITSIZE is set to the number of bits in the reference, *PBITPOS is > - set to the starting bit number. > - > - If the innermost field can be completely contained in a mode-sized > - unit, *PMODE is set to that mode. Otherwise, it is set to VOIDmode. > - > - *PVOLATILEP is set to 1 if the any expression encountered is volatile; > - otherwise it is not changed. > - > - *PUNSIGNEDP is set to the signedness of the field. > - > - *PREVERSEP is set to the storage order of the field. > - > - *PMASK is set to the mask used. This is either contained in a > - BIT_AND_EXPR or derived from the width of the field. > - > - *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. > - > - Return 0 if this is not a component reference or is one that we can't > - do anything with. */ > - > -static tree > -decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, > - HOST_WIDE_INT *pbitpos, machine_mode *pmode, > - int *punsignedp, int *preversep, int *pvolatilep, > - tree *pmask, tree *pand_mask) > -{ > - tree exp = *exp_; > - tree outer_type = 0; > - tree and_mask = 0; > - tree mask, inner, offset; > - tree unsigned_type; > - unsigned int precision; > - > - /* All the optimizations using this function assume integer fields. > - There are problems with FP fields since the type_for_size call > - below can fail for, e.g., XFmode. */ > - if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) > - return NULL_TREE; > - > - /* We are interested in the bare arrangement of bits, so strip everything > - that doesn't affect the machine mode. However, record the type of the > - outermost expression if it may matter below. 
*/ > - if (CONVERT_EXPR_P (exp) > - || TREE_CODE (exp) == NON_LVALUE_EXPR) > - outer_type = TREE_TYPE (exp); > - STRIP_NOPS (exp); > - > - if (TREE_CODE (exp) == BIT_AND_EXPR) > - { > - and_mask = TREE_OPERAND (exp, 1); > - exp = TREE_OPERAND (exp, 0); > - STRIP_NOPS (exp); STRIP_NOPS (and_mask); > - if (TREE_CODE (and_mask) != INTEGER_CST) > - return NULL_TREE; > - } > - > - poly_int64 poly_bitsize, poly_bitpos; > - inner = get_inner_reference (exp, &poly_bitsize, &poly_bitpos, &offset, > - pmode, punsignedp, preversep, pvolatilep); > - if ((inner == exp && and_mask == 0) > - || !poly_bitsize.is_constant (pbitsize) > - || !poly_bitpos.is_constant (pbitpos) > - || *pbitsize < 0 > - || offset != 0 > - || TREE_CODE (inner) == PLACEHOLDER_EXPR > - /* We eventually want to build a larger reference and need to take > - the address of this. */ > - || (!REFERENCE_CLASS_P (inner) && !DECL_P (inner)) > - /* Reject out-of-bound accesses (PR79731). */ > - || (! AGGREGATE_TYPE_P (TREE_TYPE (inner)) > - && compare_tree_int (TYPE_SIZE (TREE_TYPE (inner)), > - *pbitpos + *pbitsize) < 0)) > - return NULL_TREE; > - > - unsigned_type = lang_hooks.types.type_for_size (*pbitsize, 1); > - if (unsigned_type == NULL_TREE) > - return NULL_TREE; > - > - *exp_ = exp; > - > - /* If the number of bits in the reference is the same as the bitsize of > - the outer type, then the outer type gives the signedness. Otherwise > - (in case of a small bitfield) the signedness is unchanged. */ > - if (outer_type && *pbitsize == TYPE_PRECISION (outer_type)) > - *punsignedp = TYPE_UNSIGNED (outer_type); > - > - /* Compute the mask to access the bitfield. */ > - precision = TYPE_PRECISION (unsigned_type); > - > - mask = build_int_cst_type (unsigned_type, -1); > - > - mask = const_binop (LSHIFT_EXPR, mask, size_int (precision - *pbitsize)); > - mask = const_binop (RSHIFT_EXPR, mask, size_int (precision - *pbitsize)); > - > - /* Merge it with the mask we found in the BIT_AND_EXPR, if any. */ > - if (and_mask != 0) > - mask = fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type, > - fold_convert_loc (loc, unsigned_type, and_mask), mask); > - > - *pmask = mask; > - *pand_mask = and_mask; > - return inner; > -} > - > -/* Return nonzero if MASK represents a mask of SIZE ones in the low-order > - bit positions and MASK is SIGNED. */ > - > -static bool > -all_ones_mask_p (const_tree mask, unsigned int size) > -{ > - tree type = TREE_TYPE (mask); > - unsigned int precision = TYPE_PRECISION (type); > - > - /* If this function returns true when the type of the mask is > - UNSIGNED, then there will be errors. In particular see > - gcc.c-torture/execute/990326-1.c. There does not appear to be > - any documentation paper trail as to why this is so. But the pre > - wide-int worked with that restriction and it has been preserved > - here. */ > - if (size > precision || TYPE_SIGN (type) == UNSIGNED) > - return false; > - > - return wi::mask (size, false, precision) == wi::to_wide (mask); > -} > - > /* Subroutine for fold: determine if VAL is the INTEGER_CONST that > represents the sign bit of EXP's type. If EXP represents a sign > or zero extension, also test VAL against the unextended type. > @@ -6400,48 +6269,6 @@ fold_range_test (location_t loc, enum tree_code code, tree type, > return 0; > } > > -/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P > - bit value. Arrange things so the extra bits will be set to zero if and > - only if C is signed-extended to its full width. 
If MASK is nonzero, > - it is an INTEGER_CST that should be AND'ed with the extra bits. */ > - > -static tree > -unextend (tree c, int p, int unsignedp, tree mask) > -{ > - tree type = TREE_TYPE (c); > - int modesize = GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE (type)); > - tree temp; > - > - if (p == modesize || unsignedp) > - return c; > - > - /* We work by getting just the sign bit into the low-order bit, then > - into the high-order bit, then sign-extend. We then XOR that value > - with C. */ > - temp = build_int_cst (TREE_TYPE (c), > - wi::extract_uhwi (wi::to_wide (c), p - 1, 1)); > - > - /* We must use a signed type in order to get an arithmetic right shift. > - However, we must also avoid introducing accidental overflows, so that > - a subsequent call to integer_zerop will work. Hence we must > - do the type conversion here. At this point, the constant is either > - zero or one, and the conversion to a signed type can never overflow. > - We could get an overflow if this conversion is done anywhere else. */ > - if (TYPE_UNSIGNED (type)) > - temp = fold_convert (signed_type_for (type), temp); > - > - temp = const_binop (LSHIFT_EXPR, temp, size_int (modesize - 1)); > - temp = const_binop (RSHIFT_EXPR, temp, size_int (modesize - p - 1)); > - if (mask != 0) > - temp = const_binop (BIT_AND_EXPR, temp, > - fold_convert (TREE_TYPE (c), mask)); > - /* If necessary, convert the type back to match the type of C. */ > - if (TYPE_UNSIGNED (type)) > - temp = fold_convert (type, temp); > - > - return fold_convert (type, const_binop (BIT_XOR_EXPR, c, temp)); > -} > - > /* For an expression that has the form > (A && B) || ~B > or > @@ -6512,20 +6339,13 @@ merge_truthop_with_opposite_arm (location_t loc, tree op, tree cmpop, > lhs, rhs); > return NULL_TREE; > } > - > + > /* Find ways of folding logical expressions of LHS and RHS: > Try to merge two comparisons to the same innermost item. > Look for range tests like "ch >= '0' && ch <= '9'". > Look for combinations of simple terms on machines with expensive branches > and evaluate the RHS unconditionally. > > - For example, if we have p->a == 2 && p->b == 4 and we can make an > - object large enough to span both A and B, we can do this with a comparison > - against the object ANDed with the a mask. > - > - If we have p->a == q->a && p->b == q->b, we may be able to use bit masking > - operations to do this with one comparison. > - > We check for both normal comparisons and the BIT_AND_EXPRs made this by > function and the one above. > > @@ -6550,24 +6370,9 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, > convert EQ_EXPR to NE_EXPR so we need not reject the "wrong" > comparison for one-bit fields. 
*/ > > - enum tree_code wanted_code; > enum tree_code lcode, rcode; > tree ll_arg, lr_arg, rl_arg, rr_arg; > - tree ll_inner, lr_inner, rl_inner, rr_inner; > - HOST_WIDE_INT ll_bitsize, ll_bitpos, lr_bitsize, lr_bitpos; > - HOST_WIDE_INT rl_bitsize, rl_bitpos, rr_bitsize, rr_bitpos; > - HOST_WIDE_INT xll_bitpos, xlr_bitpos, xrl_bitpos, xrr_bitpos; > - HOST_WIDE_INT lnbitsize, lnbitpos, rnbitsize, rnbitpos; > - int ll_unsignedp, lr_unsignedp, rl_unsignedp, rr_unsignedp; > - int ll_reversep, lr_reversep, rl_reversep, rr_reversep; > - machine_mode ll_mode, lr_mode, rl_mode, rr_mode; > - scalar_int_mode lnmode, rnmode; > - tree ll_mask, lr_mask, rl_mask, rr_mask; > - tree ll_and_mask, lr_and_mask, rl_and_mask, rr_and_mask; > - tree l_const, r_const; > - tree lntype, rntype, result; > - HOST_WIDE_INT first_bit, end_bit; > - int volatilep; > + tree result; > > /* Start by getting the comparison codes. Fail if anything is volatile. > If one operand is a BIT_AND_EXPR with the constant one, treat it as if > @@ -6662,316 +6467,7 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, > build_int_cst (TREE_TYPE (ll_arg), 0)); > } > > - /* See if the comparisons can be merged. Then get all the parameters for > - each side. */ > - > - if ((lcode != EQ_EXPR && lcode != NE_EXPR) > - || (rcode != EQ_EXPR && rcode != NE_EXPR)) > - return 0; > - > - ll_reversep = lr_reversep = rl_reversep = rr_reversep = 0; > - volatilep = 0; > - ll_inner = decode_field_reference (loc, &ll_arg, > - &ll_bitsize, &ll_bitpos, &ll_mode, > - &ll_unsignedp, &ll_reversep, &volatilep, > - &ll_mask, &ll_and_mask); > - lr_inner = decode_field_reference (loc, &lr_arg, > - &lr_bitsize, &lr_bitpos, &lr_mode, > - &lr_unsignedp, &lr_reversep, &volatilep, > - &lr_mask, &lr_and_mask); > - rl_inner = decode_field_reference (loc, &rl_arg, > - &rl_bitsize, &rl_bitpos, &rl_mode, > - &rl_unsignedp, &rl_reversep, &volatilep, > - &rl_mask, &rl_and_mask); > - rr_inner = decode_field_reference (loc, &rr_arg, > - &rr_bitsize, &rr_bitpos, &rr_mode, > - &rr_unsignedp, &rr_reversep, &volatilep, > - &rr_mask, &rr_and_mask); > - > - /* It must be true that the inner operation on the lhs of each > - comparison must be the same if we are to be able to do anything. > - Then see if we have constants. If not, the same must be true for > - the rhs's. */ > - if (volatilep > - || ll_reversep != rl_reversep > - || ll_inner == 0 || rl_inner == 0 > - || ! operand_equal_p (ll_inner, rl_inner, 0)) > - return 0; > - > - if (TREE_CODE (lr_arg) == INTEGER_CST > - && TREE_CODE (rr_arg) == INTEGER_CST) > - { > - l_const = lr_arg, r_const = rr_arg; > - lr_reversep = ll_reversep; > - } > - else if (lr_reversep != rr_reversep > - || lr_inner == 0 || rr_inner == 0 > - || ! operand_equal_p (lr_inner, rr_inner, 0)) > - return 0; > - else > - l_const = r_const = 0; > - > - /* If either comparison code is not correct for our logical operation, > - fail. However, we can convert a one-bit comparison against zero into > - the opposite comparison against that bit being set in the field. */ > - > - wanted_code = (code == TRUTH_AND_EXPR ? EQ_EXPR : NE_EXPR); > - if (lcode != wanted_code) > - { > - if (l_const && integer_zerop (l_const) && integer_pow2p (ll_mask)) > - { > - /* Make the left operand unsigned, since we are only interested > - in the value of one bit. Otherwise we are doing the wrong > - thing below. */ > - ll_unsignedp = 1; > - l_const = ll_mask; > - } > - else > - return 0; > - } > - > - /* This is analogous to the code for l_const above. 
*/ > - if (rcode != wanted_code) > - { > - if (r_const && integer_zerop (r_const) && integer_pow2p (rl_mask)) > - { > - rl_unsignedp = 1; > - r_const = rl_mask; > - } > - else > - return 0; > - } > - > - /* See if we can find a mode that contains both fields being compared on > - the left. If we can't, fail. Otherwise, update all constants and masks > - to be relative to a field of that size. */ > - first_bit = MIN (ll_bitpos, rl_bitpos); > - end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); > - if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, > - TYPE_ALIGN (TREE_TYPE (ll_inner)), BITS_PER_WORD, > - volatilep, &lnmode)) > - return 0; > - > - lnbitsize = GET_MODE_BITSIZE (lnmode); > - lnbitpos = first_bit & ~ (lnbitsize - 1); > - lntype = lang_hooks.types.type_for_size (lnbitsize, 1); > - xll_bitpos = ll_bitpos - lnbitpos, xrl_bitpos = rl_bitpos - lnbitpos; > - > - if (ll_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > - { > - xll_bitpos = lnbitsize - xll_bitpos - ll_bitsize; > - xrl_bitpos = lnbitsize - xrl_bitpos - rl_bitsize; > - } > - > - ll_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, lntype, ll_mask), > - size_int (xll_bitpos)); > - rl_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, lntype, rl_mask), > - size_int (xrl_bitpos)); > - if (ll_mask == NULL_TREE || rl_mask == NULL_TREE) > - return 0; > - > - if (l_const) > - { > - l_const = fold_convert_loc (loc, lntype, l_const); > - l_const = unextend (l_const, ll_bitsize, ll_unsignedp, ll_and_mask); > - l_const = const_binop (LSHIFT_EXPR, l_const, size_int (xll_bitpos)); > - if (l_const == NULL_TREE) > - return 0; > - if (! integer_zerop (const_binop (BIT_AND_EXPR, l_const, > - fold_build1_loc (loc, BIT_NOT_EXPR, > - lntype, ll_mask)))) > - { > - warning (0, "comparison is always %d", wanted_code == NE_EXPR); > - > - return constant_boolean_node (wanted_code == NE_EXPR, truth_type); > - } > - } > - if (r_const) > - { > - r_const = fold_convert_loc (loc, lntype, r_const); > - r_const = unextend (r_const, rl_bitsize, rl_unsignedp, rl_and_mask); > - r_const = const_binop (LSHIFT_EXPR, r_const, size_int (xrl_bitpos)); > - if (r_const == NULL_TREE) > - return 0; > - if (! integer_zerop (const_binop (BIT_AND_EXPR, r_const, > - fold_build1_loc (loc, BIT_NOT_EXPR, > - lntype, rl_mask)))) > - { > - warning (0, "comparison is always %d", wanted_code == NE_EXPR); > - > - return constant_boolean_node (wanted_code == NE_EXPR, truth_type); > - } > - } > - > - /* If the right sides are not constant, do the same for it. Also, > - disallow this optimization if a size, signedness or storage order > - mismatch occurs between the left and right sides. */ > - if (l_const == 0) > - { > - if (ll_bitsize != lr_bitsize || rl_bitsize != rr_bitsize > - || ll_unsignedp != lr_unsignedp || rl_unsignedp != rr_unsignedp > - || ll_reversep != lr_reversep > - /* Make sure the two fields on the right > - correspond to the left without being swapped. 
*/ > - || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos) > - return 0; > - > - first_bit = MIN (lr_bitpos, rr_bitpos); > - end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize); > - if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, > - TYPE_ALIGN (TREE_TYPE (lr_inner)), BITS_PER_WORD, > - volatilep, &rnmode)) > - return 0; > - > - rnbitsize = GET_MODE_BITSIZE (rnmode); > - rnbitpos = first_bit & ~ (rnbitsize - 1); > - rntype = lang_hooks.types.type_for_size (rnbitsize, 1); > - xlr_bitpos = lr_bitpos - rnbitpos, xrr_bitpos = rr_bitpos - rnbitpos; > - > - if (lr_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > - { > - xlr_bitpos = rnbitsize - xlr_bitpos - lr_bitsize; > - xrr_bitpos = rnbitsize - xrr_bitpos - rr_bitsize; > - } > - > - lr_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, > - rntype, lr_mask), > - size_int (xlr_bitpos)); > - rr_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, > - rntype, rr_mask), > - size_int (xrr_bitpos)); > - if (lr_mask == NULL_TREE || rr_mask == NULL_TREE) > - return 0; > - > - /* Make a mask that corresponds to both fields being compared. > - Do this for both items being compared. If the operands are the > - same size and the bits being compared are in the same position > - then we can do this by masking both and comparing the masked > - results. */ > - ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask); > - lr_mask = const_binop (BIT_IOR_EXPR, lr_mask, rr_mask); > - if (lnbitsize == rnbitsize > - && xll_bitpos == xlr_bitpos > - && lnbitpos >= 0 > - && rnbitpos >= 0) > - { > - lhs = make_bit_field_ref (loc, ll_inner, ll_arg, > - lntype, lnbitsize, lnbitpos, > - ll_unsignedp || rl_unsignedp, ll_reversep); > - if (! all_ones_mask_p (ll_mask, lnbitsize)) > - lhs = build2 (BIT_AND_EXPR, lntype, lhs, ll_mask); > - > - rhs = make_bit_field_ref (loc, lr_inner, lr_arg, > - rntype, rnbitsize, rnbitpos, > - lr_unsignedp || rr_unsignedp, lr_reversep); > - if (! all_ones_mask_p (lr_mask, rnbitsize)) > - rhs = build2 (BIT_AND_EXPR, rntype, rhs, lr_mask); > - > - return build2_loc (loc, wanted_code, truth_type, lhs, rhs); > - } > - > - /* There is still another way we can do something: If both pairs of > - fields being compared are adjacent, we may be able to make a wider > - field containing them both. > - > - Note that we still must mask the lhs/rhs expressions. Furthermore, > - the mask must be shifted to account for the shift done by > - make_bit_field_ref. */ > - if (((ll_bitsize + ll_bitpos == rl_bitpos > - && lr_bitsize + lr_bitpos == rr_bitpos) > - || (ll_bitpos == rl_bitpos + rl_bitsize > - && lr_bitpos == rr_bitpos + rr_bitsize)) > - && ll_bitpos >= 0 > - && rl_bitpos >= 0 > - && lr_bitpos >= 0 > - && rr_bitpos >= 0) > - { > - tree type; > - > - lhs = make_bit_field_ref (loc, ll_inner, ll_arg, lntype, > - ll_bitsize + rl_bitsize, > - MIN (ll_bitpos, rl_bitpos), > - ll_unsignedp, ll_reversep); > - rhs = make_bit_field_ref (loc, lr_inner, lr_arg, rntype, > - lr_bitsize + rr_bitsize, > - MIN (lr_bitpos, rr_bitpos), > - lr_unsignedp, lr_reversep); > - > - ll_mask = const_binop (RSHIFT_EXPR, ll_mask, > - size_int (MIN (xll_bitpos, xrl_bitpos))); > - lr_mask = const_binop (RSHIFT_EXPR, lr_mask, > - size_int (MIN (xlr_bitpos, xrr_bitpos))); > - if (ll_mask == NULL_TREE || lr_mask == NULL_TREE) > - return 0; > - > - /* Convert to the smaller type before masking out unwanted bits. 
*/ > - type = lntype; > - if (lntype != rntype) > - { > - if (lnbitsize > rnbitsize) > - { > - lhs = fold_convert_loc (loc, rntype, lhs); > - ll_mask = fold_convert_loc (loc, rntype, ll_mask); > - type = rntype; > - } > - else if (lnbitsize < rnbitsize) > - { > - rhs = fold_convert_loc (loc, lntype, rhs); > - lr_mask = fold_convert_loc (loc, lntype, lr_mask); > - type = lntype; > - } > - } > - > - if (! all_ones_mask_p (ll_mask, ll_bitsize + rl_bitsize)) > - lhs = build2 (BIT_AND_EXPR, type, lhs, ll_mask); > - > - if (! all_ones_mask_p (lr_mask, lr_bitsize + rr_bitsize)) > - rhs = build2 (BIT_AND_EXPR, type, rhs, lr_mask); > - > - return build2_loc (loc, wanted_code, truth_type, lhs, rhs); > - } > - > - return 0; > - } > - > - /* Handle the case of comparisons with constants. If there is something in > - common between the masks, those bits of the constants must be the same. > - If not, the condition is always false. Test for this to avoid generating > - incorrect code below. */ > - result = const_binop (BIT_AND_EXPR, ll_mask, rl_mask); > - if (! integer_zerop (result) > - && simple_cst_equal (const_binop (BIT_AND_EXPR, result, l_const), > - const_binop (BIT_AND_EXPR, result, r_const)) != 1) > - { > - if (wanted_code == NE_EXPR) > - { > - warning (0, "%<or%> of unmatched not-equal tests is always 1"); > - return constant_boolean_node (true, truth_type); > - } > - else > - { > - warning (0, "%<and%> of mutually exclusive equal-tests is always 0"); > - return constant_boolean_node (false, truth_type); > - } > - } > - > - if (lnbitpos < 0) > - return 0; > - > - /* Construct the expression we will return. First get the component > - reference we will make. Unless the mask is all ones the width of > - that field, perform the mask operation. Then compare with the > - merged constant. */ > - result = make_bit_field_ref (loc, ll_inner, ll_arg, > - lntype, lnbitsize, lnbitpos, > - ll_unsignedp || rl_unsignedp, ll_reversep); > - > - ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask); > - if (! all_ones_mask_p (ll_mask, lnbitsize)) > - result = build2_loc (loc, BIT_AND_EXPR, lntype, result, ll_mask); > - > - return build2_loc (loc, wanted_code, truth_type, result, > - const_binop (BIT_IOR_EXPR, l_const, r_const)); > + return 0; > } > > /* T is an integer expression that is being multiplied, divided, or taken a > diff --git a/gcc/fold-const.h b/gcc/fold-const.h > index 3e3998b57b042..77a5c916cbd88 100644 > --- a/gcc/fold-const.h > +++ b/gcc/fold-const.h > @@ -253,11 +253,21 @@ extern tree fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE > extern tree_code minmax_from_comparison (tree_code, tree, tree, > tree, tree); > > +extern tree make_bit_field_ref (location_t, tree, tree, tree, > + HOST_WIDE_INT, poly_int64, int, int); > + > /* In gimple-fold.cc. */ > extern void clear_type_padding_in_mask (tree, unsigned char *); > extern bool clear_padding_type_may_have_padding_p (tree); > extern bool arith_overflowed_p (enum tree_code, const_tree, const_tree, > const_tree); > +extern tree fold_truth_andor_for_ifcombine (enum tree_code, tree, > + location_t, enum tree_code, > + tree, tree, > + location_t, enum tree_code, > + tree, tree, > + tree *); > + > > /* Class used to compare gimple operands. 
*/ > > diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc > index 3c72cd6c79ae6..a31fc283d51b0 100644 > --- a/gcc/gimple-fold.cc > +++ b/gcc/gimple-fold.cc > @@ -7438,7 +7438,1155 @@ maybe_fold_comparisons_from_match_pd (tree type, enum tree_code code, > > return NULL_TREE; > } > + > +/* Return TRUE and set op[0] if T, following all SSA edges, is a type > + conversion. */ > > +static bool > +gimple_convert_def_p (tree t, tree op[1]) > +{ > + if (TREE_CODE (t) == SSA_NAME > + && !SSA_NAME_IS_DEFAULT_DEF (t)) > + if (gassign *def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (t))) > + switch (gimple_assign_rhs_code (def)) > + { > + CASE_CONVERT: > + op[0] = gimple_assign_rhs1 (def); > + return true; > + case VIEW_CONVERT_EXPR: > + op[0] = TREE_OPERAND (gimple_assign_rhs1 (def), 0); > + return true; > + default: > + break; > + } > + return false; > +} > + > +/* Return TRUE and set op[*] if T, following all SSA edges, resolves to a > + binary expression with code CODE. */ > + > +static bool > +gimple_binop_def_p (enum tree_code code, tree t, tree op[2]) > +{ > + if (TREE_CODE (t) == SSA_NAME > + && !SSA_NAME_IS_DEFAULT_DEF (t)) > + if (gimple *def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (t))) > + if (gimple_assign_rhs_code (def) == code) > + { > + op[0] = gimple_assign_rhs1 (def); > + op[1] = gimple_assign_rhs2 (def); > + return true; > + } > + return false; > +} > +/* Subroutine for fold_truth_andor_1: decode a field reference. > + > + If *PEXP is a comparison reference, we return the innermost reference. > + > + *PBITSIZE is set to the number of bits in the reference, *PBITPOS is > + set to the starting bit number. > + > + *PVOLATILEP is set to 1 if the any expression encountered is volatile; > + otherwise it is not changed. > + > + *PUNSIGNEDP is set to the signedness of the field. > + > + *PREVERSEP is set to the storage order of the field. > + > + *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. > + If PAND_MASK *is NULL, BIT_AND_EXPR is not recognized. > + > + *XOR_P is to be FALSE if EXP might be a XOR used in a compare, in which > + case, if XOR_CMP_OP is a zero constant, it will be overridden with *PEXP, > + *XOR_P will be set to TRUE, and the left-hand operand of the XOR will be > + decoded. If *XOR_P is TRUE, XOR_CMP_OP is supposed to be NULL, and then the > + right-hand operand of the XOR will be decoded. > + > + *LOAD is set to the load stmt of the innermost reference, if any, > + *and NULL otherwise. > + > + LOC[0..3] are filled in as conversion, masking, shifting and loading > + operations are located. > + > + Return 0 if this is not a component reference or is one that we can't > + do anything with. */ > + > +static tree > +decode_field_reference (tree *pexp, HOST_WIDE_INT *pbitsize, > + HOST_WIDE_INT *pbitpos, > + bool *punsignedp, bool *preversep, bool *pvolatilep, > + wide_int *pand_mask, bool *xor_p, tree *xor_cmp_op, > + gimple **load, location_t loc[4]) > +{ > + tree exp = *pexp; > + tree outer_type = 0; > + wide_int and_mask; > + tree inner, offset; > + int shiftrt = 0; > + tree res_ops[2]; > + machine_mode mode; > + > + *load = NULL; > + > + /* All the optimizations using this function assume integer fields. > + There are problems with FP fields since the type_for_size call > + below can fail for, e.g., XFmode. */ > + if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) > + return NULL_TREE; > + > + /* Drop casts, only save the outermost type. 
We need not worry about > + narrowing then widening casts, or vice-versa, for those that are not > + essential for the compare have already been optimized out at this > + point. */ > + if (gimple_convert_def_p (exp, res_ops)) > + { > + if (!outer_type) > + { > + outer_type = TREE_TYPE (exp); > + loc[0] = gimple_location (SSA_NAME_DEF_STMT (exp)); > + } > + exp = res_ops[0]; > + } > + > + /* Recognize and save a masking operation. */ > + if (pand_mask && gimple_binop_def_p (BIT_AND_EXPR, exp, res_ops) > + && uniform_integer_cst_p (res_ops[1])) > + { > + loc[1] = gimple_location (SSA_NAME_DEF_STMT (exp)); > + exp = res_ops[0]; > + and_mask = wi::to_wide (res_ops[1]); > + } > + > + /* Turn (a ^ b) [!]= 0 into a [!]= b. */ > + if (xor_p && gimple_binop_def_p (BIT_XOR_EXPR, exp, res_ops) > + && uniform_integer_cst_p (res_ops[1])) > + { > + /* No location recorded for this one, it's entirely subsumed by the > + compare. */ > + if (*xor_p) > + { > + exp = res_ops[1]; > + gcc_checking_assert (!xor_cmp_op); > + } > + else if (!xor_cmp_op) > + /* Not much we can do when xor appears in the right-hand compare > + operand. */ > + return NULL_TREE; > + else > + { > + *xor_p = true; > + exp = res_ops[0]; > + *xor_cmp_op = *pexp; > + } > + } > + > + /* Another chance to drop conversions. */ > + if (gimple_convert_def_p (exp, res_ops)) > + { > + if (!outer_type) > + { > + outer_type = TREE_TYPE (exp); > + loc[0] = gimple_location (SSA_NAME_DEF_STMT (exp)); > + } > + exp = res_ops[0]; > + } > + > + /* Take note of shifts. */ > + if (gimple_binop_def_p (RSHIFT_EXPR, exp, res_ops) > + && uniform_integer_cst_p (res_ops[1])) > + { > + loc[2] = gimple_location (SSA_NAME_DEF_STMT (exp)); > + exp = res_ops[0]; > + if (!tree_fits_shwi_p (res_ops[1])) > + return NULL_TREE; > + shiftrt = tree_to_shwi (res_ops[1]); > + if (shiftrt <= 0) > + return NULL_TREE; > + } > + > + /* Yet another chance to drop conversions. */ > + if (gimple_convert_def_p (exp, res_ops)) > + { > + if (!outer_type) > + { > + outer_type = TREE_TYPE (exp); > + loc[0] = gimple_location (SSA_NAME_DEF_STMT (exp)); > + } > + exp = res_ops[0]; > + } > + > + /* Identify the load, if there is one. */ > + if (TREE_CODE (exp) == SSA_NAME > + && !SSA_NAME_IS_DEFAULT_DEF (exp)) > + { > + gimple *def = SSA_NAME_DEF_STMT (exp); > + if (gimple_assign_load_p (def)) > + { > + loc[3] = gimple_location (def); > + *load = def; > + exp = gimple_assign_rhs1 (def); > + } > + } > + > + /* Identify the relevant bits. */ > + poly_int64 poly_bitsize, poly_bitpos; > + int unsignedp, reversep = *preversep, volatilep = *pvolatilep; > + inner = get_inner_reference (exp, &poly_bitsize, &poly_bitpos, &offset, > + &mode, &unsignedp, &reversep, &volatilep); > + > + HOST_WIDE_INT bs, bp; > + if (!poly_bitsize.is_constant (&bs) > + || !poly_bitpos.is_constant (&bp) > + || bs <= shiftrt > + || offset != 0 > + || TREE_CODE (inner) == PLACEHOLDER_EXPR > + /* Reject out-of-bound accesses (PR79731). */ > + || (! AGGREGATE_TYPE_P (TREE_TYPE (inner)) > + && compare_tree_int (TYPE_SIZE (TREE_TYPE (inner)), > + bp + bs) < 0)) > + return NULL_TREE; > + > + *pbitsize = bs; > + *pbitpos = bp; > + *punsignedp = unsignedp; > + *preversep = reversep; > + *pvolatilep = volatilep; > + > + /* Adjust shifts... */ > + if (shiftrt) > + { > + if (!*preversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > + *pbitpos += shiftrt; > + *pbitsize -= shiftrt; > + } > + > + /* ... and bit position. 
*/ > + if (outer_type && *pbitsize > TYPE_PRECISION (outer_type)) > + { > + HOST_WIDE_INT excess = *pbitsize - TYPE_PRECISION (outer_type); > + if (*preversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > + *pbitpos += excess; > + *pbitsize -= excess; > + } > + > + *pexp = exp; > + > + /* If the number of bits in the reference is the same as the bitsize of > + the outer type, then the outer type gives the signedness. Otherwise > + (in case of a small bitfield) the signedness is unchanged. */ > + if (outer_type && *pbitsize == TYPE_PRECISION (outer_type)) > + *punsignedp = TYPE_UNSIGNED (outer_type); > + > + /* Make the mask the expected width. */ > + if (and_mask.get_precision () != 0) > + and_mask = wide_int::from (and_mask, *pbitsize, UNSIGNED); > + > + if (pand_mask) > + *pand_mask = and_mask; > + > + return inner; > +} > + > +/* Return the one bitpos within bit extents L or R that is at an > + ALIGN-bit alignment boundary, or -1 if there is more than one such > + boundary, if there isn't any, or if there is any such boundary > + between the extents. L and R are given by bitpos and bitsize. If > + it doesn't return -1, there are two consecutive ALIGN-bit words > + that contain both extents, and at least one of the extents > + straddles across the returned alignment boundary. */ > + > +static inline HOST_WIDE_INT > +compute_split_boundary_from_align (HOST_WIDE_INT align, > + HOST_WIDE_INT l_bitpos, > + HOST_WIDE_INT l_bitsize, > + HOST_WIDE_INT r_bitpos, > + HOST_WIDE_INT r_bitsize) > +{ > + HOST_WIDE_INT amask = ~(align - 1); > + > + HOST_WIDE_INT first_bit = MIN (l_bitpos, r_bitpos); > + HOST_WIDE_INT end_bit = MAX (l_bitpos + l_bitsize, r_bitpos + r_bitsize); > + > + HOST_WIDE_INT boundary = (end_bit - 1) & amask; > + > + /* Make sure we're crossing no more than one alignment boundary. > + > + ??? We don't have logic to recombine loads of two adjacent > + fields that each crosses a different alignment boundary, so > + as to load the middle word only once, if other words can't be > + otherwise recombined. */ > + if (boundary - first_bit > align) > + return -1; > + > + HOST_WIDE_INT l_start_word = l_bitpos & amask; > + HOST_WIDE_INT l_end_word = (l_bitpos + l_bitsize - 1) & amask; > + > + HOST_WIDE_INT r_start_word = r_bitpos & amask; > + HOST_WIDE_INT r_end_word = (r_bitpos + r_bitsize - 1) & amask; > + > + /* If neither field straddles across an alignment boundary, it's no > + use to even try to merge them. */ > + if (l_start_word == l_end_word && r_start_word == r_end_word) > + return -1; > + > + return boundary; > +} > + > +/* Make a bit_field_ref. If POINT is NULL, return the BIT_FIELD_REF. > + Otherwise, build and insert a load stmt before POINT, and return > + the SSA_NAME. ??? Rewrite LOAD in terms of the bitfield? 
*/ > + > +static tree > +make_bit_field_load (location_t loc, tree inner, tree orig_inner, tree type, > + HOST_WIDE_INT bitsize, poly_int64 bitpos, > + bool unsignedp, bool reversep, gimple *point) > +{ > + if (point && loc == UNKNOWN_LOCATION) > + loc = gimple_location (point); > + > + tree ref = make_bit_field_ref (loc, unshare_expr (inner), > + unshare_expr (orig_inner), > + type, bitsize, bitpos, > + unsignedp, reversep); > + if (!point) > + return ref; > + > + gimple_seq stmts = NULL; > + tree ret = force_gimple_operand (ref, &stmts, true, NULL_TREE); > + > + /* We know the vuse is supposed to end up being the same as that at the > + original load at the insertion point, but if we don't set it, it will be a > + generic placeholder that only the global SSA update at the end of the pass > + would make equal, too late for us to use in further combinations. So go > + ahead and copy the vuse. */ > + > + tree reaching_vuse = gimple_vuse (point); > + for (gimple_stmt_iterator i = gsi_start (stmts); > + !gsi_end_p (i); gsi_next (&i)) > + { > + gimple *new_stmt = gsi_stmt (i); > + if (gimple_has_mem_ops (new_stmt)) > + gimple_set_vuse (new_stmt, reaching_vuse); > + } > + > + gimple_stmt_iterator gsi = gsi_for_stmt (point); > + gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT); > + return ret; > +} > + > +/* Initialize ln_arg[0] and ln_arg[1] to a pair of newly-created (at > + LOC) loads from INNER (from ORIG_INNER), of modes MODE and MODE2, > + respectively, starting at BIT_POS, using reversed endianness if > + REVERSEP. Also initialize BITPOS (the starting position of each > + part into INNER), BITSIZ (the bit count starting at BITPOS), > + TOSHIFT[1] (the amount by which the part and its mask are to be > + shifted right to bring its least-significant bit to bit zero) and > + SHIFTED (the amount by which the part, by separate loading, has > + already been shifted right, but that the mask needs shifting to > + match). */ > + > +static inline void > +build_split_load (tree /* out */ ln_arg[2], > + HOST_WIDE_INT /* out */ bitpos[2], > + HOST_WIDE_INT /* out */ bitsiz[2], > + HOST_WIDE_INT /* in[0] out[0..1] */ toshift[2], > + HOST_WIDE_INT /* out */ shifted[2], > + location_t loc, tree inner, tree orig_inner, > + scalar_int_mode mode, scalar_int_mode mode2, > + HOST_WIDE_INT bit_pos, bool reversep, > + gimple *point[2]) > +{ > + scalar_int_mode modes[2] = { mode, mode2 }; > + bitsiz[0] = GET_MODE_BITSIZE (mode); > + bitsiz[1] = GET_MODE_BITSIZE (mode2); > + > + for (int i = 0; i < 2; i++) > + { > + tree type = lang_hooks.types.type_for_mode (modes[i], 1); > + if (!type) > + { > + type = build_nonstandard_integer_type (bitsiz[0], 1); > + gcc_assert (type); > + } > + bitpos[i] = bit_pos; > + ln_arg[i] = make_bit_field_load (loc, inner, orig_inner, > + type, bitsiz[i], > + bit_pos, 1, reversep, point[i]); > + bit_pos += bitsiz[i]; > + } > + > + toshift[1] = toshift[0]; > + if (reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > + { > + shifted[0] = bitsiz[1]; > + shifted[1] = 0; > + toshift[0] = 0; > + } > + else > + { > + shifted[1] = bitsiz[0]; > + shifted[0] = 0; > + toshift[1] = 0; > + } > +} > + > +/* Make arrangements to split at bit BOUNDARY a single loaded word > + (with REVERSEP bit order) LN_ARG[0], to be shifted right by > + TOSHIFT[0] to bring the field of interest to the least-significant > + bit. The expectation is that the same loaded word will be > + propagated from part 0 to part 1, with just different shifting and > + masking to extract both parts. 
MASK is not expected to do more > + than masking out the bits that belong to the other part. See > + build_split_load for more information on the other fields. */ > + > +static inline void > +reuse_split_load (tree /* in[0] out[1] */ ln_arg[2], > + HOST_WIDE_INT /* in[0] out[1] */ bitpos[2], > + HOST_WIDE_INT /* in[0] out[1] */ bitsiz[2], > + HOST_WIDE_INT /* in[0] out[0..1] */ toshift[2], > + HOST_WIDE_INT /* out */ shifted[2], > + wide_int /* out */ mask[2], > + HOST_WIDE_INT boundary, bool reversep) > +{ > + unsigned prec = TYPE_PRECISION (TREE_TYPE (ln_arg[0])); > + > + ln_arg[1] = ln_arg[0]; > + bitpos[1] = bitpos[0]; > + bitsiz[1] = bitsiz[0]; > + shifted[1] = shifted[0] = 0; > + > + if (reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > + { > + toshift[1] = toshift[0]; > + toshift[0] = bitpos[0] + bitsiz[0] - boundary; > + mask[0] = wi::mask (toshift[0], true, prec); > + mask[1] = wi::mask (toshift[0], false, prec); > + } > + else > + { > + toshift[1] = boundary - bitpos[1]; > + mask[1] = wi::mask (toshift[1], true, prec); > + mask[0] = wi::mask (toshift[1], false, prec); > + } > +} > + > +/* Find ways of folding logical expressions of LHS and RHS: > + > + Try to merge two comparisons to nearby fields. > + > + For example, if we have p->a == 2 && p->b == 4 and we can load both A and B > + at once, we can do this with a comparison against the object ANDed with the > + a mask. > + > + If we have p->a == q->a && p->b == q->b, we may be able to use bit masking > + operations to do this with one comparison, loading both fields from P at > + once, and likewise from Q. > + > + Herein, loading at once means loading from within the same alignment > + boundary for the enclosing object. If (packed) fields cross such alignment > + boundaries, we may still recombine the compares, so that loads do not cross > + the boundaries. > + > + CODE is the logical operation being done. It can be TRUTH_ANDIF_EXPR, > + TRUTH_AND_EXPR, TRUTH_ORIF_EXPR, or TRUTH_OR_EXPR. > + > + TRUTH_TYPE is the type of the logical operand. > + > + LHS is denoted as LL_ARG LCODE LR_ARG. > + > + RHS is denoted as RL_ARG RCODE RR_ARG. > + > + LHS is assumed to dominate RHS. > + > + Combined loads are inserted next to preexisting loads, once we determine > + that the combination is viable, and the combined condition references new > + SSA_NAMEs that hold the loaded values. Since the original loads are > + verified to have the same gimple_vuse, the insertion point doesn't matter > + for correctness. ??? The loads may be a lot earlier than the compares, and > + it's conceivable that one or two loads for RHS appear before those for LHS. > + It could be advantageous to try to place the loads optimally, taking > + advantage of knowing whether RHS is accessed before LHS, or that both are > + accessed before both compares, but we don't do that (yet?). > + > + SEPARATEP should be NULL if the combined condition must be returned as a > + single expression, even if it is a compound condition. This must only be > + done if LHS and RHS are adjacent, without intervening conditions, and the > + combined condition is to replace RHS, while LHS is dropped altogether. > + > + Otherwise, SEPARATEP must be a non-NULL pointer to a NULL_TREE, that may be > + replaced by a part of the compound condition that could replace RHS, while > + the returned expression replaces LHS. This works whether or not LHS and RHS > + are adjacent, as long as there aren't VDEFs or other side effects between > + them. 
> + > + If the "words" accessed by RHS are already accessed by LHS, this won't > + matter, but if RHS accesses "words" that LHS doesn't, then *SEPARATEP will > + be set to the compares that should take RHS's place. By "words" we mean > + contiguous bits that do not cross a an TYPE_ALIGN boundary of the accessed > + object's type. > + > + We return the simplified tree or 0 if no optimization is possible. */ > + > +tree > +fold_truth_andor_for_ifcombine (enum tree_code code, tree truth_type, > + location_t lloc, enum tree_code lcode, > + tree ll_arg, tree lr_arg, > + location_t rloc, enum tree_code rcode, > + tree rl_arg, tree rr_arg, > + tree *separatep) > +{ > + /* If this is the "or" of two comparisons, we can do something if > + the comparisons are NE_EXPR. If this is the "and", we can do something > + if the comparisons are EQ_EXPR. I.e., > + (a->b == 2 && a->c == 4) can become (a->new == NEW). > + > + WANTED_CODE is this operation code. For single bit fields, we can > + convert EQ_EXPR to NE_EXPR so we need not reject the "wrong" > + comparison for one-bit fields. */ > + > + enum tree_code orig_code = code; > + enum tree_code wanted_code; > + tree ll_inner, lr_inner, rl_inner, rr_inner; > + gimple *ll_load, *lr_load, *rl_load, *rr_load; > + HOST_WIDE_INT ll_bitsize, ll_bitpos, lr_bitsize, lr_bitpos; > + HOST_WIDE_INT rl_bitsize, rl_bitpos, rr_bitsize, rr_bitpos; > + HOST_WIDE_INT xll_bitpos, xlr_bitpos, xrl_bitpos, xrr_bitpos; > + HOST_WIDE_INT lnbitsize, lnbitpos, lnprec; > + HOST_WIDE_INT rnbitsize, rnbitpos, rnprec; > + bool ll_unsignedp, lr_unsignedp, rl_unsignedp, rr_unsignedp; > + bool ll_reversep, lr_reversep, rl_reversep, rr_reversep; > + scalar_int_mode lnmode, lnmode2, rnmode; > + wide_int ll_and_mask, lr_and_mask, rl_and_mask, rr_and_mask; > + wide_int l_const, r_const; > + tree lntype, rntype, result; > + HOST_WIDE_INT first_bit, end_bit; > + bool volatilep; > + bool l_split_load; > + > + /* These are indexed by: conv, mask, shft, load. */ > + location_t ll_loc[4] = { lloc, lloc, lloc, UNKNOWN_LOCATION }; > + location_t lr_loc[4] = { lloc, lloc, lloc, UNKNOWN_LOCATION }; > + location_t rl_loc[4] = { rloc, rloc, rloc, UNKNOWN_LOCATION }; > + location_t rr_loc[4] = { rloc, rloc, rloc, UNKNOWN_LOCATION }; > + > + gcc_checking_assert (!separatep || !*separatep); > + > + /* Start by getting the comparison codes. Fail if anything is volatile. > + If one operand is a BIT_AND_EXPR with the constant one, treat it as if > + it were surrounded with a NE_EXPR. */ > + > + if (TREE_CODE_CLASS (lcode) != tcc_comparison > + || TREE_CODE_CLASS (rcode) != tcc_comparison) > + return 0; > + > + /* We don't normally find TRUTH_*IF_EXPR in gimple, but these codes may be > + given by our caller to denote conditions from different blocks. */ > + switch (code) > + { > + case TRUTH_AND_EXPR: > + case TRUTH_ANDIF_EXPR: > + code = TRUTH_AND_EXPR; > + break; > + > + case TRUTH_OR_EXPR: > + case TRUTH_ORIF_EXPR: > + code = TRUTH_OR_EXPR; > + break; > + > + default: > + return 0; > + } > + > + bool lsignbit = false, rsignbit = false; > + if ((lcode == LT_EXPR || lcode == GE_EXPR) > + && integer_zerop (lr_arg) > + && INTEGRAL_TYPE_P (TREE_TYPE (ll_arg)) > + && !TYPE_UNSIGNED (TREE_TYPE (ll_arg))) > + { > + lsignbit = true; > + lcode = (lcode == LT_EXPR ? NE_EXPR : EQ_EXPR); > + } > + if ((rcode == LT_EXPR || rcode == GE_EXPR) > + && integer_zerop (rr_arg) > + && INTEGRAL_TYPE_P (TREE_TYPE (rl_arg)) > + && !TYPE_UNSIGNED (TREE_TYPE (rl_arg))) > + { > + rsignbit = true; > + rcode = (rcode == LT_EXPR ? 
NE_EXPR : EQ_EXPR); > + } > + > + /* See if the comparisons can be merged. Then get all the parameters for > + each side. */ > + > + if ((lcode != EQ_EXPR && lcode != NE_EXPR) > + || (rcode != EQ_EXPR && rcode != NE_EXPR)) > + return 0; > + > + ll_reversep = lr_reversep = rl_reversep = rr_reversep = 0; > + volatilep = 0; > + bool l_xor = false, r_xor = false; > + ll_inner = decode_field_reference (&ll_arg, &ll_bitsize, &ll_bitpos, > + &ll_unsignedp, &ll_reversep, &volatilep, > + &ll_and_mask, &l_xor, &lr_arg, > + &ll_load, ll_loc); > + lr_inner = decode_field_reference (&lr_arg, &lr_bitsize, &lr_bitpos, > + &lr_unsignedp, &lr_reversep, &volatilep, > + &lr_and_mask, &l_xor, 0, > + &lr_load, lr_loc); > + rl_inner = decode_field_reference (&rl_arg, &rl_bitsize, &rl_bitpos, > + &rl_unsignedp, &rl_reversep, &volatilep, > + &rl_and_mask, &r_xor, &rr_arg, > + &rl_load, rl_loc); > + rr_inner = decode_field_reference (&rr_arg, &rr_bitsize, &rr_bitpos, > + &rr_unsignedp, &rr_reversep, &volatilep, > + &rr_and_mask, &r_xor, 0, > + &rr_load, rr_loc); > + > + /* It must be true that the inner operation on the lhs of each > + comparison must be the same if we are to be able to do anything. > + Then see if we have constants. If not, the same must be true for > + the rhs's. */ > + if (volatilep > + || ll_reversep != rl_reversep > + || ll_inner == 0 || rl_inner == 0 > + || ! operand_equal_p (ll_inner, rl_inner, 0) > + || (ll_load && rl_load > + && gimple_vuse (ll_load) != gimple_vuse (rl_load))) > + return 0; > + > + if (TREE_CODE (lr_arg) == INTEGER_CST > + && TREE_CODE (rr_arg) == INTEGER_CST) > + { > + l_const = wi::to_wide (lr_arg); > + /* We don't expect masks on constants, but if there are any, apply > + them now. */ > + if (lr_and_mask.get_precision ()) > + l_const &= wide_int::from (lr_and_mask, > + l_const.get_precision (), UNSIGNED); > + r_const = wi::to_wide (rr_arg); > + if (rr_and_mask.get_precision ()) > + r_const &= wide_int::from (rr_and_mask, > + r_const.get_precision (), UNSIGNED); > + lr_reversep = ll_reversep; > + } > + else if (lr_reversep != rr_reversep > + || lr_inner == 0 || rr_inner == 0 > + || ! operand_equal_p (lr_inner, rr_inner, 0) > + || (lr_load && rr_load > + && gimple_vuse (lr_load) != gimple_vuse (rr_load))) > + return 0; > + > + /* If we found sign tests, finish turning them into bit tests. */ > + > + if (lsignbit) > + { > + wide_int sign = wi::mask (ll_bitsize - 1, true, ll_bitsize); > + if (!ll_and_mask.get_precision ()) > + ll_and_mask = sign; > + else > + ll_and_mask &= sign; > + } > + > + if (rsignbit) > + { > + wide_int sign = wi::mask (rl_bitsize - 1, true, rl_bitsize); > + if (!rl_and_mask.get_precision ()) > + rl_and_mask = sign; > + else > + rl_and_mask &= sign; > + } > + > + /* If either comparison code is not correct for our logical operation, > + fail. However, we can convert a one-bit comparison against zero into > + the opposite comparison against that bit being set in the field. */ > + > + wanted_code = (code == TRUTH_AND_EXPR ? EQ_EXPR : NE_EXPR); > + if (lcode != wanted_code) > + { > + if (l_const.get_precision () > + && l_const == 0 > + && ll_and_mask.get_precision () > + && wi::popcount (ll_and_mask) == 1) > + { > + /* Make the left operand unsigned, since we are only interested > + in the value of one bit. Otherwise we are doing the wrong > + thing below. */ > + ll_unsignedp = 1; > + l_const = ll_and_mask; > + } > + else > + return 0; > + } > + > + /* This is analogous to the code for l_const above. 
*/ > + if (rcode != wanted_code) > + { > + if (r_const.get_precision () > + && r_const == 0 > + && rl_and_mask.get_precision () > + && wi::popcount (rl_and_mask) == 1) > + { > + rl_unsignedp = 1; > + r_const = rl_and_mask; > + } > + else > + return 0; > + } > + > + /* This will be bumped to 2 if any of the field pairs crosses an > + alignment boundary, so the merged compare has to be done in two > + parts. */ > + int parts = 1; > + /* Set to true if the second combined compare should come first, > + e.g., because the second original compare accesses a word that > + the first one doesn't, and the combined compares access those in > + cmp[0]. */ > + bool first1 = false; > + /* Set to true if the first original compare is not the one being > + split. */ > + bool maybe_separate = false; > + > + /* The following 2-dimensional arrays use the first index to > + identify left(0)- vs right(1)-hand compare operands, and the > + second one to identify merged compare parts. */ > + /* The memory loads or constants to be compared. */ > + tree ld_arg[2][2]; > + /* The first bit of the corresponding inner object that the > + corresponding LD_ARG covers. */ > + HOST_WIDE_INT bitpos[2][2]; > + /* The bit count starting at BITPOS that the corresponding LD_ARG > + covers. */ > + HOST_WIDE_INT bitsiz[2][2]; > + /* The number of bits by which LD_ARG has already been shifted > + right, WRT mask. */ > + HOST_WIDE_INT shifted[2][2]; > + /* The number of bits by which both LD_ARG and MASK need shifting to > + bring its least-significant bit to bit zero. */ > + HOST_WIDE_INT toshift[2][2]; > + /* An additional mask to be applied to LD_ARG, to remove any bits > + that may have been loaded for use in another compare, but that > + don't belong in the corresponding compare. */ > + wide_int xmask[2][2] = {}; > + > + /* The combined compare or compares. */ > + tree cmp[2]; > + > + /* Consider we're comparing two non-contiguous fields of packed > + structs, both aligned at 32-bit boundaries: > + > + ll_arg: an 8-bit field at offset 0 > + lr_arg: a 16-bit field at offset 2 > + > + rl_arg: an 8-bit field at offset 1 > + rr_arg: a 16-bit field at offset 3 > + > + We'll have r_split_load, because rr_arg straddles across an > + alignment boundary. > + > + We'll want to have: > + > + bitpos = { { 0, 0 }, { 0, 32 } } > + bitsiz = { { 32, 32 }, { 32, 8 } } > + > + And, for little-endian: > + > + shifted = { { 0, 0 }, { 0, 32 } } > + toshift = { { 0, 24 }, { 0, 0 } } > + > + Or, for big-endian: > + > + shifted = { { 0, 0 }, { 8, 0 } } > + toshift = { { 8, 0 }, { 0, 0 } } > + */ > + > + /* See if we can find a mode that contains both fields being compared on > + the left. If we can't, fail. Otherwise, update all constants and masks > + to be relative to a field of that size. */ > + first_bit = MIN (ll_bitpos, rl_bitpos); > + end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); > + if (get_best_mode (end_bit - first_bit, first_bit, 0, 0, > + TYPE_ALIGN (TREE_TYPE (ll_inner)), BITS_PER_WORD, > + volatilep, &lnmode)) > + l_split_load = false; > + else > + { > + /* Consider the possibility of recombining loads if any of the > + fields straddles across an alignment boundary, so that either > + part can be loaded along with the other field. 
*/ > + HOST_WIDE_INT align = TYPE_ALIGN (TREE_TYPE (ll_inner)); > + HOST_WIDE_INT boundary = compute_split_boundary_from_align > + (align, ll_bitpos, ll_bitsize, rl_bitpos, rl_bitsize); > + > + if (boundary < 0 > + || !get_best_mode (boundary - first_bit, first_bit, 0, 0, > + align, BITS_PER_WORD, volatilep, &lnmode) > + || !get_best_mode (end_bit - boundary, boundary, 0, 0, > + align, BITS_PER_WORD, volatilep, &lnmode2)) > + return 0; > + > + /* If we can't have a single load, but can with two, figure out whether > + the two compares can be separated, i.e., whether the entirety of the > + first original compare is encompassed by the entirety of the first > + combined compare. If the first original compare is past the alignment > + boundary, arrange to compare that range first, by setting first1 > + (meaning make cmp[1] first, instead of cmp[0]). */ > + l_split_load = true; > + parts = 2; > + if (ll_bitpos >= boundary) > + maybe_separate = first1 = true; > + else if (ll_bitpos + ll_bitsize <= boundary) > + maybe_separate = true; > + } > + > + lnbitsize = GET_MODE_BITSIZE (lnmode); > + lnbitpos = first_bit & ~ (lnbitsize - 1); > + /* Avoid situations that the code below can't handle. */ > + if (lnbitpos < 0) > + return 0; > + > + /* Choose the type for the combined compare. Even if we're splitting loads, > + make it wide enough to hold both. */ > + if (l_split_load) > + lnbitsize += GET_MODE_BITSIZE (lnmode2); > + lntype = build_nonstandard_integer_type (lnbitsize, 1); > + if (!lntype) > + return NULL_TREE; > + lnprec = TYPE_PRECISION (lntype); > + xll_bitpos = ll_bitpos - lnbitpos, xrl_bitpos = rl_bitpos - lnbitpos; > + > + /* Adjust bit ranges for reverse endianness. */ > + if (ll_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > + { > + xll_bitpos = lnbitsize - xll_bitpos - ll_bitsize; > + xrl_bitpos = lnbitsize - xrl_bitpos - rl_bitsize; > + } > + > + /* Adjust masks to match the positions in the combined lntype. */ > + wide_int ll_mask, rl_mask, r_mask; > + if (ll_and_mask.get_precision ()) > + ll_mask = wi::lshift (wide_int::from (ll_and_mask, lnprec, UNSIGNED), > + xll_bitpos); > + else > + ll_mask = wi::shifted_mask (xll_bitpos, ll_bitsize, false, lnprec); > + if (rl_and_mask.get_precision ()) > + rl_mask = wi::lshift (wide_int::from (rl_and_mask, lnprec, UNSIGNED), > + xrl_bitpos); > + else > + rl_mask = wi::shifted_mask (xrl_bitpos, rl_bitsize, false, lnprec); > + > + /* Adjust right-hand constants in both original comparisons to match width > + and bit position. */ > + if (l_const.get_precision ()) > + { > + /* We're doing bitwise equality tests, so don't bother with sign > + extensions. */ > + l_const = wide_int::from (l_const, lnprec, UNSIGNED); > + if (ll_and_mask.get_precision ()) > + l_const &= wide_int::from (ll_and_mask, lnprec, UNSIGNED); > + l_const <<= xll_bitpos; > + if ((l_const & ~ll_mask) != 0) > + { > + warning_at (lloc, OPT_Wtautological_compare, > + "comparison is always %d", wanted_code == NE_EXPR); > + > + return constant_boolean_node (wanted_code == NE_EXPR, truth_type); > + } > + > + /* When we set l_const, we also set r_const, so we need not test it > + again. 
*/ > + gcc_checking_assert (r_const.get_precision ()); > + > + r_const = wide_int::from (r_const, lnprec, UNSIGNED); > + if (rl_and_mask.get_precision ()) > + r_const &= wide_int::from (rl_and_mask, lnprec, UNSIGNED); > + r_const <<= xrl_bitpos; > + if ((r_const & ~rl_mask) != 0) > + { > + warning_at (rloc, OPT_Wtautological_compare, > + "comparison is always %d", wanted_code == NE_EXPR); > + > + return constant_boolean_node (wanted_code == NE_EXPR, truth_type); > + } > + > + /* If there is something in common between the masks, those bits of the > + constants must be the same. If not, the combined condition cannot be > + met, and the result is known. Test for this to avoid generating > + incorrect code below. */ > + wide_int mask = ll_mask & rl_mask; > + if (mask != 0 > + && (l_const & mask) != (r_const & mask)) > + { > + if (wanted_code == NE_EXPR) > + return constant_boolean_node (true, truth_type); > + else > + return constant_boolean_node (false, truth_type); > + } > + > + /* The constants are combined so as to line up with the loaded field, so > + tentatively use the same parameters for the second combined > + compare. */ > + ld_arg[1][0] = wide_int_to_tree (lntype, l_const | r_const); > + toshift[1][0] = MIN (xll_bitpos, xrl_bitpos); > + shifted[1][0] = 0; > + bitpos[1][0] = lnbitpos; > + bitsiz[1][0] = lnbitsize; > + > + if (parts > 1) > + reuse_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1], > + shifted[1], xmask[1], > + lnbitpos + GET_MODE_BITSIZE (lnmode), > + lr_reversep); > + > + /* No masking needed, we know the full constants. */ > + r_mask = wi::mask (0, true, lnprec); > + > + /* If the compiler thinks this is used uninitialized below, it's > + because it can't realize that parts can only be 2 when > + comparing with constants if l_split_load is also true. This > + just silences the warning. */ > + rnbitpos = 0; > + } > + > + /* Likewise, if the right sides are not constant, align them for the combined > + compare. Also, disallow this optimization if a size, signedness or > + storage order mismatch occurs between the left and right sides. */ > + else > + { > + if (ll_bitsize != lr_bitsize || rl_bitsize != rr_bitsize > + || ll_unsignedp != lr_unsignedp || rl_unsignedp != rr_unsignedp > + || ll_reversep != lr_reversep > + /* Make sure the two fields on the right > + correspond to the left without being swapped. */ > + || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos) > + return 0; > + > + bool r_split_load; > + scalar_int_mode rnmode2; > + > + /* Figure out how to load the bits for the right-hand size of the > + combined compare. As in the left-hand size, we may have to split it, > + and then we use two separate compares. */ > + first_bit = MIN (lr_bitpos, rr_bitpos); > + end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize); > + if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, > + TYPE_ALIGN (TREE_TYPE (lr_inner)), BITS_PER_WORD, > + volatilep, &rnmode)) > + { > + /* Consider the possibility of recombining loads if any of the > + fields straddles across an alignment boundary, so that either > + part can be loaded along with the other field. */ > + HOST_WIDE_INT align = TYPE_ALIGN (TREE_TYPE (lr_inner)); > + HOST_WIDE_INT boundary = compute_split_boundary_from_align > + (align, lr_bitpos, lr_bitsize, rr_bitpos, rr_bitsize); > + > + if (boundary < 0 > + /* If we're to split both, make sure the split point is > + the same. 
*/ > + || (l_split_load > + && (boundary - lr_bitpos > + != (lnbitpos + GET_MODE_BITSIZE (lnmode)) - ll_bitpos)) > + || !get_best_mode (boundary - first_bit, first_bit, 0, 0, > + align, BITS_PER_WORD, volatilep, &rnmode) > + || !get_best_mode (end_bit - boundary, boundary, 0, 0, > + align, BITS_PER_WORD, volatilep, &rnmode2)) > + return 0; > + > + r_split_load = true; > + parts = 2; > + if (lr_bitpos >= boundary) > + maybe_separate = first1 = true; > + else if (lr_bitpos + lr_bitsize <= boundary) > + maybe_separate = true; > + } > + else > + r_split_load = false; > + > + /* Find a type that can hold the entire right-hand operand. */ > + rnbitsize = GET_MODE_BITSIZE (rnmode); > + rnbitpos = first_bit & ~ (rnbitsize - 1); > + if (r_split_load) > + rnbitsize += GET_MODE_BITSIZE (rnmode2); > + rntype = build_nonstandard_integer_type (rnbitsize, 1); > + if (!rntype) > + return 0; > + rnprec = TYPE_PRECISION (rntype); > + xlr_bitpos = lr_bitpos - rnbitpos, xrr_bitpos = rr_bitpos - rnbitpos; > + > + /* Adjust for reversed endianness. */ > + if (lr_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) > + { > + xlr_bitpos = rnbitsize - xlr_bitpos - lr_bitsize; > + xrr_bitpos = rnbitsize - xrr_bitpos - rr_bitsize; > + } > + > + /* Adjust the masks to match the combined type, and combine them. */ > + wide_int lr_mask, rr_mask; > + if (lr_and_mask.get_precision ()) > + lr_mask = wi::lshift (wide_int::from (lr_and_mask, rnprec, UNSIGNED), > + xlr_bitpos); > + else > + lr_mask = wi::shifted_mask (xlr_bitpos, lr_bitsize, false, rnprec); > + if (rr_and_mask.get_precision ()) > + rr_mask = wi::lshift (wide_int::from (rr_and_mask, rnprec, UNSIGNED), > + xrr_bitpos); > + else > + rr_mask = wi::shifted_mask (xrr_bitpos, rr_bitsize, false, rnprec); > + r_mask = lr_mask | rr_mask; > + > + /* Load the right-hand operand of the combined compare. */ > + toshift[1][0] = MIN (xlr_bitpos, xrr_bitpos); > + shifted[1][0] = 0; > + > + if (!r_split_load) > + { > + bitpos[1][0] = rnbitpos; > + bitsiz[1][0] = rnbitsize; > + ld_arg[1][0] = make_bit_field_load (ll_loc[3], lr_inner, lr_arg, > + rntype, rnbitsize, rnbitpos, > + lr_unsignedp || rr_unsignedp, > + lr_reversep, lr_load); > + } > + > + /* ... and the second part of the right-hand operand if needed. */ > + if (parts > 1) > + { > + if (r_split_load) > + { > + gimple *point[2]; > + point[0] = lr_load; > + point[1] = rr_load; > + build_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1], > + shifted[1], rl_loc[3], lr_inner, lr_arg, > + rnmode, rnmode2, rnbitpos, lr_reversep, point); > + } > + else > + reuse_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1], > + shifted[1], xmask[1], > + lnbitpos + GET_MODE_BITSIZE (lnmode) > + - ll_bitpos + lr_bitpos, lr_reversep); > + } > + } > + > + /* Now issue the loads for the left-hand combined operand/s.
*/ > + wide_int l_mask = ll_mask | rl_mask; > + toshift[0][0] = MIN (xll_bitpos, xrl_bitpos); > + shifted[0][0] = 0; > + > + if (!l_split_load) > + { > + bitpos[0][0] = lnbitpos; > + bitsiz[0][0] = lnbitsize; > + ld_arg[0][0] = make_bit_field_load (ll_loc[3], ll_inner, ll_arg, > + lntype, lnbitsize, lnbitpos, > + ll_unsignedp || rl_unsignedp, > + ll_reversep, ll_load); > + } > + > + if (parts > 1) > + { > + if (l_split_load) > + { > + gimple *point[2]; > + point[0] = ll_load; > + point[1] = rl_load; > + build_split_load (ld_arg[0], bitpos[0], bitsiz[0], toshift[0], > + shifted[0], rl_loc[3], ll_inner, ll_arg, > + lnmode, lnmode2, lnbitpos, ll_reversep, point); > + } > + else > + reuse_split_load (ld_arg[0], bitpos[0], bitsiz[0], toshift[0], > + shifted[0], xmask[0], > + rnbitpos + GET_MODE_BITSIZE (rnmode) > + - lr_bitpos + ll_bitpos, ll_reversep); > + } > + > + /* Compute the compares. */ > + for (int i = 0; i < parts; i++) > + { > + tree op[2] = { ld_arg[0][i], ld_arg[1][i] }; > + wide_int mask[2] = { l_mask, r_mask }; > + location_t *locs[2] = { i ? rl_loc : ll_loc, i ? rr_loc : lr_loc }; > + > + /* Figure out the masks, and unshare the original operands. */ > + for (int j = 0; j < 2; j++) > + { > + unsigned prec = TYPE_PRECISION (TREE_TYPE (op[j])); > + op[j] = unshare_expr (op[j]); > + > + /* Mask out the bits belonging to the other part. */ > + if (xmask[j][i].get_precision ()) > + mask[j] &= xmask[j][i]; > + > + if (shifted[j][i]) > + { > + wide_int shift = wide_int::from (shifted[j][i], prec, UNSIGNED); > + mask[j] = wi::lrshift (mask[j], shift); > + } > + mask[j] = wide_int::from (mask[j], prec, UNSIGNED); > + } > + > + /* Line up the operands for a compare. */ > + HOST_WIDE_INT shift = (toshift[0][i] - toshift[1][i]); > + > + if (shift) > + { > + int j; > + if (shift > 0) > + j = 0; > + else > + { > + j = 1; > + shift = -shift; > + } > + > + tree shiftsz = bitsize_int (shift); > + op[j] = fold_build2_loc (locs[j][1], RSHIFT_EXPR, TREE_TYPE (op[j]), > + op[j], shiftsz); > + mask[j] = wi::lrshift (mask[j], shift); > + } > + > + /* Convert to the smaller type before masking out unwanted > + bits. */ > + tree type = TREE_TYPE (op[0]); > + if (type != TREE_TYPE (op[1])) > + { > + int j = (TYPE_PRECISION (type) > + < TYPE_PRECISION (TREE_TYPE (op[1]))); > + if (!j) > + type = TREE_TYPE (op[1]); > + op[j] = fold_convert_loc (locs[j][0], type, op[j]); > + mask[j] = wide_int::from (mask[j], TYPE_PRECISION (type), UNSIGNED); > + } > + > + /* Apply masks. */ > + for (int j = 0; j < 2; j++) > + if (mask[j] != wi::mask (0, true, mask[j].get_precision ())) > + op[j] = build2_loc (locs[j][2], BIT_AND_EXPR, type, > + op[j], wide_int_to_tree (type, mask[j])); > + > + cmp[i] = build2_loc (i ? rloc : lloc, wanted_code, truth_type, > + op[0], op[1]); > + } > + > + /* Reorder the compares if needed. */ > + if (first1) > + std::swap (cmp[0], cmp[1]); > + > + /* Prepare to return the resulting compares. Combine two parts if > + needed. */ > + if (parts == 1) > + result = cmp[0]; > + else if (!separatep || !maybe_separate) > + result = build2_loc (rloc, orig_code, truth_type, cmp[0], cmp[1]); > + else > + { > + result = cmp[0]; > + *separatep = cmp[1]; > + } > + > + return result; > +} > + > /* Try to simplify the AND of two comparisons, specified by > (OP1A CODE1 OP1B) and (OP2B CODE2 OP2B), respectively. 
> If this can be simplified to a single expression (without requiring > diff --git a/gcc/testsuite/gcc.dg/field-merge-1.c b/gcc/testsuite/gcc.dg/field-merge-1.c > new file mode 100644 > index 0000000000000..1818e104437e1 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-1.c > @@ -0,0 +1,64 @@ > +/* { dg-do run } */ > +/* { dg-options "-O -save-temps -fdump-tree-optimized" } */ > + > +/* Check that field loads compared with constants are merged, even if > + tested out of order, and when fields straddle across alignment > + boundaries. */ > + > +struct TL { > + unsigned char p; > + unsigned int a; > + unsigned char q; > + unsigned int b; > + unsigned char r; > + unsigned int c; > + unsigned char s; > +} __attribute__ ((packed, aligned (4), scalar_storage_order ("little-endian"))); > + > +struct TB { > + unsigned char p; > + unsigned int a; > + unsigned char q; > + unsigned int b; > + unsigned char r; > + unsigned int c; > + unsigned char s; > +} __attribute__ ((packed, aligned (4), scalar_storage_order ("big-endian"))); > + > +#define vc 0xaa > +#define vi 0x12345678 > + > +struct TL vL = { vc, vi, vc, vi, vc, vi, vc }; > +struct TB vB = { vc, vi, vc, vi, vc, vi, vc }; > + > +void f (void) { > + /* Which words of | vL | vB | */ > + /* are accessed by |0123|0123| */ > + if (0 /* the tests? | | | */ > + || vL.p != vc /* |* | | */ > + || vB.p != vc /* | |* | */ > + || vL.s != vc /* | *| | */ > + || vB.q != vc /* | | * | */ > + || vL.a != vi /* |^* | | */ > + || vB.b != vi /* | | ^* | */ > + || vL.c != vi /* | *^| | */ > + || vB.c != vi /* | | ^*| */ > + || vL.b != vi /* | ^^ | | */ > + || vL.q != vc /* | ^ | | */ > + || vB.a != vi /* | |^^ | */ > + || vB.r != vc /* | | ^ | */ > + || vB.s != vc /* | | ^| */ > + || vL.r != vc /* | ^ | | */ > + ) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 8 "optimized" } } */ > +/* { dg-final { scan-assembler-not "cmpb" { target { i*86-*-* || x86_64-*-* } } } } */ > +/* { dg-final { scan-assembler-times "cmpl" 8 { target { i*86-*-* || x86_64-*-* } } } } */ > +/* { dg-final { scan-assembler-times "cmpw" 8 { target { powerpc*-*-* || rs6000-*-* } } } } */ > diff --git a/gcc/testsuite/gcc.dg/field-merge-10.c b/gcc/testsuite/gcc.dg/field-merge-10.c > new file mode 100644 > index 0000000000000..dca6febb75859 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-10.c > @@ -0,0 +1,36 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > + > +/* Check that we don't move loads across stores. */ > + > +struct s { > + short a; > + short b; > +} __attribute__ ((aligned (4))); > + > +struct s p[2] = { > + { 42, 0xfe01 }, > + { 42, 0xfe10 } > +}; > + > +void f (void) { > + short a0 = p[0].a; > + short b0 = p[0].b; > + short a1 = p[1].a; > + short b1 = p[1].b; > + __builtin_memset (p, 0, sizeof (p)); > + asm ("" : "+m" (p)); > + if (0 > + || p[0].a != p[1].a > + || p[0].b != p[1].b > + ) > + __builtin_abort (); > + if (a0 == a1) > + if (b0 == b1) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-11.c b/gcc/testsuite/gcc.dg/field-merge-11.c > new file mode 100644 > index 0000000000000..fe627cddd7fdf > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-11.c > @@ -0,0 +1,32 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > + > +/* Check that narrowing casts aren't ignored, and that same-field tests at > + different widths aren't misoptimized. 
*/ > + > +struct s { > + short a; > + unsigned short b; > + int c; > +} __attribute__ ((aligned (4))); > + > +struct s p = { 42, (short)(0xef1 - 0x1000), 0x12345678 }; > + > +void f (void) { > + if (0 > + || (p.a & 0xcc) != 8 > + || p.a != 42 > + || (int)(signed char)p.b != (int)(signed char)(0xef1 - 0x1000) > + || (unsigned)(unsigned char)p.b != (unsigned)(unsigned char)(0xef1 - 0x1000) > + || (unsigned)p.b != (unsigned short)(0xef1 - 0x1000) > + || (int)(short)p.b != (int)(0xef1 - 0x1000) > + || (long)(unsigned char)(p.c >> 8) != (long)(unsigned char)0x123456 > + || p.c != 0x12345678 > + ) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-12.c b/gcc/testsuite/gcc.dg/field-merge-12.c > new file mode 100644 > index 0000000000000..7056eb607e904 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-12.c > @@ -0,0 +1,33 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2" } */ > + > +/* Check that we don't crash when trying to handle masks that don't match the > + width of the original type. */ > + > +struct s { > + long long q; > +}; > + > +struct s x1 = { 1 }; > +struct s xm1 = { -1 }; > +struct s x8 = { 8 }; > +struct s x0 = { 0 }; > + > +bool f(struct s *p) > +{ > + int q = (int)p->q; > + return (q < 0) || (q & 7); > +} > + > +int main () > +{ > + if (!f (&x1)) > + __builtin_abort (); > + if (!f (&xm1)) > + __builtin_abort (); > + if (f (&x8)) > + __builtin_abort (); > + if (f (&x0)) > + __builtin_abort (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-2.c b/gcc/testsuite/gcc.dg/field-merge-2.c > new file mode 100644 > index 0000000000000..80c573b31ddc7 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-2.c > @@ -0,0 +1,31 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > + > +struct TL { > + unsigned short a; > + unsigned short b; > +} __attribute__ ((packed, aligned (8))); > + > +struct TB { > + unsigned char p; > + unsigned short a; > + unsigned short b; > +} __attribute__ ((packed, aligned (8))); > + > +#define vc 0xaa > + > +struct TL vL = { vc, vc }; > +struct TB vB = { vc, vc, vc }; > + > +void f (void) { > + if (0 > + || vL.b != vB.b > + || vL.a != vB.a > + ) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-3.c b/gcc/testsuite/gcc.dg/field-merge-3.c > new file mode 100644 > index 0000000000000..f26e8a96ea04f > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-3.c > @@ -0,0 +1,36 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > + > +const int BIG_ENDIAN_P = (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__); > + > +struct T1 { > + unsigned char p[2]; > + unsigned short a; > + unsigned int z; > +} __attribute__((__aligned__(8))); > + > +struct T2 { > + unsigned short p; > + unsigned short a; > + unsigned int z; > +} __attribute__((__aligned__(8))); > + > +#define vc 0xaa > +#define vi 0x12345678 > + > +struct T1 v1 = { { vc + !BIG_ENDIAN_P, vc + BIG_ENDIAN_P }, vc, vi }; > +struct T2 v2 = { (vc << 8) | (vc - 1), vc, vi }; > + > +void f (void) { > + if (0 > + || v1.p[!BIG_ENDIAN_P] != v2.p >> 8 > + || v1.a != v2.a > + || ((v1.z ^ v2.z) & 0xff00ff00) != 0 > + || ((v1.z ^ v2.z) & 0x00ff00ff) != 0) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-4.c b/gcc/testsuite/gcc.dg/field-merge-4.c > new file mode 100644 > index 0000000000000..c629069e52b2c > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-4.c > @@ -0,0 
+1,40 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > + > +struct T1 { > + unsigned int zn; > + unsigned char p; > + unsigned char qn; > + unsigned short a; > + unsigned int z; > +} __attribute__((__packed__, __aligned__(4))); > + > +struct T2 { > + unsigned int zn; > + unsigned char rn; > + unsigned char p; > + unsigned char qn; > + unsigned short a; > + unsigned int z; > +} __attribute__((__packed__, __aligned__(4))); > + > +#define vc 0xaa > +#define vs 0xccdd > +#define vi 0x12345678 > + > +struct T1 v1 = { -1, vc, 1, vs, vi }; > +struct T2 v2 = { -1, 0, vc, 1, vs, vi }; > + > +void f (void) { > + if (0 > + || v1.p != v2.p > + || v1.a != v2.a > + || v1.z != v2.z > + ) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-5.c b/gcc/testsuite/gcc.dg/field-merge-5.c > new file mode 100644 > index 0000000000000..1580b14bcc935 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-5.c > @@ -0,0 +1,40 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > + > +struct T1 { > + unsigned int zn; > + unsigned char p; > + unsigned char qn; > + unsigned short a; > + unsigned int z; > +} __attribute__((__packed__, __aligned__(8))); > + > +struct T2 { > + unsigned int zn; > + unsigned char rn; > + unsigned char p; > + unsigned char qn; > + unsigned short a; > + unsigned int z; > +} __attribute__((__packed__, __aligned__(8))); > + > +#define vc 0xaa > +#define vs 0xccdd > +#define vi 0x12345678 > + > +struct T1 v1 = { -1, vc, 1, vs, vi }; > +struct T2 v2 = { -1, 0, vc, 1, vs, vi }; > + > +void f (void) { > + if (0 > + || v1.p != v2.p > + || v1.a != v2.a > + || v1.z != v2.z > + ) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-6.c b/gcc/testsuite/gcc.dg/field-merge-6.c > new file mode 100644 > index 0000000000000..7fd48a138d14a > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-6.c > @@ -0,0 +1,26 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > +/* { dg-shouldfail } */ > + > +/* Check that the third compare won't be pulled ahead of the second one, > + which would prevent the NULL pointer dereference that should cause > + the execution to fail. */ > + > +struct s { > + char a, b; > + int *p; > +}; > + > +struct s a = { 0, 1, 0 }; > +struct s b = { 0, 0, 0 }; > + > +int f () { > + return (a.a != b.a > + || *b.p != *a.p > + || a.b != b.b); > +} > + > +int main() { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-7.c b/gcc/testsuite/gcc.dg/field-merge-7.c > new file mode 100644 > index 0000000000000..728a29b6fafa9 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-7.c > @@ -0,0 +1,23 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O -fdump-tree-ifcombine-details" } */ > + > +/* Check that the third compare won't be combined with the first one.
*/ > + > +struct s { > + char a, b; > + int p; > +}; > + > +struct s a = { 0, 0, 0 }; > +struct s b = { 0, 0, 0 }; > + > +int f () { > + return (a.a != b.a || (a.p != b.p && a.b != b.b)); > +} > + > +int g () { > + return (a.a == b.a && (a.p == b.p || a.b == b.b)); > +} > + > +/* { dg-final { scan-tree-dump-not "optimizing" "ifcombine" } } */ > +/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "ifcombine" } } */ > diff --git a/gcc/testsuite/gcc.dg/field-merge-8.c b/gcc/testsuite/gcc.dg/field-merge-8.c > new file mode 100644 > index 0000000000000..ae270e10070e4 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-8.c > @@ -0,0 +1,25 @@ > +/* { dg-do run } */ > +/* { dg-options "-O" } */ > + > +/* Check that conversions are not thrown away. */ > + > +struct s { > + unsigned char a; > + unsigned short b; > +} __attribute__ ((aligned (4))); > + > +struct s p = { 42, 0xfe }; > +struct s q = { 42, 0xfe | (2 << (__CHAR_BIT__ - 1)) }; > + > +void f (void) { > + if (0 > + || p.a != q.a > + || (unsigned char)p.b != (unsigned char)q.b > + ) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/field-merge-9.c b/gcc/testsuite/gcc.dg/field-merge-9.c > new file mode 100644 > index 0000000000000..b9e08d8fa37d2 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/field-merge-9.c > @@ -0,0 +1,38 @@ > +/* { dg-do run } */ > +/* { dg-options "-O -fdump-tree-ifcombine-details" } */ > + > +/* Check that conversions and selections of similar bit ranges across different > + types don't prevent combination. */ > + > +struct s1 { > + unsigned char b[2]; > + unsigned char a; > +} __attribute__ ((aligned (4))); > + > +struct s2 { > + unsigned short b; > + unsigned char a; > +} __attribute__ ((aligned (4))); > + > +static const char le = __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ ? 1 : 0; > + > +struct s1 p = { { -!le , -le }, 42 }; > +struct s2 q = { (le > + ? 
-2 << (__CHAR_BIT__ - 1) > + : -1 & ((1 << (__CHAR_BIT__ - 1) << 1) - 1)), 42 }; > + > +void f (void) { > + if (0 > + || p.a != q.a > + || p.b[!le] != (unsigned char)q.b > + || p.b[le] != (unsigned char)((q.b >> (__CHAR_BIT__ - 1)) >> 1) > + ) > + __builtin_abort (); > +} > + > +int main () { > + f (); > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-times "optimizing two comparisons" 2 "ifcombine" } } */ > diff --git a/gcc/testsuite/gcc.target/aarch64/long_branch_1.c b/gcc/testsuite/gcc.target/aarch64/long_branch_1.c > index 49d8b6a2278ad..9fd9ec0d1d3f6 100644 > --- a/gcc/testsuite/gcc.target/aarch64/long_branch_1.c > +++ b/gcc/testsuite/gcc.target/aarch64/long_branch_1.c > @@ -1,6 +1,6 @@ > /* { dg-do assemble } */ > /* { dg-timeout-factor 2.0 } */ > -/* { dg-options "-O1 -fno-reorder-blocks -fno-tree-cselim --save-temps" } */ > +/* { dg-options "-O1 -fno-reorder-blocks -fno-tree-cselim -fdisable-tree-ifcombine --save-temps" } */ > > > __attribute__((noinline, noclone)) int > diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc > index a87bf1210776f..de8db2be5572a 100644 > --- a/gcc/tree-ssa-ifcombine.cc > +++ b/gcc/tree-ssa-ifcombine.cc > @@ -971,7 +971,19 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, > outer_cond_code, > gimple_cond_lhs (outer_cond), > gimple_cond_rhs (outer_cond), > - gimple_bb (outer_cond)))) > + gimple_bb (outer_cond))) > + && !(t = (fold_truth_andor_for_ifcombine > + (TRUTH_ANDIF_EXPR, boolean_type_node, > + gimple_location (outer_cond), > + outer_cond_code, > + gimple_cond_lhs (outer_cond), > + gimple_cond_rhs (outer_cond), > + gimple_location (inner_cond), > + inner_cond_code, > + gimple_cond_lhs (inner_cond), > + gimple_cond_rhs (inner_cond), > + single_pred (inner_cond_bb) != outer_cond_bb > + ? &ts : 0)))) > { > /* Only combine conditions in this fallback case if the blocks are > neighbors. */ > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ > Free Software Activist GNU Toolchain Engineer > More tolerance and less prejudice are key for inclusion and diversity > Excluding neuro-others for not behaving ""normal"" is *not* inclusive
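
As a quick illustration of the two shapes of condition the merged field-compare logic below targets (this sketch is not part of the patch; the struct and function names are made up, in the spirit of the gcc.dg/field-merge-*.c tests quoted above):

struct pair {
  unsigned char tag;   /* byte 0 of a 4-byte-aligned word */
  unsigned short key;  /* bytes 2-3 of the same word */
} __attribute__ ((aligned (4)));

/* Constant case: both fields of *p sit in one aligned word, so the two
   narrow compares can become a single word-sized load, masked to skip
   the padding byte, compared against a merged constant.  */
int is_magic (struct pair *p)
{
  return p->tag == 2 && p->key == 4;
}

/* Non-constant case: as long as the loads from each object see the same
   memory state (same gimple_vuse), one masked word load from *p can be
   compared against one masked word load from *q.  */
int same (struct pair *p, struct pair *q)
{
  return p->tag == q->tag && p->key == q->key;
}

At -O and above, these are the patterns the field-merge tests above exercise and that the new fold_truth_andor_for_ifcombine path is meant to handle.
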
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 0d255561b1bdc..be7916e957d66 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -1477,10 +1477,6 @@ Wtemplates C++ ObjC++ Var(warn_templates) Warning Warn on primary template declaration. -Wtautological-compare -C ObjC C++ ObjC++ Var(warn_tautological_compare) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) -Warn if a comparison always evaluates to true or false. - Wtemplate-body C++ ObjC++ Var(warn_template_body) Warning Init(1) Diagnose errors when parsing a template. diff --git a/gcc/common.opt b/gcc/common.opt index bb226ac61e6a1..915ce5bffb4e0 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -812,6 +812,10 @@ Wsystem-headers Common Var(warn_system_headers) Warning Do not suppress warnings from system headers. +Wtautological-compare +Common Var(warn_tautological_compare) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) +Warn if a comparison always evaluates to true or false. + Wtrampolines Common Var(warn_trampolines) Warning Warn whenever a trampoline is generated. diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 1e8ae1ab493ba..6449664598641 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -137,7 +137,6 @@ static tree range_successor (tree); static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code, tree, tree, tree, tree); -static tree unextend (tree, int, int, tree); static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *); static tree fold_binary_op_with_conditional_arg (location_t, @@ -4711,7 +4710,7 @@ invert_truthvalue_loc (location_t loc, tree arg) is the original memory reference used to preserve the alias set of the access. */ -static tree +tree make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type, HOST_WIDE_INT bitsize, poly_int64 bitpos, int unsignedp, int reversep) @@ -4961,136 +4960,6 @@ optimize_bit_field_compare (location_t loc, enum tree_code code, return lhs; } -/* Subroutine for fold_truth_andor_1: decode a field reference. - - If EXP is a comparison reference, we return the innermost reference. - - *PBITSIZE is set to the number of bits in the reference, *PBITPOS is - set to the starting bit number. - - If the innermost field can be completely contained in a mode-sized - unit, *PMODE is set to that mode. Otherwise, it is set to VOIDmode. - - *PVOLATILEP is set to 1 if the any expression encountered is volatile; - otherwise it is not changed. - - *PUNSIGNEDP is set to the signedness of the field. - - *PREVERSEP is set to the storage order of the field. - - *PMASK is set to the mask used. This is either contained in a - BIT_AND_EXPR or derived from the width of the field. - - *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. - - Return 0 if this is not a component reference or is one that we can't - do anything with. */ - -static tree -decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, - HOST_WIDE_INT *pbitpos, machine_mode *pmode, - int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask) -{ - tree exp = *exp_; - tree outer_type = 0; - tree and_mask = 0; - tree mask, inner, offset; - tree unsigned_type; - unsigned int precision; - - /* All the optimizations using this function assume integer fields. - There are problems with FP fields since the type_for_size call - below can fail for, e.g., XFmode. */ - if (! 
INTEGRAL_TYPE_P (TREE_TYPE (exp))) - return NULL_TREE; - - /* We are interested in the bare arrangement of bits, so strip everything - that doesn't affect the machine mode. However, record the type of the - outermost expression if it may matter below. */ - if (CONVERT_EXPR_P (exp) - || TREE_CODE (exp) == NON_LVALUE_EXPR) - outer_type = TREE_TYPE (exp); - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) - { - and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return NULL_TREE; - } - - poly_int64 poly_bitsize, poly_bitpos; - inner = get_inner_reference (exp, &poly_bitsize, &poly_bitpos, &offset, - pmode, punsignedp, preversep, pvolatilep); - if ((inner == exp && and_mask == 0) - || !poly_bitsize.is_constant (pbitsize) - || !poly_bitpos.is_constant (pbitpos) - || *pbitsize < 0 - || offset != 0 - || TREE_CODE (inner) == PLACEHOLDER_EXPR - /* We eventually want to build a larger reference and need to take - the address of this. */ - || (!REFERENCE_CLASS_P (inner) && !DECL_P (inner)) - /* Reject out-of-bound accesses (PR79731). */ - || (! AGGREGATE_TYPE_P (TREE_TYPE (inner)) - && compare_tree_int (TYPE_SIZE (TREE_TYPE (inner)), - *pbitpos + *pbitsize) < 0)) - return NULL_TREE; - - unsigned_type = lang_hooks.types.type_for_size (*pbitsize, 1); - if (unsigned_type == NULL_TREE) - return NULL_TREE; - - *exp_ = exp; - - /* If the number of bits in the reference is the same as the bitsize of - the outer type, then the outer type gives the signedness. Otherwise - (in case of a small bitfield) the signedness is unchanged. */ - if (outer_type && *pbitsize == TYPE_PRECISION (outer_type)) - *punsignedp = TYPE_UNSIGNED (outer_type); - - /* Compute the mask to access the bitfield. */ - precision = TYPE_PRECISION (unsigned_type); - - mask = build_int_cst_type (unsigned_type, -1); - - mask = const_binop (LSHIFT_EXPR, mask, size_int (precision - *pbitsize)); - mask = const_binop (RSHIFT_EXPR, mask, size_int (precision - *pbitsize)); - - /* Merge it with the mask we found in the BIT_AND_EXPR, if any. */ - if (and_mask != 0) - mask = fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type, - fold_convert_loc (loc, unsigned_type, and_mask), mask); - - *pmask = mask; - *pand_mask = and_mask; - return inner; -} - -/* Return nonzero if MASK represents a mask of SIZE ones in the low-order - bit positions and MASK is SIGNED. */ - -static bool -all_ones_mask_p (const_tree mask, unsigned int size) -{ - tree type = TREE_TYPE (mask); - unsigned int precision = TYPE_PRECISION (type); - - /* If this function returns true when the type of the mask is - UNSIGNED, then there will be errors. In particular see - gcc.c-torture/execute/990326-1.c. There does not appear to be - any documentation paper trail as to why this is so. But the pre - wide-int worked with that restriction and it has been preserved - here. */ - if (size > precision || TYPE_SIGN (type) == UNSIGNED) - return false; - - return wi::mask (size, false, precision) == wi::to_wide (mask); -} - /* Subroutine for fold: determine if VAL is the INTEGER_CONST that represents the sign bit of EXP's type. If EXP represents a sign or zero extension, also test VAL against the unextended type. @@ -6400,48 +6269,6 @@ fold_range_test (location_t loc, enum tree_code code, tree type, return 0; } -/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P - bit value. 
Arrange things so the extra bits will be set to zero if and - only if C is signed-extended to its full width. If MASK is nonzero, - it is an INTEGER_CST that should be AND'ed with the extra bits. */ - -static tree -unextend (tree c, int p, int unsignedp, tree mask) -{ - tree type = TREE_TYPE (c); - int modesize = GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE (type)); - tree temp; - - if (p == modesize || unsignedp) - return c; - - /* We work by getting just the sign bit into the low-order bit, then - into the high-order bit, then sign-extend. We then XOR that value - with C. */ - temp = build_int_cst (TREE_TYPE (c), - wi::extract_uhwi (wi::to_wide (c), p - 1, 1)); - - /* We must use a signed type in order to get an arithmetic right shift. - However, we must also avoid introducing accidental overflows, so that - a subsequent call to integer_zerop will work. Hence we must - do the type conversion here. At this point, the constant is either - zero or one, and the conversion to a signed type can never overflow. - We could get an overflow if this conversion is done anywhere else. */ - if (TYPE_UNSIGNED (type)) - temp = fold_convert (signed_type_for (type), temp); - - temp = const_binop (LSHIFT_EXPR, temp, size_int (modesize - 1)); - temp = const_binop (RSHIFT_EXPR, temp, size_int (modesize - p - 1)); - if (mask != 0) - temp = const_binop (BIT_AND_EXPR, temp, - fold_convert (TREE_TYPE (c), mask)); - /* If necessary, convert the type back to match the type of C. */ - if (TYPE_UNSIGNED (type)) - temp = fold_convert (type, temp); - - return fold_convert (type, const_binop (BIT_XOR_EXPR, c, temp)); -} - /* For an expression that has the form (A && B) || ~B or @@ -6512,20 +6339,13 @@ merge_truthop_with_opposite_arm (location_t loc, tree op, tree cmpop, lhs, rhs); return NULL_TREE; } - + /* Find ways of folding logical expressions of LHS and RHS: Try to merge two comparisons to the same innermost item. Look for range tests like "ch >= '0' && ch <= '9'". Look for combinations of simple terms on machines with expensive branches and evaluate the RHS unconditionally. - For example, if we have p->a == 2 && p->b == 4 and we can make an - object large enough to span both A and B, we can do this with a comparison - against the object ANDed with the a mask. - - If we have p->a == q->a && p->b == q->b, we may be able to use bit masking - operations to do this with one comparison. - We check for both normal comparisons and the BIT_AND_EXPRs made this by function and the one above. @@ -6550,24 +6370,9 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, convert EQ_EXPR to NE_EXPR so we need not reject the "wrong" comparison for one-bit fields. */ - enum tree_code wanted_code; enum tree_code lcode, rcode; tree ll_arg, lr_arg, rl_arg, rr_arg; - tree ll_inner, lr_inner, rl_inner, rr_inner; - HOST_WIDE_INT ll_bitsize, ll_bitpos, lr_bitsize, lr_bitpos; - HOST_WIDE_INT rl_bitsize, rl_bitpos, rr_bitsize, rr_bitpos; - HOST_WIDE_INT xll_bitpos, xlr_bitpos, xrl_bitpos, xrr_bitpos; - HOST_WIDE_INT lnbitsize, lnbitpos, rnbitsize, rnbitpos; - int ll_unsignedp, lr_unsignedp, rl_unsignedp, rr_unsignedp; - int ll_reversep, lr_reversep, rl_reversep, rr_reversep; - machine_mode ll_mode, lr_mode, rl_mode, rr_mode; - scalar_int_mode lnmode, rnmode; - tree ll_mask, lr_mask, rl_mask, rr_mask; - tree ll_and_mask, lr_and_mask, rl_and_mask, rr_and_mask; - tree l_const, r_const; - tree lntype, rntype, result; - HOST_WIDE_INT first_bit, end_bit; - int volatilep; + tree result; /* Start by getting the comparison codes. 
Fail if anything is volatile. If one operand is a BIT_AND_EXPR with the constant one, treat it as if @@ -6662,316 +6467,7 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, build_int_cst (TREE_TYPE (ll_arg), 0)); } - /* See if the comparisons can be merged. Then get all the parameters for - each side. */ - - if ((lcode != EQ_EXPR && lcode != NE_EXPR) - || (rcode != EQ_EXPR && rcode != NE_EXPR)) - return 0; - - ll_reversep = lr_reversep = rl_reversep = rr_reversep = 0; - volatilep = 0; - ll_inner = decode_field_reference (loc, &ll_arg, - &ll_bitsize, &ll_bitpos, &ll_mode, - &ll_unsignedp, &ll_reversep, &volatilep, - &ll_mask, &ll_and_mask); - lr_inner = decode_field_reference (loc, &lr_arg, - &lr_bitsize, &lr_bitpos, &lr_mode, - &lr_unsignedp, &lr_reversep, &volatilep, - &lr_mask, &lr_and_mask); - rl_inner = decode_field_reference (loc, &rl_arg, - &rl_bitsize, &rl_bitpos, &rl_mode, - &rl_unsignedp, &rl_reversep, &volatilep, - &rl_mask, &rl_and_mask); - rr_inner = decode_field_reference (loc, &rr_arg, - &rr_bitsize, &rr_bitpos, &rr_mode, - &rr_unsignedp, &rr_reversep, &volatilep, - &rr_mask, &rr_and_mask); - - /* It must be true that the inner operation on the lhs of each - comparison must be the same if we are to be able to do anything. - Then see if we have constants. If not, the same must be true for - the rhs's. */ - if (volatilep - || ll_reversep != rl_reversep - || ll_inner == 0 || rl_inner == 0 - || ! operand_equal_p (ll_inner, rl_inner, 0)) - return 0; - - if (TREE_CODE (lr_arg) == INTEGER_CST - && TREE_CODE (rr_arg) == INTEGER_CST) - { - l_const = lr_arg, r_const = rr_arg; - lr_reversep = ll_reversep; - } - else if (lr_reversep != rr_reversep - || lr_inner == 0 || rr_inner == 0 - || ! operand_equal_p (lr_inner, rr_inner, 0)) - return 0; - else - l_const = r_const = 0; - - /* If either comparison code is not correct for our logical operation, - fail. However, we can convert a one-bit comparison against zero into - the opposite comparison against that bit being set in the field. */ - - wanted_code = (code == TRUTH_AND_EXPR ? EQ_EXPR : NE_EXPR); - if (lcode != wanted_code) - { - if (l_const && integer_zerop (l_const) && integer_pow2p (ll_mask)) - { - /* Make the left operand unsigned, since we are only interested - in the value of one bit. Otherwise we are doing the wrong - thing below. */ - ll_unsignedp = 1; - l_const = ll_mask; - } - else - return 0; - } - - /* This is analogous to the code for l_const above. */ - if (rcode != wanted_code) - { - if (r_const && integer_zerop (r_const) && integer_pow2p (rl_mask)) - { - rl_unsignedp = 1; - r_const = rl_mask; - } - else - return 0; - } - - /* See if we can find a mode that contains both fields being compared on - the left. If we can't, fail. Otherwise, update all constants and masks - to be relative to a field of that size. */ - first_bit = MIN (ll_bitpos, rl_bitpos); - end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); - if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, - TYPE_ALIGN (TREE_TYPE (ll_inner)), BITS_PER_WORD, - volatilep, &lnmode)) - return 0; - - lnbitsize = GET_MODE_BITSIZE (lnmode); - lnbitpos = first_bit & ~ (lnbitsize - 1); - lntype = lang_hooks.types.type_for_size (lnbitsize, 1); - xll_bitpos = ll_bitpos - lnbitpos, xrl_bitpos = rl_bitpos - lnbitpos; - - if (ll_reversep ? 
!BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) - { - xll_bitpos = lnbitsize - xll_bitpos - ll_bitsize; - xrl_bitpos = lnbitsize - xrl_bitpos - rl_bitsize; - } - - ll_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, lntype, ll_mask), - size_int (xll_bitpos)); - rl_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, lntype, rl_mask), - size_int (xrl_bitpos)); - if (ll_mask == NULL_TREE || rl_mask == NULL_TREE) - return 0; - - if (l_const) - { - l_const = fold_convert_loc (loc, lntype, l_const); - l_const = unextend (l_const, ll_bitsize, ll_unsignedp, ll_and_mask); - l_const = const_binop (LSHIFT_EXPR, l_const, size_int (xll_bitpos)); - if (l_const == NULL_TREE) - return 0; - if (! integer_zerop (const_binop (BIT_AND_EXPR, l_const, - fold_build1_loc (loc, BIT_NOT_EXPR, - lntype, ll_mask)))) - { - warning (0, "comparison is always %d", wanted_code == NE_EXPR); - - return constant_boolean_node (wanted_code == NE_EXPR, truth_type); - } - } - if (r_const) - { - r_const = fold_convert_loc (loc, lntype, r_const); - r_const = unextend (r_const, rl_bitsize, rl_unsignedp, rl_and_mask); - r_const = const_binop (LSHIFT_EXPR, r_const, size_int (xrl_bitpos)); - if (r_const == NULL_TREE) - return 0; - if (! integer_zerop (const_binop (BIT_AND_EXPR, r_const, - fold_build1_loc (loc, BIT_NOT_EXPR, - lntype, rl_mask)))) - { - warning (0, "comparison is always %d", wanted_code == NE_EXPR); - - return constant_boolean_node (wanted_code == NE_EXPR, truth_type); - } - } - - /* If the right sides are not constant, do the same for it. Also, - disallow this optimization if a size, signedness or storage order - mismatch occurs between the left and right sides. */ - if (l_const == 0) - { - if (ll_bitsize != lr_bitsize || rl_bitsize != rr_bitsize - || ll_unsignedp != lr_unsignedp || rl_unsignedp != rr_unsignedp - || ll_reversep != lr_reversep - /* Make sure the two fields on the right - correspond to the left without being swapped. */ - || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos) - return 0; - - first_bit = MIN (lr_bitpos, rr_bitpos); - end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize); - if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, - TYPE_ALIGN (TREE_TYPE (lr_inner)), BITS_PER_WORD, - volatilep, &rnmode)) - return 0; - - rnbitsize = GET_MODE_BITSIZE (rnmode); - rnbitpos = first_bit & ~ (rnbitsize - 1); - rntype = lang_hooks.types.type_for_size (rnbitsize, 1); - xlr_bitpos = lr_bitpos - rnbitpos, xrr_bitpos = rr_bitpos - rnbitpos; - - if (lr_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) - { - xlr_bitpos = rnbitsize - xlr_bitpos - lr_bitsize; - xrr_bitpos = rnbitsize - xrr_bitpos - rr_bitsize; - } - - lr_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, - rntype, lr_mask), - size_int (xlr_bitpos)); - rr_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, - rntype, rr_mask), - size_int (xrr_bitpos)); - if (lr_mask == NULL_TREE || rr_mask == NULL_TREE) - return 0; - - /* Make a mask that corresponds to both fields being compared. - Do this for both items being compared. If the operands are the - same size and the bits being compared are in the same position - then we can do this by masking both and comparing the masked - results. 
*/ - ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask); - lr_mask = const_binop (BIT_IOR_EXPR, lr_mask, rr_mask); - if (lnbitsize == rnbitsize - && xll_bitpos == xlr_bitpos - && lnbitpos >= 0 - && rnbitpos >= 0) - { - lhs = make_bit_field_ref (loc, ll_inner, ll_arg, - lntype, lnbitsize, lnbitpos, - ll_unsignedp || rl_unsignedp, ll_reversep); - if (! all_ones_mask_p (ll_mask, lnbitsize)) - lhs = build2 (BIT_AND_EXPR, lntype, lhs, ll_mask); - - rhs = make_bit_field_ref (loc, lr_inner, lr_arg, - rntype, rnbitsize, rnbitpos, - lr_unsignedp || rr_unsignedp, lr_reversep); - if (! all_ones_mask_p (lr_mask, rnbitsize)) - rhs = build2 (BIT_AND_EXPR, rntype, rhs, lr_mask); - - return build2_loc (loc, wanted_code, truth_type, lhs, rhs); - } - - /* There is still another way we can do something: If both pairs of - fields being compared are adjacent, we may be able to make a wider - field containing them both. - - Note that we still must mask the lhs/rhs expressions. Furthermore, - the mask must be shifted to account for the shift done by - make_bit_field_ref. */ - if (((ll_bitsize + ll_bitpos == rl_bitpos - && lr_bitsize + lr_bitpos == rr_bitpos) - || (ll_bitpos == rl_bitpos + rl_bitsize - && lr_bitpos == rr_bitpos + rr_bitsize)) - && ll_bitpos >= 0 - && rl_bitpos >= 0 - && lr_bitpos >= 0 - && rr_bitpos >= 0) - { - tree type; - - lhs = make_bit_field_ref (loc, ll_inner, ll_arg, lntype, - ll_bitsize + rl_bitsize, - MIN (ll_bitpos, rl_bitpos), - ll_unsignedp, ll_reversep); - rhs = make_bit_field_ref (loc, lr_inner, lr_arg, rntype, - lr_bitsize + rr_bitsize, - MIN (lr_bitpos, rr_bitpos), - lr_unsignedp, lr_reversep); - - ll_mask = const_binop (RSHIFT_EXPR, ll_mask, - size_int (MIN (xll_bitpos, xrl_bitpos))); - lr_mask = const_binop (RSHIFT_EXPR, lr_mask, - size_int (MIN (xlr_bitpos, xrr_bitpos))); - if (ll_mask == NULL_TREE || lr_mask == NULL_TREE) - return 0; - - /* Convert to the smaller type before masking out unwanted bits. */ - type = lntype; - if (lntype != rntype) - { - if (lnbitsize > rnbitsize) - { - lhs = fold_convert_loc (loc, rntype, lhs); - ll_mask = fold_convert_loc (loc, rntype, ll_mask); - type = rntype; - } - else if (lnbitsize < rnbitsize) - { - rhs = fold_convert_loc (loc, lntype, rhs); - lr_mask = fold_convert_loc (loc, lntype, lr_mask); - type = lntype; - } - } - - if (! all_ones_mask_p (ll_mask, ll_bitsize + rl_bitsize)) - lhs = build2 (BIT_AND_EXPR, type, lhs, ll_mask); - - if (! all_ones_mask_p (lr_mask, lr_bitsize + rr_bitsize)) - rhs = build2 (BIT_AND_EXPR, type, rhs, lr_mask); - - return build2_loc (loc, wanted_code, truth_type, lhs, rhs); - } - - return 0; - } - - /* Handle the case of comparisons with constants. If there is something in - common between the masks, those bits of the constants must be the same. - If not, the condition is always false. Test for this to avoid generating - incorrect code below. */ - result = const_binop (BIT_AND_EXPR, ll_mask, rl_mask); - if (! integer_zerop (result) - && simple_cst_equal (const_binop (BIT_AND_EXPR, result, l_const), - const_binop (BIT_AND_EXPR, result, r_const)) != 1) - { - if (wanted_code == NE_EXPR) - { - warning (0, "%<or%> of unmatched not-equal tests is always 1"); - return constant_boolean_node (true, truth_type); - } - else - { - warning (0, "%<and%> of mutually exclusive equal-tests is always 0"); - return constant_boolean_node (false, truth_type); - } - } - - if (lnbitpos < 0) - return 0; - - /* Construct the expression we will return. First get the component - reference we will make. 
Unless the mask is all ones the width of - that field, perform the mask operation. Then compare with the - merged constant. */ - result = make_bit_field_ref (loc, ll_inner, ll_arg, - lntype, lnbitsize, lnbitpos, - ll_unsignedp || rl_unsignedp, ll_reversep); - - ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask); - if (! all_ones_mask_p (ll_mask, lnbitsize)) - result = build2_loc (loc, BIT_AND_EXPR, lntype, result, ll_mask); - - return build2_loc (loc, wanted_code, truth_type, result, - const_binop (BIT_IOR_EXPR, l_const, r_const)); + return 0; } /* T is an integer expression that is being multiplied, divided, or taken a diff --git a/gcc/fold-const.h b/gcc/fold-const.h index 3e3998b57b042..77a5c916cbd88 100644 --- a/gcc/fold-const.h +++ b/gcc/fold-const.h @@ -253,11 +253,21 @@ extern tree fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE extern tree_code minmax_from_comparison (tree_code, tree, tree, tree, tree); +extern tree make_bit_field_ref (location_t, tree, tree, tree, + HOST_WIDE_INT, poly_int64, int, int); + /* In gimple-fold.cc. */ extern void clear_type_padding_in_mask (tree, unsigned char *); extern bool clear_padding_type_may_have_padding_p (tree); extern bool arith_overflowed_p (enum tree_code, const_tree, const_tree, const_tree); +extern tree fold_truth_andor_for_ifcombine (enum tree_code, tree, + location_t, enum tree_code, + tree, tree, + location_t, enum tree_code, + tree, tree, + tree *); + /* Class used to compare gimple operands. */ diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 3c72cd6c79ae6..a31fc283d51b0 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -7438,7 +7438,1155 @@ maybe_fold_comparisons_from_match_pd (tree type, enum tree_code code, return NULL_TREE; } + +/* Return TRUE and set op[0] if T, following all SSA edges, is a type + conversion. */ +static bool +gimple_convert_def_p (tree t, tree op[1]) +{ + if (TREE_CODE (t) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (t)) + if (gassign *def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (t))) + switch (gimple_assign_rhs_code (def)) + { + CASE_CONVERT: + op[0] = gimple_assign_rhs1 (def); + return true; + case VIEW_CONVERT_EXPR: + op[0] = TREE_OPERAND (gimple_assign_rhs1 (def), 0); + return true; + default: + break; + } + return false; +} + +/* Return TRUE and set op[*] if T, following all SSA edges, resolves to a + binary expression with code CODE. */ + +static bool +gimple_binop_def_p (enum tree_code code, tree t, tree op[2]) +{ + if (TREE_CODE (t) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (t)) + if (gimple *def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (t))) + if (gimple_assign_rhs_code (def) == code) + { + op[0] = gimple_assign_rhs1 (def); + op[1] = gimple_assign_rhs2 (def); + return true; + } + return false; +} +/* Subroutine for fold_truth_andor_1: decode a field reference. + + If *PEXP is a comparison reference, we return the innermost reference. + + *PBITSIZE is set to the number of bits in the reference, *PBITPOS is + set to the starting bit number. + + *PVOLATILEP is set to 1 if the any expression encountered is volatile; + otherwise it is not changed. + + *PUNSIGNEDP is set to the signedness of the field. + + *PREVERSEP is set to the storage order of the field. + + *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. + If PAND_MASK *is NULL, BIT_AND_EXPR is not recognized. 
+ + *XOR_P is to be FALSE if EXP might be a XOR used in a compare, in which + case, if XOR_CMP_OP is a zero constant, it will be overridden with *PEXP, + *XOR_P will be set to TRUE, and the left-hand operand of the XOR will be + decoded. If *XOR_P is TRUE, XOR_CMP_OP is supposed to be NULL, and then the + right-hand operand of the XOR will be decoded. + + *LOAD is set to the load stmt of the innermost reference, if any, + *and NULL otherwise. + + LOC[0..3] are filled in as conversion, masking, shifting and loading + operations are located. + + Return 0 if this is not a component reference or is one that we can't + do anything with. */ + +static tree +decode_field_reference (tree *pexp, HOST_WIDE_INT *pbitsize, + HOST_WIDE_INT *pbitpos, + bool *punsignedp, bool *preversep, bool *pvolatilep, + wide_int *pand_mask, bool *xor_p, tree *xor_cmp_op, + gimple **load, location_t loc[4]) +{ + tree exp = *pexp; + tree outer_type = 0; + wide_int and_mask; + tree inner, offset; + int shiftrt = 0; + tree res_ops[2]; + machine_mode mode; + + *load = NULL; + + /* All the optimizations using this function assume integer fields. + There are problems with FP fields since the type_for_size call + below can fail for, e.g., XFmode. */ + if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) + return NULL_TREE; + + /* Drop casts, only save the outermost type. We need not worry about + narrowing then widening casts, or vice-versa, for those that are not + essential for the compare have already been optimized out at this + point. */ + if (gimple_convert_def_p (exp, res_ops)) + { + if (!outer_type) + { + outer_type = TREE_TYPE (exp); + loc[0] = gimple_location (SSA_NAME_DEF_STMT (exp)); + } + exp = res_ops[0]; + } + + /* Recognize and save a masking operation. */ + if (pand_mask && gimple_binop_def_p (BIT_AND_EXPR, exp, res_ops) + && uniform_integer_cst_p (res_ops[1])) + { + loc[1] = gimple_location (SSA_NAME_DEF_STMT (exp)); + exp = res_ops[0]; + and_mask = wi::to_wide (res_ops[1]); + } + + /* Turn (a ^ b) [!]= 0 into a [!]= b. */ + if (xor_p && gimple_binop_def_p (BIT_XOR_EXPR, exp, res_ops) + && uniform_integer_cst_p (res_ops[1])) + { + /* No location recorded for this one, it's entirely subsumed by the + compare. */ + if (*xor_p) + { + exp = res_ops[1]; + gcc_checking_assert (!xor_cmp_op); + } + else if (!xor_cmp_op) + /* Not much we can do when xor appears in the right-hand compare + operand. */ + return NULL_TREE; + else + { + *xor_p = true; + exp = res_ops[0]; + *xor_cmp_op = *pexp; + } + } + + /* Another chance to drop conversions. */ + if (gimple_convert_def_p (exp, res_ops)) + { + if (!outer_type) + { + outer_type = TREE_TYPE (exp); + loc[0] = gimple_location (SSA_NAME_DEF_STMT (exp)); + } + exp = res_ops[0]; + } + + /* Take note of shifts. */ + if (gimple_binop_def_p (RSHIFT_EXPR, exp, res_ops) + && uniform_integer_cst_p (res_ops[1])) + { + loc[2] = gimple_location (SSA_NAME_DEF_STMT (exp)); + exp = res_ops[0]; + if (!tree_fits_shwi_p (res_ops[1])) + return NULL_TREE; + shiftrt = tree_to_shwi (res_ops[1]); + if (shiftrt <= 0) + return NULL_TREE; + } + + /* Yet another chance to drop conversions. */ + if (gimple_convert_def_p (exp, res_ops)) + { + if (!outer_type) + { + outer_type = TREE_TYPE (exp); + loc[0] = gimple_location (SSA_NAME_DEF_STMT (exp)); + } + exp = res_ops[0]; + } + + /* Identify the load, if there is one. 
*/ + if (TREE_CODE (exp) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (exp)) + { + gimple *def = SSA_NAME_DEF_STMT (exp); + if (gimple_assign_load_p (def)) + { + loc[3] = gimple_location (def); + *load = def; + exp = gimple_assign_rhs1 (def); + } + } + + /* Identify the relevant bits. */ + poly_int64 poly_bitsize, poly_bitpos; + int unsignedp, reversep = *preversep, volatilep = *pvolatilep; + inner = get_inner_reference (exp, &poly_bitsize, &poly_bitpos, &offset, + &mode, &unsignedp, &reversep, &volatilep); + + HOST_WIDE_INT bs, bp; + if (!poly_bitsize.is_constant (&bs) + || !poly_bitpos.is_constant (&bp) + || bs <= shiftrt + || offset != 0 + || TREE_CODE (inner) == PLACEHOLDER_EXPR + /* Reject out-of-bound accesses (PR79731). */ + || (! AGGREGATE_TYPE_P (TREE_TYPE (inner)) + && compare_tree_int (TYPE_SIZE (TREE_TYPE (inner)), + bp + bs) < 0)) + return NULL_TREE; + + *pbitsize = bs; + *pbitpos = bp; + *punsignedp = unsignedp; + *preversep = reversep; + *pvolatilep = volatilep; + + /* Adjust shifts... */ + if (shiftrt) + { + if (!*preversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + *pbitpos += shiftrt; + *pbitsize -= shiftrt; + } + + /* ... and bit position. */ + if (outer_type && *pbitsize > TYPE_PRECISION (outer_type)) + { + HOST_WIDE_INT excess = *pbitsize - TYPE_PRECISION (outer_type); + if (*preversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + *pbitpos += excess; + *pbitsize -= excess; + } + + *pexp = exp; + + /* If the number of bits in the reference is the same as the bitsize of + the outer type, then the outer type gives the signedness. Otherwise + (in case of a small bitfield) the signedness is unchanged. */ + if (outer_type && *pbitsize == TYPE_PRECISION (outer_type)) + *punsignedp = TYPE_UNSIGNED (outer_type); + + /* Make the mask the expected width. */ + if (and_mask.get_precision () != 0) + and_mask = wide_int::from (and_mask, *pbitsize, UNSIGNED); + + if (pand_mask) + *pand_mask = and_mask; + + return inner; +} + +/* Return the one bitpos within bit extents L or R that is at an + ALIGN-bit alignment boundary, or -1 if there is more than one such + boundary, if there isn't any, or if there is any such boundary + between the extents. L and R are given by bitpos and bitsize. If + it doesn't return -1, there are two consecutive ALIGN-bit words + that contain both extents, and at least one of the extents + straddles across the returned alignment boundary. */ + +static inline HOST_WIDE_INT +compute_split_boundary_from_align (HOST_WIDE_INT align, + HOST_WIDE_INT l_bitpos, + HOST_WIDE_INT l_bitsize, + HOST_WIDE_INT r_bitpos, + HOST_WIDE_INT r_bitsize) +{ + HOST_WIDE_INT amask = ~(align - 1); + + HOST_WIDE_INT first_bit = MIN (l_bitpos, r_bitpos); + HOST_WIDE_INT end_bit = MAX (l_bitpos + l_bitsize, r_bitpos + r_bitsize); + + HOST_WIDE_INT boundary = (end_bit - 1) & amask; + + /* Make sure we're crossing no more than one alignment boundary. + + ??? We don't have logic to recombine loads of two adjacent + fields that each crosses a different alignment boundary, so + as to load the middle word only once, if other words can't be + otherwise recombined. */ + if (boundary - first_bit > align) + return -1; + + HOST_WIDE_INT l_start_word = l_bitpos & amask; + HOST_WIDE_INT l_end_word = (l_bitpos + l_bitsize - 1) & amask; + + HOST_WIDE_INT r_start_word = r_bitpos & amask; + HOST_WIDE_INT r_end_word = (r_bitpos + r_bitsize - 1) & amask; + + /* If neither field straddles across an alignment boundary, it's no + use to even try to merge them. 
*/ + if (l_start_word == l_end_word && r_start_word == r_end_word) + return -1; + + return boundary; +} + +/* Make a bit_field_ref. If POINT is NULL, return the BIT_FIELD_REF. + Otherwise, build and insert a load stmt before POINT, and return + the SSA_NAME. ??? Rewrite LOAD in terms of the bitfield? */ + +static tree +make_bit_field_load (location_t loc, tree inner, tree orig_inner, tree type, + HOST_WIDE_INT bitsize, poly_int64 bitpos, + bool unsignedp, bool reversep, gimple *point) +{ + if (point && loc == UNKNOWN_LOCATION) + loc = gimple_location (point); + + tree ref = make_bit_field_ref (loc, unshare_expr (inner), + unshare_expr (orig_inner), + type, bitsize, bitpos, + unsignedp, reversep); + if (!point) + return ref; + + gimple_seq stmts = NULL; + tree ret = force_gimple_operand (ref, &stmts, true, NULL_TREE); + + /* We know the vuse is supposed to end up being the same as that at the + original load at the insertion point, but if we don't set it, it will be a + generic placeholder that only the global SSA update at the end of the pass + would make equal, too late for us to use in further combinations. So go + ahead and copy the vuse. */ + + tree reaching_vuse = gimple_vuse (point); + for (gimple_stmt_iterator i = gsi_start (stmts); + !gsi_end_p (i); gsi_next (&i)) + { + gimple *new_stmt = gsi_stmt (i); + if (gimple_has_mem_ops (new_stmt)) + gimple_set_vuse (new_stmt, reaching_vuse); + } + + gimple_stmt_iterator gsi = gsi_for_stmt (point); + gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT); + return ret; +} + +/* Initialize ln_arg[0] and ln_arg[1] to a pair of newly-created (at + LOC) loads from INNER (from ORIG_INNER), of modes MODE and MODE2, + respectively, starting at BIT_POS, using reversed endianness if + REVERSEP. Also initialize BITPOS (the starting position of each + part into INNER), BITSIZ (the bit count starting at BITPOS), + TOSHIFT[1] (the amount by which the part and its mask are to be + shifted right to bring its least-significant bit to bit zero) and + SHIFTED (the amount by which the part, by separate loading, has + already been shifted right, but that the mask needs shifting to + match). */ + +static inline void +build_split_load (tree /* out */ ln_arg[2], + HOST_WIDE_INT /* out */ bitpos[2], + HOST_WIDE_INT /* out */ bitsiz[2], + HOST_WIDE_INT /* in[0] out[0..1] */ toshift[2], + HOST_WIDE_INT /* out */ shifted[2], + location_t loc, tree inner, tree orig_inner, + scalar_int_mode mode, scalar_int_mode mode2, + HOST_WIDE_INT bit_pos, bool reversep, + gimple *point[2]) +{ + scalar_int_mode modes[2] = { mode, mode2 }; + bitsiz[0] = GET_MODE_BITSIZE (mode); + bitsiz[1] = GET_MODE_BITSIZE (mode2); + + for (int i = 0; i < 2; i++) + { + tree type = lang_hooks.types.type_for_mode (modes[i], 1); + if (!type) + { + type = build_nonstandard_integer_type (bitsiz[0], 1); + gcc_assert (type); + } + bitpos[i] = bit_pos; + ln_arg[i] = make_bit_field_load (loc, inner, orig_inner, + type, bitsiz[i], + bit_pos, 1, reversep, point[i]); + bit_pos += bitsiz[i]; + } + + toshift[1] = toshift[0]; + if (reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + shifted[0] = bitsiz[1]; + shifted[1] = 0; + toshift[0] = 0; + } + else + { + shifted[1] = bitsiz[0]; + shifted[0] = 0; + toshift[1] = 0; + } +} + +/* Make arrangements to split at bit BOUNDARY a single loaded word + (with REVERSEP bit order) LN_ARG[0], to be shifted right by + TOSHIFT[0] to bring the field of interest to the least-significant + bit. 
The expectation is that the same loaded word will be + propagated from part 0 to part 1, with just different shifting and + masking to extract both parts. MASK is not expected to do more + than masking out the bits that belong to the other part. See + build_split_load for more information on the other fields. */ + +static inline void +reuse_split_load (tree /* in[0] out[1] */ ln_arg[2], + HOST_WIDE_INT /* in[0] out[1] */ bitpos[2], + HOST_WIDE_INT /* in[0] out[1] */ bitsiz[2], + HOST_WIDE_INT /* in[0] out[0..1] */ toshift[2], + HOST_WIDE_INT /* out */ shifted[2], + wide_int /* out */ mask[2], + HOST_WIDE_INT boundary, bool reversep) +{ + unsigned prec = TYPE_PRECISION (TREE_TYPE (ln_arg[0])); + + ln_arg[1] = ln_arg[0]; + bitpos[1] = bitpos[0]; + bitsiz[1] = bitsiz[0]; + shifted[1] = shifted[0] = 0; + + if (reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + toshift[1] = toshift[0]; + toshift[0] = bitpos[0] + bitsiz[0] - boundary; + mask[0] = wi::mask (toshift[0], true, prec); + mask[1] = wi::mask (toshift[0], false, prec); + } + else + { + toshift[1] = boundary - bitpos[1]; + mask[1] = wi::mask (toshift[1], true, prec); + mask[0] = wi::mask (toshift[1], false, prec); + } +} + +/* Find ways of folding logical expressions of LHS and RHS: + + Try to merge two comparisons to nearby fields. + + For example, if we have p->a == 2 && p->b == 4 and we can load both A and B + at once, we can do this with a comparison against the object ANDed with the + a mask. + + If we have p->a == q->a && p->b == q->b, we may be able to use bit masking + operations to do this with one comparison, loading both fields from P at + once, and likewise from Q. + + Herein, loading at once means loading from within the same alignment + boundary for the enclosing object. If (packed) fields cross such alignment + boundaries, we may still recombine the compares, so that loads do not cross + the boundaries. + + CODE is the logical operation being done. It can be TRUTH_ANDIF_EXPR, + TRUTH_AND_EXPR, TRUTH_ORIF_EXPR, or TRUTH_OR_EXPR. + + TRUTH_TYPE is the type of the logical operand. + + LHS is denoted as LL_ARG LCODE LR_ARG. + + RHS is denoted as RL_ARG RCODE RR_ARG. + + LHS is assumed to dominate RHS. + + Combined loads are inserted next to preexisting loads, once we determine + that the combination is viable, and the combined condition references new + SSA_NAMEs that hold the loaded values. Since the original loads are + verified to have the same gimple_vuse, the insertion point doesn't matter + for correctness. ??? The loads may be a lot earlier than the compares, and + it's conceivable that one or two loads for RHS appear before those for LHS. + It could be advantageous to try to place the loads optimally, taking + advantage of knowing whether RHS is accessed before LHS, or that both are + accessed before both compares, but we don't do that (yet?). + + SEPARATEP should be NULL if the combined condition must be returned as a + single expression, even if it is a compound condition. This must only be + done if LHS and RHS are adjacent, without intervening conditions, and the + combined condition is to replace RHS, while LHS is dropped altogether. + + Otherwise, SEPARATEP must be a non-NULL pointer to a NULL_TREE, that may be + replaced by a part of the compound condition that could replace RHS, while + the returned expression replaces LHS. This works whether or not LHS and RHS + are adjacent, as long as there aren't VDEFs or other side effects between + them. 
+ + If the "words" accessed by RHS are already accessed by LHS, this won't + matter, but if RHS accesses "words" that LHS doesn't, then *SEPARATEP will + be set to the compares that should take RHS's place. By "words" we mean + contiguous bits that do not cross a an TYPE_ALIGN boundary of the accessed + object's type. + + We return the simplified tree or 0 if no optimization is possible. */ + +tree +fold_truth_andor_for_ifcombine (enum tree_code code, tree truth_type, + location_t lloc, enum tree_code lcode, + tree ll_arg, tree lr_arg, + location_t rloc, enum tree_code rcode, + tree rl_arg, tree rr_arg, + tree *separatep) +{ + /* If this is the "or" of two comparisons, we can do something if + the comparisons are NE_EXPR. If this is the "and", we can do something + if the comparisons are EQ_EXPR. I.e., + (a->b == 2 && a->c == 4) can become (a->new == NEW). + + WANTED_CODE is this operation code. For single bit fields, we can + convert EQ_EXPR to NE_EXPR so we need not reject the "wrong" + comparison for one-bit fields. */ + + enum tree_code orig_code = code; + enum tree_code wanted_code; + tree ll_inner, lr_inner, rl_inner, rr_inner; + gimple *ll_load, *lr_load, *rl_load, *rr_load; + HOST_WIDE_INT ll_bitsize, ll_bitpos, lr_bitsize, lr_bitpos; + HOST_WIDE_INT rl_bitsize, rl_bitpos, rr_bitsize, rr_bitpos; + HOST_WIDE_INT xll_bitpos, xlr_bitpos, xrl_bitpos, xrr_bitpos; + HOST_WIDE_INT lnbitsize, lnbitpos, lnprec; + HOST_WIDE_INT rnbitsize, rnbitpos, rnprec; + bool ll_unsignedp, lr_unsignedp, rl_unsignedp, rr_unsignedp; + bool ll_reversep, lr_reversep, rl_reversep, rr_reversep; + scalar_int_mode lnmode, lnmode2, rnmode; + wide_int ll_and_mask, lr_and_mask, rl_and_mask, rr_and_mask; + wide_int l_const, r_const; + tree lntype, rntype, result; + HOST_WIDE_INT first_bit, end_bit; + bool volatilep; + bool l_split_load; + + /* These are indexed by: conv, mask, shft, load. */ + location_t ll_loc[4] = { lloc, lloc, lloc, UNKNOWN_LOCATION }; + location_t lr_loc[4] = { lloc, lloc, lloc, UNKNOWN_LOCATION }; + location_t rl_loc[4] = { rloc, rloc, rloc, UNKNOWN_LOCATION }; + location_t rr_loc[4] = { rloc, rloc, rloc, UNKNOWN_LOCATION }; + + gcc_checking_assert (!separatep || !*separatep); + + /* Start by getting the comparison codes. Fail if anything is volatile. + If one operand is a BIT_AND_EXPR with the constant one, treat it as if + it were surrounded with a NE_EXPR. */ + + if (TREE_CODE_CLASS (lcode) != tcc_comparison + || TREE_CODE_CLASS (rcode) != tcc_comparison) + return 0; + + /* We don't normally find TRUTH_*IF_EXPR in gimple, but these codes may be + given by our caller to denote conditions from different blocks. */ + switch (code) + { + case TRUTH_AND_EXPR: + case TRUTH_ANDIF_EXPR: + code = TRUTH_AND_EXPR; + break; + + case TRUTH_OR_EXPR: + case TRUTH_ORIF_EXPR: + code = TRUTH_OR_EXPR; + break; + + default: + return 0; + } + + bool lsignbit = false, rsignbit = false; + if ((lcode == LT_EXPR || lcode == GE_EXPR) + && integer_zerop (lr_arg) + && INTEGRAL_TYPE_P (TREE_TYPE (ll_arg)) + && !TYPE_UNSIGNED (TREE_TYPE (ll_arg))) + { + lsignbit = true; + lcode = (lcode == LT_EXPR ? NE_EXPR : EQ_EXPR); + } + if ((rcode == LT_EXPR || rcode == GE_EXPR) + && integer_zerop (rr_arg) + && INTEGRAL_TYPE_P (TREE_TYPE (rl_arg)) + && !TYPE_UNSIGNED (TREE_TYPE (rl_arg))) + { + rsignbit = true; + rcode = (rcode == LT_EXPR ? NE_EXPR : EQ_EXPR); + } + + /* See if the comparisons can be merged. Then get all the parameters for + each side. 
*/ + + if ((lcode != EQ_EXPR && lcode != NE_EXPR) + || (rcode != EQ_EXPR && rcode != NE_EXPR)) + return 0; + + ll_reversep = lr_reversep = rl_reversep = rr_reversep = 0; + volatilep = 0; + bool l_xor = false, r_xor = false; + ll_inner = decode_field_reference (&ll_arg, &ll_bitsize, &ll_bitpos, + &ll_unsignedp, &ll_reversep, &volatilep, + &ll_and_mask, &l_xor, &lr_arg, + &ll_load, ll_loc); + lr_inner = decode_field_reference (&lr_arg, &lr_bitsize, &lr_bitpos, + &lr_unsignedp, &lr_reversep, &volatilep, + &lr_and_mask, &l_xor, 0, + &lr_load, lr_loc); + rl_inner = decode_field_reference (&rl_arg, &rl_bitsize, &rl_bitpos, + &rl_unsignedp, &rl_reversep, &volatilep, + &rl_and_mask, &r_xor, &rr_arg, + &rl_load, rl_loc); + rr_inner = decode_field_reference (&rr_arg, &rr_bitsize, &rr_bitpos, + &rr_unsignedp, &rr_reversep, &volatilep, + &rr_and_mask, &r_xor, 0, + &rr_load, rr_loc); + + /* It must be true that the inner operation on the lhs of each + comparison must be the same if we are to be able to do anything. + Then see if we have constants. If not, the same must be true for + the rhs's. */ + if (volatilep + || ll_reversep != rl_reversep + || ll_inner == 0 || rl_inner == 0 + || ! operand_equal_p (ll_inner, rl_inner, 0) + || (ll_load && rl_load + && gimple_vuse (ll_load) != gimple_vuse (rl_load))) + return 0; + + if (TREE_CODE (lr_arg) == INTEGER_CST + && TREE_CODE (rr_arg) == INTEGER_CST) + { + l_const = wi::to_wide (lr_arg); + /* We don't expect masks on constants, but if there are any, apply + them now. */ + if (lr_and_mask.get_precision ()) + l_const &= wide_int::from (lr_and_mask, + l_const.get_precision (), UNSIGNED); + r_const = wi::to_wide (rr_arg); + if (rr_and_mask.get_precision ()) + r_const &= wide_int::from (rr_and_mask, + r_const.get_precision (), UNSIGNED); + lr_reversep = ll_reversep; + } + else if (lr_reversep != rr_reversep + || lr_inner == 0 || rr_inner == 0 + || ! operand_equal_p (lr_inner, rr_inner, 0) + || (lr_load && rr_load + && gimple_vuse (lr_load) != gimple_vuse (rr_load))) + return 0; + + /* If we found sign tests, finish turning them into bit tests. */ + + if (lsignbit) + { + wide_int sign = wi::mask (ll_bitsize - 1, true, ll_bitsize); + if (!ll_and_mask.get_precision ()) + ll_and_mask = sign; + else + ll_and_mask &= sign; + } + + if (rsignbit) + { + wide_int sign = wi::mask (rl_bitsize - 1, true, rl_bitsize); + if (!rl_and_mask.get_precision ()) + rl_and_mask = sign; + else + rl_and_mask &= sign; + } + + /* If either comparison code is not correct for our logical operation, + fail. However, we can convert a one-bit comparison against zero into + the opposite comparison against that bit being set in the field. */ + + wanted_code = (code == TRUTH_AND_EXPR ? EQ_EXPR : NE_EXPR); + if (lcode != wanted_code) + { + if (l_const.get_precision () + && l_const == 0 + && ll_and_mask.get_precision () + && wi::popcount (ll_and_mask) == 1) + { + /* Make the left operand unsigned, since we are only interested + in the value of one bit. Otherwise we are doing the wrong + thing below. */ + ll_unsignedp = 1; + l_const = ll_and_mask; + } + else + return 0; + } + + /* This is analogous to the code for l_const above. */ + if (rcode != wanted_code) + { + if (r_const.get_precision () + && r_const == 0 + && rl_and_mask.get_precision () + && wi::popcount (rl_and_mask) == 1) + { + rl_unsignedp = 1; + r_const = rl_and_mask; + } + else + return 0; + } + + /* This will be bumped to 2 if any of the field pairs crosses an + alignment boundary, so the merged compare has to be done in two + parts. 
*/ + int parts = 1; + /* Set to true if the second combined compare should come first, + e.g., because the second original compare accesses a word that + the first one doesn't, and the combined compares access those in + cmp[0]. */ + bool first1 = false; + /* Set to true if the first original compare is not the one being + split. */ + bool maybe_separate = false; + + /* The following 2-dimensional arrays use the first index to + identify left(0)- vs right(1)-hand compare operands, and the + second one to identify merged compare parts. */ + /* The memory loads or constants to be compared. */ + tree ld_arg[2][2]; + /* The first bit of the corresponding inner object that the + corresponding LD_ARG covers. */ + HOST_WIDE_INT bitpos[2][2]; + /* The bit count starting at BITPOS that the corresponding LD_ARG + covers. */ + HOST_WIDE_INT bitsiz[2][2]; + /* The number of bits by which LD_ARG has already been shifted + right, WRT mask. */ + HOST_WIDE_INT shifted[2][2]; + /* The number of bits by which both LD_ARG and MASK need shifting to + bring its least-significant bit to bit zero. */ + HOST_WIDE_INT toshift[2][2]; + /* An additional mask to be applied to LD_ARG, to remove any bits + that may have been loaded for use in another compare, but that + don't belong in the corresponding compare. */ + wide_int xmask[2][2] = {}; + + /* The combined compare or compares. */ + tree cmp[2]; + + /* Consider we're comparing two non-contiguous fields of packed + structs, both aligned at 32-bit boundaries: + + ll_arg: an 8-bit field at offset 0 + lr_arg: a 16-bit field at offset 2 + + rl_arg: an 8-bit field at offset 1 + rr_arg: a 16-bit field at offset 3 + + We'll have r_split_load, because rr_arg straddles across an + alignment boundary. + + We'll want to have: + + bitpos = { { 0, 0 }, { 0, 32 } } + bitsiz = { { 32, 32 }, { 32, 8 } } + + And, for little-endian: + + shifted = { { 0, 0 }, { 0, 32 } } + toshift = { { 0, 24 }, { 0, 0 } } + + Or, for big-endian: + + shifted = { { 0, 0 }, { 8, 0 } } + toshift = { { 8, 0 }, { 0, 0 } } + */ + + /* See if we can find a mode that contains both fields being compared on + the left. If we can't, fail. Otherwise, update all constants and masks + to be relative to a field of that size. */ + first_bit = MIN (ll_bitpos, rl_bitpos); + end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); + if (get_best_mode (end_bit - first_bit, first_bit, 0, 0, + TYPE_ALIGN (TREE_TYPE (ll_inner)), BITS_PER_WORD, + volatilep, &lnmode)) + l_split_load = false; + else + { + /* Consider the possibility of recombining loads if any of the + fields straddles across an alignment boundary, so that either + part can be loaded along with the other field. */ + HOST_WIDE_INT align = TYPE_ALIGN (TREE_TYPE (ll_inner)); + HOST_WIDE_INT boundary = compute_split_boundary_from_align + (align, ll_bitpos, ll_bitsize, rl_bitpos, rl_bitsize); + + if (boundary < 0 + || !get_best_mode (boundary - first_bit, first_bit, 0, 0, + align, BITS_PER_WORD, volatilep, &lnmode) + || !get_best_mode (end_bit - boundary, boundary, 0, 0, + align, BITS_PER_WORD, volatilep, &lnmode2)) + return 0; + + /* If we can't have a single load, but can with two, figure out whether + the two compares can be separated, i.e., whether the entirety of the + first original compare is encompassed by the entirety of the first + combined compare. If the first original compare is past the alignment + boundary, arrange to compare that range first, by setting first1 + (meaning make cmp[1] first, instead of cmp[0]). 
*/ + l_split_load = true; + parts = 2; + if (ll_bitpos >= boundary) + maybe_separate = first1 = true; + else if (ll_bitpos + ll_bitsize <= boundary) + maybe_separate = true; + } + + lnbitsize = GET_MODE_BITSIZE (lnmode); + lnbitpos = first_bit & ~ (lnbitsize - 1); + /* Avoid situations that the code below can't handle. */ + if (lnbitpos < 0) + return 0; + + /* Choose the type for the combined compare. Even if we're splitting loads, + make it wide enough to hold both. */ + if (l_split_load) + lnbitsize += GET_MODE_BITSIZE (lnmode2); + lntype = build_nonstandard_integer_type (lnbitsize, 1); + if (!lntype) + return NULL_TREE; + lnprec = TYPE_PRECISION (lntype); + xll_bitpos = ll_bitpos - lnbitpos, xrl_bitpos = rl_bitpos - lnbitpos; + + /* Adjust bit ranges for reverse endianness. */ + if (ll_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + xll_bitpos = lnbitsize - xll_bitpos - ll_bitsize; + xrl_bitpos = lnbitsize - xrl_bitpos - rl_bitsize; + } + + /* Adjust masks to match the positions in the combined lntype. */ + wide_int ll_mask, rl_mask, r_mask; + if (ll_and_mask.get_precision ()) + ll_mask = wi::lshift (wide_int::from (ll_and_mask, lnprec, UNSIGNED), + xll_bitpos); + else + ll_mask = wi::shifted_mask (xll_bitpos, ll_bitsize, false, lnprec); + if (rl_and_mask.get_precision ()) + rl_mask = wi::lshift (wide_int::from (rl_and_mask, lnprec, UNSIGNED), + xrl_bitpos); + else + rl_mask = wi::shifted_mask (xrl_bitpos, rl_bitsize, false, lnprec); + + /* Adjust right-hand constants in both original comparisons to match width + and bit position. */ + if (l_const.get_precision ()) + { + /* We're doing bitwise equality tests, so don't bother with sign + extensions. */ + l_const = wide_int::from (l_const, lnprec, UNSIGNED); + if (ll_and_mask.get_precision ()) + l_const &= wide_int::from (ll_and_mask, lnprec, UNSIGNED); + l_const <<= xll_bitpos; + if ((l_const & ~ll_mask) != 0) + { + warning_at (lloc, OPT_Wtautological_compare, + "comparison is always %d", wanted_code == NE_EXPR); + + return constant_boolean_node (wanted_code == NE_EXPR, truth_type); + } + + /* When we set l_const, we also set r_const, so we need not test it + again. */ + gcc_checking_assert (r_const.get_precision ()); + + r_const = wide_int::from (r_const, lnprec, UNSIGNED); + if (rl_and_mask.get_precision ()) + r_const &= wide_int::from (rl_and_mask, lnprec, UNSIGNED); + r_const <<= xrl_bitpos; + if ((r_const & ~rl_mask) != 0) + { + warning_at (rloc, OPT_Wtautological_compare, + "comparison is always %d", wanted_code == NE_EXPR); + + return constant_boolean_node (wanted_code == NE_EXPR, truth_type); + } + + /* If there is something in common between the masks, those bits of the + constants must be the same. If not, the combined condition cannot be + met, and the result is known. Test for this to avoid generating + incorrect code below. */ + wide_int mask = ll_mask & rl_mask; + if (mask != 0 + && (l_const & mask) != (r_const & mask)) + { + if (wanted_code == NE_EXPR) + return constant_boolean_node (true, truth_type); + else + return constant_boolean_node (false, truth_type); + } + + /* The constants are combined so as to line up with the loaded field, so + tentatively use the same parameters for the second combined + compare. 
*/ + ld_arg[1][0] = wide_int_to_tree (lntype, l_const | r_const); + toshift[1][0] = MIN (xll_bitpos, xrl_bitpos); + shifted[1][0] = 0; + bitpos[1][0] = lnbitpos; + bitsiz[1][0] = lnbitsize; + + if (parts > 1) + reuse_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1], + shifted[1], xmask[1], + lnbitpos + GET_MODE_BITSIZE (lnmode), + lr_reversep); + + /* No masking needed, we know the full constants. */ + r_mask = wi::mask (0, true, lnprec); + + /* If the compiler thinks this is used uninitialized below, it's + because it can't realize that parts can only be 2 when + comparing with constants if l_split_load is also true. This + just silences the warning. */ + rnbitpos = 0; + } + + /* Likewise, if the right sides are not constant, align them for the combined + compare. Also, disallow this optimization if a size, signedness or + storage order mismatch occurs between the left and right sides. */ + else + { + if (ll_bitsize != lr_bitsize || rl_bitsize != rr_bitsize + || ll_unsignedp != lr_unsignedp || rl_unsignedp != rr_unsignedp + || ll_reversep != lr_reversep + /* Make sure the two fields on the right + correspond to the left without being swapped. */ + || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos) + return 0; + + bool r_split_load; + scalar_int_mode rnmode2; + + /* Figure out how to load the bits for the right-hand size of the + combined compare. As in the left-hand size, we may have to split it, + and then we use two separate compares. */ + first_bit = MIN (lr_bitpos, rr_bitpos); + end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize); + if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, + TYPE_ALIGN (TREE_TYPE (lr_inner)), BITS_PER_WORD, + volatilep, &rnmode)) + { + /* Consider the possibility of recombining loads if any of the + fields straddles across an alignment boundary, so that either + part can be loaded along with the other field. */ + HOST_WIDE_INT align = TYPE_ALIGN (TREE_TYPE (lr_inner)); + HOST_WIDE_INT boundary = compute_split_boundary_from_align + (align, lr_bitpos, lr_bitsize, rr_bitpos, rr_bitsize); + + if (boundary < 0 + /* If we're to split both, make sure the split point is + the same. */ + || (l_split_load + && (boundary - lr_bitpos + != (lnbitpos + GET_MODE_BITSIZE (lnmode)) - ll_bitpos)) + || !get_best_mode (boundary - first_bit, first_bit, 0, 0, + align, BITS_PER_WORD, volatilep, &rnmode) + || !get_best_mode (end_bit - boundary, boundary, 0, 0, + align, BITS_PER_WORD, volatilep, &rnmode2)) + return 0; + + r_split_load = true; + parts = 2; + if (lr_bitpos >= boundary) + maybe_separate = first1 = true; + else if (lr_bitpos + lr_bitsize <= boundary) + maybe_separate = true; + } + else + r_split_load = false; + + /* Find a type that can hold the entire right-hand operand. */ + rnbitsize = GET_MODE_BITSIZE (rnmode); + rnbitpos = first_bit & ~ (rnbitsize - 1); + if (r_split_load) + rnbitsize += GET_MODE_BITSIZE (rnmode2); + rntype = build_nonstandard_integer_type (rnbitsize, 1); + if (!rntype) + return 0; + rnprec = TYPE_PRECISION (rntype); + xlr_bitpos = lr_bitpos - rnbitpos, xrr_bitpos = rr_bitpos - rnbitpos; + + /* Adjust for reversed endianness. */ + if (lr_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + xlr_bitpos = rnbitsize - xlr_bitpos - lr_bitsize; + xrr_bitpos = rnbitsize - xrr_bitpos - rr_bitsize; + } + + /* Adjust the masks to match the combined type, and combine them. 
*/
+      wide_int lr_mask, rr_mask;
+      if (lr_and_mask.get_precision ())
+	lr_mask = wi::lshift (wide_int::from (lr_and_mask, rnprec, UNSIGNED),
+			      xlr_bitpos);
+      else
+	lr_mask = wi::shifted_mask (xlr_bitpos, lr_bitsize, false, rnprec);
+      if (rr_and_mask.get_precision ())
+	rr_mask = wi::lshift (wide_int::from (rr_and_mask, rnprec, UNSIGNED),
+			      xrr_bitpos);
+      else
+	rr_mask = wi::shifted_mask (xrr_bitpos, rr_bitsize, false, rnprec);
+      r_mask = lr_mask | rr_mask;
+
+      /* Load the right-hand operand of the combined compare.  */
+      toshift[1][0] = MIN (xlr_bitpos, xrr_bitpos);
+      shifted[1][0] = 0;
+
+      if (!r_split_load)
+	{
+	  bitpos[1][0] = rnbitpos;
+	  bitsiz[1][0] = rnbitsize;
+	  ld_arg[1][0] = make_bit_field_load (ll_loc[3], lr_inner, lr_arg,
+					      rntype, rnbitsize, rnbitpos,
+					      lr_unsignedp || rr_unsignedp,
+					      lr_reversep, lr_load);
+	}
+
+      /* ... and the second part of the right-hand operand if needed.  */
+      if (parts > 1)
+	{
+	  if (r_split_load)
+	    {
+	      gimple *point[2];
+	      point[0] = lr_load;
+	      point[1] = rr_load;
+	      build_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1],
+				shifted[1], rl_loc[3], lr_inner, lr_arg,
+				rnmode, rnmode2, rnbitpos, lr_reversep, point);
+	    }
+	  else
+	    reuse_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1],
+			      shifted[1], xmask[1],
+			      lnbitpos + GET_MODE_BITSIZE (lnmode)
+			      - ll_bitpos + lr_bitpos, lr_reversep);
+	}
+    }
+
+  /* Now issue the loads for the left-hand combined operand/s.  */
+  wide_int l_mask = ll_mask | rl_mask;
+  toshift[0][0] = MIN (xll_bitpos, xrl_bitpos);
+  shifted[0][0] = 0;
+
+  if (!l_split_load)
+    {
+      bitpos[0][0] = lnbitpos;
+      bitsiz[0][0] = lnbitsize;
+      ld_arg[0][0] = make_bit_field_load (ll_loc[3], ll_inner, ll_arg,
+					  lntype, lnbitsize, lnbitpos,
+					  ll_unsignedp || rl_unsignedp,
+					  ll_reversep, ll_load);
+    }
+
+  if (parts > 1)
+    {
+      if (l_split_load)
+	{
+	  gimple *point[2];
+	  point[0] = ll_load;
+	  point[1] = rl_load;
+	  build_split_load (ld_arg[0], bitpos[0], bitsiz[0], toshift[0],
+			    shifted[0], rl_loc[3], ll_inner, ll_arg,
+			    lnmode, lnmode2, lnbitpos, ll_reversep, point);
+	}
+      else
+	reuse_split_load (ld_arg[0], bitpos[0], bitsiz[0], toshift[0],
+			  shifted[0], xmask[0],
+			  rnbitpos + GET_MODE_BITSIZE (rnmode)
+			  - lr_bitpos + ll_bitpos, ll_reversep);
+    }
+
+  /* Compute the compares.  */
+  for (int i = 0; i < parts; i++)
+    {
+      tree op[2] = { ld_arg[0][i], ld_arg[1][i] };
+      wide_int mask[2] = { l_mask, r_mask };
+      location_t *locs[2] = { i ? rl_loc : ll_loc, i ? rr_loc : lr_loc };
+
+      /* Figure out the masks, and unshare the original operands.  */
+      for (int j = 0; j < 2; j++)
+	{
+	  unsigned prec = TYPE_PRECISION (TREE_TYPE (op[j]));
+	  op[j] = unshare_expr (op[j]);
+
+	  /* Mask out the bits belonging to the other part.  */
+	  if (xmask[j][i].get_precision ())
+	    mask[j] &= xmask[j][i];
+
+	  if (shifted[j][i])
+	    {
+	      wide_int shift = wide_int::from (shifted[j][i], prec, UNSIGNED);
+	      mask[j] = wi::lrshift (mask[j], shift);
+	    }
+	  mask[j] = wide_int::from (mask[j], prec, UNSIGNED);
+	}
+
+      /* Line up the operands for a compare.  */
+      HOST_WIDE_INT shift = (toshift[0][i] - toshift[1][i]);
+
+      if (shift)
+	{
+	  int j;
+	  if (shift > 0)
+	    j = 0;
+	  else
+	    {
+	      j = 1;
+	      shift = -shift;
+	    }
+
+	  tree shiftsz = bitsize_int (shift);
+	  op[j] = fold_build2_loc (locs[j][1], RSHIFT_EXPR, TREE_TYPE (op[j]),
+				   op[j], shiftsz);
+	  mask[j] = wi::lrshift (mask[j], shift);
+	}
+
+      /* Convert to the smaller type before masking out unwanted
+	 bits.
*/ + tree type = TREE_TYPE (op[0]); + if (type != TREE_TYPE (op[1])) + { + int j = (TYPE_PRECISION (type) + < TYPE_PRECISION (TREE_TYPE (op[1]))); + if (!j) + type = TREE_TYPE (op[1]); + op[j] = fold_convert_loc (locs[j][0], type, op[j]); + mask[j] = wide_int::from (mask[j], TYPE_PRECISION (type), UNSIGNED); + } + + /* Apply masks. */ + for (int j = 0; j < 2; j++) + if (mask[j] != wi::mask (0, true, mask[j].get_precision ())) + op[j] = build2_loc (locs[j][2], BIT_AND_EXPR, type, + op[j], wide_int_to_tree (type, mask[j])); + + cmp[i] = build2_loc (i ? rloc : lloc, wanted_code, truth_type, + op[0], op[1]); + } + + /* Reorder the compares if needed. */ + if (first1) + std::swap (cmp[0], cmp[1]); + + /* Prepare to return the resulting compares. Combine two parts if + needed. */ + if (parts == 1) + result = cmp[0]; + else if (!separatep || !maybe_separate) + result = build2_loc (rloc, orig_code, truth_type, cmp[0], cmp[1]); + else + { + result = cmp[0]; + *separatep = cmp[1]; + } + + return result; +} + /* Try to simplify the AND of two comparisons, specified by (OP1A CODE1 OP1B) and (OP2B CODE2 OP2B), respectively. If this can be simplified to a single expression (without requiring diff --git a/gcc/testsuite/gcc.dg/field-merge-1.c b/gcc/testsuite/gcc.dg/field-merge-1.c new file mode 100644 index 0000000000000..1818e104437e1 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-1.c @@ -0,0 +1,64 @@ +/* { dg-do run } */ +/* { dg-options "-O -save-temps -fdump-tree-optimized" } */ + +/* Check that field loads compared with constants are merged, even if + tested out of order, and when fields straddle across alignment + boundaries. */ + +struct TL { + unsigned char p; + unsigned int a; + unsigned char q; + unsigned int b; + unsigned char r; + unsigned int c; + unsigned char s; +} __attribute__ ((packed, aligned (4), scalar_storage_order ("little-endian"))); + +struct TB { + unsigned char p; + unsigned int a; + unsigned char q; + unsigned int b; + unsigned char r; + unsigned int c; + unsigned char s; +} __attribute__ ((packed, aligned (4), scalar_storage_order ("big-endian"))); + +#define vc 0xaa +#define vi 0x12345678 + +struct TL vL = { vc, vi, vc, vi, vc, vi, vc }; +struct TB vB = { vc, vi, vc, vi, vc, vi, vc }; + +void f (void) { + /* Which words of | vL | vB | */ + /* are accessed by |0123|0123| */ + if (0 /* the tests? | | | */ + || vL.p != vc /* |* | | */ + || vB.p != vc /* | |* | */ + || vL.s != vc /* | *| | */ + || vB.q != vc /* | | * | */ + || vL.a != vi /* |^* | | */ + || vB.b != vi /* | | ^* | */ + || vL.c != vi /* | *^| | */ + || vB.c != vi /* | | ^*| */ + || vL.b != vi /* | ^^ | | */ + || vL.q != vc /* | ^ | | */ + || vB.a != vi /* | |^^ | */ + || vB.r != vc /* | | ^ | */ + || vB.s != vc /* | | ^| */ + || vL.r != vc /* | ^ | | */ + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} + +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 8 "optimized" } } */ +/* { dg-final { scan-assembler-not "cmpb" { target { i*86-*-* || x86_64-*-* } } } } */ +/* { dg-final { scan-assembler-times "cmpl" 8 { target { i*86-*-* || x86_64-*-* } } } } */ +/* { dg-final { scan-assembler-times "cmpw" 8 { target { powerpc*-*-* || rs6000-*-* } } } } */ diff --git a/gcc/testsuite/gcc.dg/field-merge-10.c b/gcc/testsuite/gcc.dg/field-merge-10.c new file mode 100644 index 0000000000000..dca6febb75859 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-10.c @@ -0,0 +1,36 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +/* Check that we don't move loads across stores. 
*/ + +struct s { + short a; + short b; +} __attribute__ ((aligned (4))); + +struct s p[2] = { + { 42, 0xfe01 }, + { 42, 0xfe10 } +}; + +void f (void) { + short a0 = p[0].a; + short b0 = p[0].b; + short a1 = p[1].a; + short b1 = p[1].b; + __builtin_memset (p, 0, sizeof (p)); + asm ("" : "+m" (p)); + if (0 + || p[0].a != p[1].a + || p[0].b != p[1].b + ) + __builtin_abort (); + if (a0 == a1) + if (b0 == b1) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-11.c b/gcc/testsuite/gcc.dg/field-merge-11.c new file mode 100644 index 0000000000000..fe627cddd7fdf --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-11.c @@ -0,0 +1,32 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +/* Check that narrowing casts aren't ignored, and that same-field tests at + different widths aren't misoptimized. */ + +struct s { + short a; + unsigned short b; + int c; +} __attribute__ ((aligned (4))); + +struct s p = { 42, (short)(0xef1 - 0x1000), 0x12345678 }; + +void f (void) { + if (0 + || (p.a & 0xcc) != 8 + || p.a != 42 + || (int)(signed char)p.b != (int)(signed char)(0xef1 - 0x1000) + || (unsigned)(unsigned char)p.b != (unsigned)(unsigned char)(0xef1 - 0x1000) + || (unsigned)p.b != (unsigned short)(0xef1 - 0x1000) + || (int)(short)p.b != (int)(0xef1 - 0x1000) + || (long)(unsigned char)(p.c >> 8) != (long)(unsigned char)0x123456 + || p.c != 0x12345678 + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-12.c b/gcc/testsuite/gcc.dg/field-merge-12.c new file mode 100644 index 0000000000000..7056eb607e904 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-12.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +/* Check that we don't crash when trying to handle masks that don't match the + width of the original type. 
*/ + +struct s { + long long q; +}; + +struct s x1 = { 1 }; +struct s xm1 = { -1 }; +struct s x8 = { 8 }; +struct s x0 = { 0 }; + +bool f(struct s *p) +{ + int q = (int)p->q; + return (q < 0) || (q & 7); +} + +int main () +{ + if (!f (&x1)) + __builtin_abort (); + if (!f (&xm1)) + __builtin_abort (); + if (f (&x8)) + __builtin_abort (); + if (f (&x0)) + __builtin_abort (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-2.c b/gcc/testsuite/gcc.dg/field-merge-2.c new file mode 100644 index 0000000000000..80c573b31ddc7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-2.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +struct TL { + unsigned short a; + unsigned short b; +} __attribute__ ((packed, aligned (8))); + +struct TB { + unsigned char p; + unsigned short a; + unsigned short b; +} __attribute__ ((packed, aligned (8))); + +#define vc 0xaa + +struct TL vL = { vc, vc }; +struct TB vB = { vc, vc, vc }; + +void f (void) { + if (0 + || vL.b != vB.b + || vL.a != vB.a + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-3.c b/gcc/testsuite/gcc.dg/field-merge-3.c new file mode 100644 index 0000000000000..f26e8a96ea04f --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-3.c @@ -0,0 +1,36 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +const int BIG_ENDIAN_P = (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__); + +struct T1 { + unsigned char p[2]; + unsigned short a; + unsigned int z; +} __attribute__((__aligned__(8))); + +struct T2 { + unsigned short p; + unsigned short a; + unsigned int z; +} __attribute__((__aligned__(8))); + +#define vc 0xaa +#define vi 0x12345678 + +struct T1 v1 = { { vc + !BIG_ENDIAN_P, vc + BIG_ENDIAN_P }, vc, vi }; +struct T2 v2 = { (vc << 8) | (vc - 1), vc, vi }; + +void f (void) { + if (0 + || v1.p[!BIG_ENDIAN_P] != v2.p >> 8 + || v1.a != v2.a + || ((v1.z ^ v2.z) & 0xff00ff00) != 0 + || ((v1.z ^ v2.z) & 0x00ff00ff) != 0) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-4.c b/gcc/testsuite/gcc.dg/field-merge-4.c new file mode 100644 index 0000000000000..c629069e52b2c --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-4.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +struct T1 { + unsigned int zn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(4))); + +struct T2 { + unsigned int zn; + unsigned char rn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(4))); + +#define vc 0xaa +#define vs 0xccdd +#define vi 0x12345678 + +struct T1 v1 = { -1, vc, 1, vs, vi }; +struct T2 v2 = { -1, 0, vc, 1, vs, vi }; + +void f (void) { + if (0 + || v1.p != v2.p + || v1.a != v2.a + || v1.z != v2.z + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-5.c b/gcc/testsuite/gcc.dg/field-merge-5.c new file mode 100644 index 0000000000000..1580b14bcc935 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-5.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +struct T1 { + unsigned int zn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(8))); + +struct T2 { + unsigned int zn; + unsigned char rn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(8))); + +#define vc 
0xaa
+#define vs 0xccdd
+#define vi 0x12345678
+
+struct T1 v1 = { -1, vc, 1, vs, vi };
+struct T2 v2 = { -1, 0, vc, 1, vs, vi };
+
+void f (void) {
+  if (0
+      || v1.p != v2.p
+      || v1.a != v2.a
+      || v1.z != v2.z
+      )
+    __builtin_abort ();
+}
+
+int main () {
+  f ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/field-merge-6.c b/gcc/testsuite/gcc.dg/field-merge-6.c
new file mode 100644
index 0000000000000..7fd48a138d14a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-6.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+/* { dg-shouldfail } */
+
+/* Check that the third compare won't be pulled ahead of the second one,
+   which would prevent the NULL pointer dereference that should cause
+   the execution to fail.  */
+
+struct s {
+  char a, b;
+  int *p;
+};
+
+struct s a = { 0, 1, 0 };
+struct s b = { 0, 0, 0 };
+
+int f () {
+  return (a.a != b.a
+	  || *b.p != *a.p
+	  || a.b != b.b);
+}
+
+int main() {
+  f ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/field-merge-7.c b/gcc/testsuite/gcc.dg/field-merge-7.c
new file mode 100644
index 0000000000000..728a29b6fafa9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-7.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ifcombine-details" } */
+
+/* Check that the third compare won't be combined with the first one.  */
+
+struct s {
+  char a, b;
+  int p;
+};
+
+struct s a = { 0, 0, 0 };
+struct s b = { 0, 0, 0 };
+
+int f () {
+  return (a.a != b.a || (a.p != b.p && a.b != b.b));
+}
+
+int g () {
+  return (a.a == b.a && (a.p == b.p || a.b == b.b));
+}
+
+/* { dg-final { scan-tree-dump-not "optimizing" "ifcombine" } } */
+/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "ifcombine" } } */
diff --git a/gcc/testsuite/gcc.dg/field-merge-8.c b/gcc/testsuite/gcc.dg/field-merge-8.c
new file mode 100644
index 0000000000000..ae270e10070e4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-8.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+
+/* Check that conversions are not thrown away.  */
+
+struct s {
+  unsigned char a;
+  unsigned short b;
+} __attribute__ ((aligned (4)));
+
+struct s p = { 42, 0xfe };
+struct s q = { 42, 0xfe | (2 << (__CHAR_BIT__ - 1)) };
+
+void f (void) {
+  if (0
+      || p.a != q.a
+      || (unsigned char)p.b != (unsigned char)q.b
+      )
+    __builtin_abort ();
+}
+
+int main () {
+  f ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/field-merge-9.c b/gcc/testsuite/gcc.dg/field-merge-9.c
new file mode 100644
index 0000000000000..b9e08d8fa37d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-9.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O -fdump-tree-ifcombine-details" } */
+
+/* Check that conversions and selections of similar bit ranges across different
+   types don't prevent combination.  */
+
+struct s1 {
+  unsigned char b[2];
+  unsigned char a;
+} __attribute__ ((aligned (4)));
+
+struct s2 {
+  unsigned short b;
+  unsigned char a;
+} __attribute__ ((aligned (4)));
+
+static const char le = __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ ? 1 : 0;
+
+struct s1 p = { { -!le , -le }, 42 };
+struct s2 q = { (le
+		 ?
-2 << (__CHAR_BIT__ - 1) + : -1 & ((1 << (__CHAR_BIT__ - 1) << 1) - 1)), 42 }; + +void f (void) { + if (0 + || p.a != q.a + || p.b[!le] != (unsigned char)q.b + || p.b[le] != (unsigned char)((q.b >> (__CHAR_BIT__ - 1)) >> 1) + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} + +/* { dg-final { scan-tree-dump-times "optimizing two comparisons" 2 "ifcombine" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/long_branch_1.c b/gcc/testsuite/gcc.target/aarch64/long_branch_1.c index 49d8b6a2278ad..9fd9ec0d1d3f6 100644 --- a/gcc/testsuite/gcc.target/aarch64/long_branch_1.c +++ b/gcc/testsuite/gcc.target/aarch64/long_branch_1.c @@ -1,6 +1,6 @@ /* { dg-do assemble } */ /* { dg-timeout-factor 2.0 } */ -/* { dg-options "-O1 -fno-reorder-blocks -fno-tree-cselim --save-temps" } */ +/* { dg-options "-O1 -fno-reorder-blocks -fno-tree-cselim -fdisable-tree-ifcombine --save-temps" } */ __attribute__((noinline, noclone)) int diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index a87bf1210776f..de8db2be5572a 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -971,7 +971,19 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, outer_cond_code, gimple_cond_lhs (outer_cond), gimple_cond_rhs (outer_cond), - gimple_bb (outer_cond)))) + gimple_bb (outer_cond))) + && !(t = (fold_truth_andor_for_ifcombine + (TRUTH_ANDIF_EXPR, boolean_type_node, + gimple_location (outer_cond), + outer_cond_code, + gimple_cond_lhs (outer_cond), + gimple_cond_rhs (outer_cond), + gimple_location (inner_cond), + inner_cond_code, + gimple_cond_lhs (inner_cond), + gimple_cond_rhs (inner_cond), + single_pred (inner_cond_bb) != outer_cond_bb + ? &ts : 0)))) { /* Only combine conditions in this fallback case if the blocks are neighbors. */
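
For readers who want a source-level picture of what the combined compares amount to, here is a small self-contained C sketch.  It is not part of the patch and not the code GCC generates; the struct, the field names a/b/c, and the helpers eq_fields/eq_merged are invented for illustration, and it assumes the struct fits in an unsigned int.  It mirrors the idea in the comments above: instead of comparing two fields of the same object one by one, load one aligned word covering both, mask out the unrelated bits, and do a single (x ^ y) & mask == 0 test.

  #include <string.h>
  #include <stdlib.h>

  struct s {
    unsigned char a;   /* compared */
    unsigned char b;   /* not compared */
    unsigned short c;  /* compared */
  };

  /* The form as written by the user: two separate field compares.  */
  static int eq_fields (const struct s *p, const struct s *q)
  {
    return p->a == q->a && p->c == q->c;
  }

  /* Hand-written equivalent of the combined form: one word-sized load per
     object plus a mask that keeps only the bits of the compared fields.
     Building the mask through a struct keeps the example independent of
     endianness and of the exact field offsets; padding bytes stay zero in
     the mask and are therefore ignored.  */
  static int eq_merged (const struct s *p, const struct s *q)
  {
    _Static_assert (sizeof (struct s) <= sizeof (unsigned int),
		    "illustration assumes the struct fits in one word");
    unsigned int wp = 0, wq = 0, wm = 0;
    struct s m;
    memset (&m, 0, sizeof m);
    m.a = (unsigned char) -1;
    m.c = (unsigned short) -1;
    memcpy (&wp, p, sizeof (struct s));
    memcpy (&wq, q, sizeof (struct s));
    memcpy (&wm, &m, sizeof (struct s));
    return ((wp ^ wq) & wm) == 0;
  }

  int main (void)
  {
    struct s x = { 1, 2, 3 }, y = { 1, 9, 3 }, z = { 1, 2, 4 };
    if (eq_fields (&x, &y) != eq_merged (&x, &y)
	|| eq_fields (&x, &z) != eq_merged (&x, &z))
      abort ();
    return 0;
  }

When one of the fields straddles the alignment boundary of the enclosing object, as with the packed structs in field-merge-1.c, the same idea is applied with two such word compares instead of one, which is what build_split_load and reuse_split_load arrange for.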