From patchwork Thu Sep 26 08:49:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Oliva X-Patchwork-Id: 1989718 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=adacore.com header.i=@adacore.com header.a=rsa-sha256 header.s=google header.b=OUo3dUT9; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XDnP25btkz1xt2 for ; Thu, 26 Sep 2024 18:50:30 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3F1693858406 for ; Thu, 26 Sep 2024 08:50:28 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by sourceware.org (Postfix) with ESMTPS id A18293858D29 for ; Thu, 26 Sep 2024 08:49:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A18293858D29 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=adacore.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=adacore.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A18293858D29 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::629 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1727340599; cv=none; b=Fh/Q5RELDBZzPFxIButpy0bAmb5OpDBzArwt3vbg0WmdrdisPY632a94Tj38Fyr5m6L6VFcQUxx2Y+lTYqNh+U7VUZ0Z+f4KyRGq6ryGFL8zdi6l2JbXxz21HVSXSo4PiIUyh34NF7Qob9wlUxn0NdZLY8+GCUKiAoNvm3wvpqc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1727340599; c=relaxed/simple; bh=JfrmDcmlklXxdsldDjXlblKUs/OfjZ8RDLXz6KA8NX0=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=WW6qsY6WmswaZ+Vs/8fOCrIMNn4wkbs1XcWoa2Xaf9Ml4l7YgWDWjPW3dTLeKM4o+50lavR4TqHeI29Yf3H0w4CyUaovrlnbBcccdACB5BuFHuxjEFxtbOdujiWa7EfSVneY5S6Vd5gBLx1DMMJyPJElOAD/kJ+3LL4+ZL48kTY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-20b1335e4e4so6762315ad.0 for ; Thu, 26 Sep 2024 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=adacore.com; s=google; t=1727340591; x=1727945391; darn=gcc.gnu.org; h=mime-version:user-agent:message-id:in-reply-to:date:references :organization:subject:cc:to:from:from:to:cc:subject:date:message-id :reply-to; bh=xszAvYtQmbzhfl5lzQWS1QO8Cu8eL05I3GTUTVirxsw=; b=OUo3dUT9zSuIAreNkAFz32I9wUK9wflsEuiQrrkoQqjYUQFJrggIwSjilWDEE4l+eT 6IYhTypi9I/NdlMzcCpfJEDQAeLHVwCzZ1RCEB2kM4cJ4I6lxZdYhCLKWBRte+IDW0ki hV74fiKG5i9QNd2t5W6ag4BB9Ti+2h6v3X0+dIQkoF6oZgpJIaH7zBIxaxESzDllRxnv hTL1mXPSEtOmrHovcdg8ep0S0aAb6Q6IRAcRqP4+gq/Q+I8aQgc3lxzufgMgY9hLyEFR QFfIERvqnjCTiGbMA8gAGkNiGwwaPT8BcDzfPMPGOHEkfIcV9qvm6uPZ3CbCmngfnDXu TfKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727340591; x=1727945391; h=mime-version:user-agent:message-id:in-reply-to:date:references :organization:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xszAvYtQmbzhfl5lzQWS1QO8Cu8eL05I3GTUTVirxsw=; b=L9qHwv2WthmiicY3AtMg4JumZ67Sm4vNObrVwWbw4hfWPNmXO1jym5a2z2CgGH6Yum nRrw8qXYP4nZwVUWgh6kD4mota0U1UinXkPHQgQCsoao05zghzAoKt+PxJiW+pvCeekf AQ4ynd31Ab07xODTSgjLPPlO+Z+MKtr1VzcwOhkcjhHRsyrCbOgAvOez1KxQS/kgJpn1 SDDlNjbX/yI3iIeOjDobrAXeP0KLDlNtjvP4QQMcabm93ohuTAs+v0+KCs+CKiGeBnS9 /KS6muAVyobP4YsLgjNdHe4mVM3vAmejRdfZafK7UXwojaXZjLYEYyecrb+uZmtHfU5I lPeA== X-Gm-Message-State: AOJu0Yyx76Exio3lXPAUy8GwNvP1jh9vztJBcthhf7SovYD2II080y7f Rq2RhBMVpb6oPnlbWgAqwbR3I+sG1QSnF2gLw2CiK6XkmnBdzcV7pFuCunpclg== X-Google-Smtp-Source: AGHT+IGL9OuJezsLBm2Kn7r9/qin0n5ZIu53LkWCZYubs9QHoXZso9jGtgjOvkxCEqlIvWBc1Y/VNw== X-Received: by 2002:a17:902:d50d:b0:207:6fd:57d5 with SMTP id d9443c01a7336-20afc48e18bmr75189905ad.36.1727340589530; Thu, 26 Sep 2024 01:49:49 -0700 (PDT) Received: from free.home ([2804:14c:4d1:44a5:f4d9:b7a4:4fb8:376f]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-20af16e0702sm34533525ad.28.2024.09.26.01.49.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 01:49:48 -0700 (PDT) Received: from livre (livre.home [172.31.160.2]) by free.home (8.15.2/8.15.2) with ESMTPS id 48Q8naJ93162332 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 26 Sep 2024 05:49:36 -0300 From: Alexandre Oliva To: Richard Biener Cc: GCC Patches Subject: [PATCH] fold fold_truth_andor field merging into ifcombine was: [PATCH] assorted improvements for fold_truth_andor_1) Organization: Free thinker, does not speak for AdaCore References: Date: Thu, 26 Sep 2024 05:49:36 -0300 In-Reply-To: (Alexandre Oliva's message of "Fri, 02 Oct 2020 05:43:32 -0300") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch introduces various improvements to the logic that merges field compares, moving it into ifcombine. Before the patch, we could merge: (a.x1 EQNE b.x1) ANDOR (a.y1 EQNE b.y1) into something like: (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK) if both of A's fields live within the same alignment boundaries, and so do B's, at the same relative positions. Constants may be used instead of the object B. The initial goal of this patch was to enable such combinations when a field crossed alignment boundaries, e.g. for packed types. We can't generally access such fields with a single memory access, so when we come across such a compare, we will attempt to combine each access separately. Some merging opportunities were missed because of right-shifts, compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and narrowing conversions, especially after earlier merges. This patch introduces handlers for several cases involving these. The merging of multiple field accesses into wider bitfield-like accesses is undesirable to do too early in compilation, so we move it from folding to ifcombine, and extend ifcombine to merge noncontiguous compares, absent intervening side effects. VUSEs used to prevent ifcombine; that seemed excessively conservative, since relevant side effects were already tested, including the possibility of trapping loads, so that's removed. Unlike earlier ifcombine, when merging noncontiguous compares the merged compare must replace the earliest compare, which may require moving up the DEFs that contributed to the latter compare. When it is the second of a noncontiguous pair of compares that first accesses a word, we may merge the first compare with part of the second compare that refers to the same word, keeping the compare of the remaining bits at the spot where the second compare used to be. Handling compares with non-constant fields was somewhat generalized from what fold used to do, now handling non-adjacent fields, even if a field of one object crosses an alignment boundary but the other doesn't. The -Wno-error for toplev.o on rs6000 is because of toplev.c's: if ((flag_sanitize & SANITIZE_ADDRESS) && !FRAME_GROWS_DOWNWARD) and rs6000.h's: #define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0 \ || (flag_sanitize & SANITIZE_ADDRESS) != 0) The mutually exclusive conditions involving flag_sanitize are now noticed and reported by ifcombine's warning on mutually exclusive compares. i386's needs -Wno-error for insn-attrtab.o for similar reasons. for gcc/ChangeLog * fold-const.cc (make_bit_field): Export. (all_ones_mask_p): Drop. (unextend, decode_field_reference, fold_truth_andor_1): Move field compare merging logic... * gimple-fold.cc: ... here. (ssa_is_substitutable_p, is_cast_p, is_binop_p): New. (prepare_xor, follow_load): New. (compute_split_boundary_from_align): New. (make_bit_field_load, build_split_load): New. (reuse_split_load, mergeable_loads_p): New. (fold_truth_andor_maybe_separate): New. * tree-ssa-ifcombine.cc: Include bitmap.h. (constant_condition_p): New. (recognize_if_then_else_nc, recognize_if_succs): New. (bb_no_side_effects_p): Don't reject VUSEs. (update_profile_after_ifcombine): Adjust for noncontiguous merges. (ifcombine_mark_ssa_name): New. (struct ifcombine_mark_ssa_name_t): New. (ifcombine_mark_ssa_name_walk): New. (ifcombine_replace_cond): Extended for noncontiguous merges after factoring out of... (ifcombine_ifandif): ... this. Drop result_inv arg. Try fold_truth_andor_maybe_separate. (tree_ssa_ifcombine_bb_1): Add outer_succ_bb arg. Call recognize_if_then_else_nc. Adjust ifcombine_ifandif calls. (tree_ssa_ifcombine_bb): Return the earliest affected block. Call recognize_if_then_else_nc. Try noncontiguous blocks. (pass_tree_ifcombine::execute): Retry affected blocks. * config/i386/t-i386 (insn-attrtab.o-warn): Disable errors. * config/rs6000/t-rs6000 (toplev.o-warn): Likewise. for gcc/testsuite/ChangeLog * gcc.dg/field-merge-1.c: New. * gcc.dg/field-merge-2.c: New. * gcc.dg/field-merge-3.c: New. * gcc.dg/field-merge-4.c: New. * gcc.dg/field-merge-5.c: New. * gcc.dg/field-merge-6.c: New. * gcc.dg/field-merge-7.c: New. --- gcc/config/i386/t-i386 | 2 gcc/config/rs6000/t-rs6000 | 4 gcc/fold-const.cc | 512 ------------- gcc/fold-const.h | 8 gcc/gimple-fold.cc | 1315 ++++++++++++++++++++++++++++++++++ gcc/testsuite/gcc.dg/field-merge-1.c | 64 ++ gcc/testsuite/gcc.dg/field-merge-2.c | 31 + gcc/testsuite/gcc.dg/field-merge-3.c | 36 + gcc/testsuite/gcc.dg/field-merge-4.c | 40 + gcc/testsuite/gcc.dg/field-merge-5.c | 40 + gcc/testsuite/gcc.dg/field-merge-6.c | 26 + gcc/testsuite/gcc.dg/field-merge-7.c | 23 + gcc/tree-ssa-ifcombine.cc | 572 ++++++++++++--- 13 files changed, 2040 insertions(+), 633 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/field-merge-1.c create mode 100644 gcc/testsuite/gcc.dg/field-merge-2.c create mode 100644 gcc/testsuite/gcc.dg/field-merge-3.c create mode 100644 gcc/testsuite/gcc.dg/field-merge-4.c create mode 100644 gcc/testsuite/gcc.dg/field-merge-5.c create mode 100644 gcc/testsuite/gcc.dg/field-merge-6.c create mode 100644 gcc/testsuite/gcc.dg/field-merge-7.c diff --git a/gcc/config/i386/t-i386 b/gcc/config/i386/t-i386 index bf4ae109af986..1b904787ec624 100644 --- a/gcc/config/i386/t-i386 +++ b/gcc/config/i386/t-i386 @@ -79,3 +79,5 @@ s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \ $(AWK) -f $^ > tmp-bt.inc $(SHELL) $(srcdir)/../move-if-change tmp-bt.inc i386-builtin-types.inc $(STAMP) $@ + +insn-attrtab.o-warn = -Wno-error diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000 index 155788de40a35..a83968d663a61 100644 --- a/gcc/config/rs6000/t-rs6000 +++ b/gcc/config/rs6000/t-rs6000 @@ -95,6 +95,10 @@ $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \ $(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \ $(srcdir)/config/rs6000/rs6000-tables.opt +# FRAME_GROWS_DOWNWARD tests flag_sanitize in a way that rules out a +# test in toplev.c. +toplev.o-warn = -Wno-error + # The rs6000 backend doesn't cause warnings in these files. insn-conditions.o-warn = diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 0578f42ac0c51..552a706ab6def 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -137,7 +137,6 @@ static tree range_successor (tree); static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code, tree, tree, tree, tree); -static tree unextend (tree, int, int, tree); static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *); static tree fold_binary_op_with_conditional_arg (location_t, @@ -4701,7 +4700,7 @@ invert_truthvalue_loc (location_t loc, tree arg) is the original memory reference used to preserve the alias set of the access. */ -static tree +tree make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type, HOST_WIDE_INT bitsize, poly_int64 bitpos, int unsignedp, int reversep) @@ -4951,136 +4950,6 @@ optimize_bit_field_compare (location_t loc, enum tree_code code, return lhs; } -/* Subroutine for fold_truth_andor_1: decode a field reference. - - If EXP is a comparison reference, we return the innermost reference. - - *PBITSIZE is set to the number of bits in the reference, *PBITPOS is - set to the starting bit number. - - If the innermost field can be completely contained in a mode-sized - unit, *PMODE is set to that mode. Otherwise, it is set to VOIDmode. - - *PVOLATILEP is set to 1 if the any expression encountered is volatile; - otherwise it is not changed. - - *PUNSIGNEDP is set to the signedness of the field. - - *PREVERSEP is set to the storage order of the field. - - *PMASK is set to the mask used. This is either contained in a - BIT_AND_EXPR or derived from the width of the field. - - *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. - - Return 0 if this is not a component reference or is one that we can't - do anything with. */ - -static tree -decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, - HOST_WIDE_INT *pbitpos, machine_mode *pmode, - int *punsignedp, int *preversep, int *pvolatilep, - tree *pmask, tree *pand_mask) -{ - tree exp = *exp_; - tree outer_type = 0; - tree and_mask = 0; - tree mask, inner, offset; - tree unsigned_type; - unsigned int precision; - - /* All the optimizations using this function assume integer fields. - There are problems with FP fields since the type_for_size call - below can fail for, e.g., XFmode. */ - if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) - return NULL_TREE; - - /* We are interested in the bare arrangement of bits, so strip everything - that doesn't affect the machine mode. However, record the type of the - outermost expression if it may matter below. */ - if (CONVERT_EXPR_P (exp) - || TREE_CODE (exp) == NON_LVALUE_EXPR) - outer_type = TREE_TYPE (exp); - STRIP_NOPS (exp); - - if (TREE_CODE (exp) == BIT_AND_EXPR) - { - and_mask = TREE_OPERAND (exp, 1); - exp = TREE_OPERAND (exp, 0); - STRIP_NOPS (exp); STRIP_NOPS (and_mask); - if (TREE_CODE (and_mask) != INTEGER_CST) - return NULL_TREE; - } - - poly_int64 poly_bitsize, poly_bitpos; - inner = get_inner_reference (exp, &poly_bitsize, &poly_bitpos, &offset, - pmode, punsignedp, preversep, pvolatilep); - if ((inner == exp && and_mask == 0) - || !poly_bitsize.is_constant (pbitsize) - || !poly_bitpos.is_constant (pbitpos) - || *pbitsize < 0 - || offset != 0 - || TREE_CODE (inner) == PLACEHOLDER_EXPR - /* We eventually want to build a larger reference and need to take - the address of this. */ - || (!REFERENCE_CLASS_P (inner) && !DECL_P (inner)) - /* Reject out-of-bound accesses (PR79731). */ - || (! AGGREGATE_TYPE_P (TREE_TYPE (inner)) - && compare_tree_int (TYPE_SIZE (TREE_TYPE (inner)), - *pbitpos + *pbitsize) < 0)) - return NULL_TREE; - - unsigned_type = lang_hooks.types.type_for_size (*pbitsize, 1); - if (unsigned_type == NULL_TREE) - return NULL_TREE; - - *exp_ = exp; - - /* If the number of bits in the reference is the same as the bitsize of - the outer type, then the outer type gives the signedness. Otherwise - (in case of a small bitfield) the signedness is unchanged. */ - if (outer_type && *pbitsize == TYPE_PRECISION (outer_type)) - *punsignedp = TYPE_UNSIGNED (outer_type); - - /* Compute the mask to access the bitfield. */ - precision = TYPE_PRECISION (unsigned_type); - - mask = build_int_cst_type (unsigned_type, -1); - - mask = const_binop (LSHIFT_EXPR, mask, size_int (precision - *pbitsize)); - mask = const_binop (RSHIFT_EXPR, mask, size_int (precision - *pbitsize)); - - /* Merge it with the mask we found in the BIT_AND_EXPR, if any. */ - if (and_mask != 0) - mask = fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type, - fold_convert_loc (loc, unsigned_type, and_mask), mask); - - *pmask = mask; - *pand_mask = and_mask; - return inner; -} - -/* Return nonzero if MASK represents a mask of SIZE ones in the low-order - bit positions and MASK is SIGNED. */ - -static bool -all_ones_mask_p (const_tree mask, unsigned int size) -{ - tree type = TREE_TYPE (mask); - unsigned int precision = TYPE_PRECISION (type); - - /* If this function returns true when the type of the mask is - UNSIGNED, then there will be errors. In particular see - gcc.c-torture/execute/990326-1.c. There does not appear to be - any documentation paper trail as to why this is so. But the pre - wide-int worked with that restriction and it has been preserved - here. */ - if (size > precision || TYPE_SIGN (type) == UNSIGNED) - return false; - - return wi::mask (size, false, precision) == wi::to_wide (mask); -} - /* Subroutine for fold: determine if VAL is the INTEGER_CONST that represents the sign bit of EXP's type. If EXP represents a sign or zero extension, also test VAL against the unextended type. @@ -6390,48 +6259,6 @@ fold_range_test (location_t loc, enum tree_code code, tree type, return 0; } -/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P - bit value. Arrange things so the extra bits will be set to zero if and - only if C is signed-extended to its full width. If MASK is nonzero, - it is an INTEGER_CST that should be AND'ed with the extra bits. */ - -static tree -unextend (tree c, int p, int unsignedp, tree mask) -{ - tree type = TREE_TYPE (c); - int modesize = GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE (type)); - tree temp; - - if (p == modesize || unsignedp) - return c; - - /* We work by getting just the sign bit into the low-order bit, then - into the high-order bit, then sign-extend. We then XOR that value - with C. */ - temp = build_int_cst (TREE_TYPE (c), - wi::extract_uhwi (wi::to_wide (c), p - 1, 1)); - - /* We must use a signed type in order to get an arithmetic right shift. - However, we must also avoid introducing accidental overflows, so that - a subsequent call to integer_zerop will work. Hence we must - do the type conversion here. At this point, the constant is either - zero or one, and the conversion to a signed type can never overflow. - We could get an overflow if this conversion is done anywhere else. */ - if (TYPE_UNSIGNED (type)) - temp = fold_convert (signed_type_for (type), temp); - - temp = const_binop (LSHIFT_EXPR, temp, size_int (modesize - 1)); - temp = const_binop (RSHIFT_EXPR, temp, size_int (modesize - p - 1)); - if (mask != 0) - temp = const_binop (BIT_AND_EXPR, temp, - fold_convert (TREE_TYPE (c), mask)); - /* If necessary, convert the type back to match the type of C. */ - if (TYPE_UNSIGNED (type)) - temp = fold_convert (type, temp); - - return fold_convert (type, const_binop (BIT_XOR_EXPR, c, temp)); -} - /* For an expression that has the form (A && B) || ~B or @@ -6502,20 +6329,13 @@ merge_truthop_with_opposite_arm (location_t loc, tree op, tree cmpop, lhs, rhs); return NULL_TREE; } - + /* Find ways of folding logical expressions of LHS and RHS: Try to merge two comparisons to the same innermost item. Look for range tests like "ch >= '0' && ch <= '9'". Look for combinations of simple terms on machines with expensive branches and evaluate the RHS unconditionally. - For example, if we have p->a == 2 && p->b == 4 and we can make an - object large enough to span both A and B, we can do this with a comparison - against the object ANDed with the a mask. - - If we have p->a == q->a && p->b == q->b, we may be able to use bit masking - operations to do this with one comparison. - We check for both normal comparisons and the BIT_AND_EXPRs made this by function and the one above. @@ -6540,24 +6360,9 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, convert EQ_EXPR to NE_EXPR so we need not reject the "wrong" comparison for one-bit fields. */ - enum tree_code wanted_code; enum tree_code lcode, rcode; tree ll_arg, lr_arg, rl_arg, rr_arg; - tree ll_inner, lr_inner, rl_inner, rr_inner; - HOST_WIDE_INT ll_bitsize, ll_bitpos, lr_bitsize, lr_bitpos; - HOST_WIDE_INT rl_bitsize, rl_bitpos, rr_bitsize, rr_bitpos; - HOST_WIDE_INT xll_bitpos, xlr_bitpos, xrl_bitpos, xrr_bitpos; - HOST_WIDE_INT lnbitsize, lnbitpos, rnbitsize, rnbitpos; - int ll_unsignedp, lr_unsignedp, rl_unsignedp, rr_unsignedp; - int ll_reversep, lr_reversep, rl_reversep, rr_reversep; - machine_mode ll_mode, lr_mode, rl_mode, rr_mode; - scalar_int_mode lnmode, rnmode; - tree ll_mask, lr_mask, rl_mask, rr_mask; - tree ll_and_mask, lr_and_mask, rl_and_mask, rr_and_mask; - tree l_const, r_const; - tree lntype, rntype, result; - HOST_WIDE_INT first_bit, end_bit; - int volatilep; + tree result; /* Start by getting the comparison codes. Fail if anything is volatile. If one operand is a BIT_AND_EXPR with the constant one, treat it as if @@ -6652,316 +6457,7 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, build_int_cst (TREE_TYPE (ll_arg), 0)); } - /* See if the comparisons can be merged. Then get all the parameters for - each side. */ - - if ((lcode != EQ_EXPR && lcode != NE_EXPR) - || (rcode != EQ_EXPR && rcode != NE_EXPR)) - return 0; - - ll_reversep = lr_reversep = rl_reversep = rr_reversep = 0; - volatilep = 0; - ll_inner = decode_field_reference (loc, &ll_arg, - &ll_bitsize, &ll_bitpos, &ll_mode, - &ll_unsignedp, &ll_reversep, &volatilep, - &ll_mask, &ll_and_mask); - lr_inner = decode_field_reference (loc, &lr_arg, - &lr_bitsize, &lr_bitpos, &lr_mode, - &lr_unsignedp, &lr_reversep, &volatilep, - &lr_mask, &lr_and_mask); - rl_inner = decode_field_reference (loc, &rl_arg, - &rl_bitsize, &rl_bitpos, &rl_mode, - &rl_unsignedp, &rl_reversep, &volatilep, - &rl_mask, &rl_and_mask); - rr_inner = decode_field_reference (loc, &rr_arg, - &rr_bitsize, &rr_bitpos, &rr_mode, - &rr_unsignedp, &rr_reversep, &volatilep, - &rr_mask, &rr_and_mask); - - /* It must be true that the inner operation on the lhs of each - comparison must be the same if we are to be able to do anything. - Then see if we have constants. If not, the same must be true for - the rhs's. */ - if (volatilep - || ll_reversep != rl_reversep - || ll_inner == 0 || rl_inner == 0 - || ! operand_equal_p (ll_inner, rl_inner, 0)) - return 0; - - if (TREE_CODE (lr_arg) == INTEGER_CST - && TREE_CODE (rr_arg) == INTEGER_CST) - { - l_const = lr_arg, r_const = rr_arg; - lr_reversep = ll_reversep; - } - else if (lr_reversep != rr_reversep - || lr_inner == 0 || rr_inner == 0 - || ! operand_equal_p (lr_inner, rr_inner, 0)) - return 0; - else - l_const = r_const = 0; - - /* If either comparison code is not correct for our logical operation, - fail. However, we can convert a one-bit comparison against zero into - the opposite comparison against that bit being set in the field. */ - - wanted_code = (code == TRUTH_AND_EXPR ? EQ_EXPR : NE_EXPR); - if (lcode != wanted_code) - { - if (l_const && integer_zerop (l_const) && integer_pow2p (ll_mask)) - { - /* Make the left operand unsigned, since we are only interested - in the value of one bit. Otherwise we are doing the wrong - thing below. */ - ll_unsignedp = 1; - l_const = ll_mask; - } - else - return 0; - } - - /* This is analogous to the code for l_const above. */ - if (rcode != wanted_code) - { - if (r_const && integer_zerop (r_const) && integer_pow2p (rl_mask)) - { - rl_unsignedp = 1; - r_const = rl_mask; - } - else - return 0; - } - - /* See if we can find a mode that contains both fields being compared on - the left. If we can't, fail. Otherwise, update all constants and masks - to be relative to a field of that size. */ - first_bit = MIN (ll_bitpos, rl_bitpos); - end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); - if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, - TYPE_ALIGN (TREE_TYPE (ll_inner)), BITS_PER_WORD, - volatilep, &lnmode)) - return 0; - - lnbitsize = GET_MODE_BITSIZE (lnmode); - lnbitpos = first_bit & ~ (lnbitsize - 1); - lntype = lang_hooks.types.type_for_size (lnbitsize, 1); - xll_bitpos = ll_bitpos - lnbitpos, xrl_bitpos = rl_bitpos - lnbitpos; - - if (ll_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) - { - xll_bitpos = lnbitsize - xll_bitpos - ll_bitsize; - xrl_bitpos = lnbitsize - xrl_bitpos - rl_bitsize; - } - - ll_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, lntype, ll_mask), - size_int (xll_bitpos)); - rl_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, lntype, rl_mask), - size_int (xrl_bitpos)); - if (ll_mask == NULL_TREE || rl_mask == NULL_TREE) - return 0; - - if (l_const) - { - l_const = fold_convert_loc (loc, lntype, l_const); - l_const = unextend (l_const, ll_bitsize, ll_unsignedp, ll_and_mask); - l_const = const_binop (LSHIFT_EXPR, l_const, size_int (xll_bitpos)); - if (l_const == NULL_TREE) - return 0; - if (! integer_zerop (const_binop (BIT_AND_EXPR, l_const, - fold_build1_loc (loc, BIT_NOT_EXPR, - lntype, ll_mask)))) - { - warning (0, "comparison is always %d", wanted_code == NE_EXPR); - - return constant_boolean_node (wanted_code == NE_EXPR, truth_type); - } - } - if (r_const) - { - r_const = fold_convert_loc (loc, lntype, r_const); - r_const = unextend (r_const, rl_bitsize, rl_unsignedp, rl_and_mask); - r_const = const_binop (LSHIFT_EXPR, r_const, size_int (xrl_bitpos)); - if (r_const == NULL_TREE) - return 0; - if (! integer_zerop (const_binop (BIT_AND_EXPR, r_const, - fold_build1_loc (loc, BIT_NOT_EXPR, - lntype, rl_mask)))) - { - warning (0, "comparison is always %d", wanted_code == NE_EXPR); - - return constant_boolean_node (wanted_code == NE_EXPR, truth_type); - } - } - - /* If the right sides are not constant, do the same for it. Also, - disallow this optimization if a size, signedness or storage order - mismatch occurs between the left and right sides. */ - if (l_const == 0) - { - if (ll_bitsize != lr_bitsize || rl_bitsize != rr_bitsize - || ll_unsignedp != lr_unsignedp || rl_unsignedp != rr_unsignedp - || ll_reversep != lr_reversep - /* Make sure the two fields on the right - correspond to the left without being swapped. */ - || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos) - return 0; - - first_bit = MIN (lr_bitpos, rr_bitpos); - end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize); - if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, - TYPE_ALIGN (TREE_TYPE (lr_inner)), BITS_PER_WORD, - volatilep, &rnmode)) - return 0; - - rnbitsize = GET_MODE_BITSIZE (rnmode); - rnbitpos = first_bit & ~ (rnbitsize - 1); - rntype = lang_hooks.types.type_for_size (rnbitsize, 1); - xlr_bitpos = lr_bitpos - rnbitpos, xrr_bitpos = rr_bitpos - rnbitpos; - - if (lr_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) - { - xlr_bitpos = rnbitsize - xlr_bitpos - lr_bitsize; - xrr_bitpos = rnbitsize - xrr_bitpos - rr_bitsize; - } - - lr_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, - rntype, lr_mask), - size_int (xlr_bitpos)); - rr_mask = const_binop (LSHIFT_EXPR, fold_convert_loc (loc, - rntype, rr_mask), - size_int (xrr_bitpos)); - if (lr_mask == NULL_TREE || rr_mask == NULL_TREE) - return 0; - - /* Make a mask that corresponds to both fields being compared. - Do this for both items being compared. If the operands are the - same size and the bits being compared are in the same position - then we can do this by masking both and comparing the masked - results. */ - ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask); - lr_mask = const_binop (BIT_IOR_EXPR, lr_mask, rr_mask); - if (lnbitsize == rnbitsize - && xll_bitpos == xlr_bitpos - && lnbitpos >= 0 - && rnbitpos >= 0) - { - lhs = make_bit_field_ref (loc, ll_inner, ll_arg, - lntype, lnbitsize, lnbitpos, - ll_unsignedp || rl_unsignedp, ll_reversep); - if (! all_ones_mask_p (ll_mask, lnbitsize)) - lhs = build2 (BIT_AND_EXPR, lntype, lhs, ll_mask); - - rhs = make_bit_field_ref (loc, lr_inner, lr_arg, - rntype, rnbitsize, rnbitpos, - lr_unsignedp || rr_unsignedp, lr_reversep); - if (! all_ones_mask_p (lr_mask, rnbitsize)) - rhs = build2 (BIT_AND_EXPR, rntype, rhs, lr_mask); - - return build2_loc (loc, wanted_code, truth_type, lhs, rhs); - } - - /* There is still another way we can do something: If both pairs of - fields being compared are adjacent, we may be able to make a wider - field containing them both. - - Note that we still must mask the lhs/rhs expressions. Furthermore, - the mask must be shifted to account for the shift done by - make_bit_field_ref. */ - if (((ll_bitsize + ll_bitpos == rl_bitpos - && lr_bitsize + lr_bitpos == rr_bitpos) - || (ll_bitpos == rl_bitpos + rl_bitsize - && lr_bitpos == rr_bitpos + rr_bitsize)) - && ll_bitpos >= 0 - && rl_bitpos >= 0 - && lr_bitpos >= 0 - && rr_bitpos >= 0) - { - tree type; - - lhs = make_bit_field_ref (loc, ll_inner, ll_arg, lntype, - ll_bitsize + rl_bitsize, - MIN (ll_bitpos, rl_bitpos), - ll_unsignedp, ll_reversep); - rhs = make_bit_field_ref (loc, lr_inner, lr_arg, rntype, - lr_bitsize + rr_bitsize, - MIN (lr_bitpos, rr_bitpos), - lr_unsignedp, lr_reversep); - - ll_mask = const_binop (RSHIFT_EXPR, ll_mask, - size_int (MIN (xll_bitpos, xrl_bitpos))); - lr_mask = const_binop (RSHIFT_EXPR, lr_mask, - size_int (MIN (xlr_bitpos, xrr_bitpos))); - if (ll_mask == NULL_TREE || lr_mask == NULL_TREE) - return 0; - - /* Convert to the smaller type before masking out unwanted bits. */ - type = lntype; - if (lntype != rntype) - { - if (lnbitsize > rnbitsize) - { - lhs = fold_convert_loc (loc, rntype, lhs); - ll_mask = fold_convert_loc (loc, rntype, ll_mask); - type = rntype; - } - else if (lnbitsize < rnbitsize) - { - rhs = fold_convert_loc (loc, lntype, rhs); - lr_mask = fold_convert_loc (loc, lntype, lr_mask); - type = lntype; - } - } - - if (! all_ones_mask_p (ll_mask, ll_bitsize + rl_bitsize)) - lhs = build2 (BIT_AND_EXPR, type, lhs, ll_mask); - - if (! all_ones_mask_p (lr_mask, lr_bitsize + rr_bitsize)) - rhs = build2 (BIT_AND_EXPR, type, rhs, lr_mask); - - return build2_loc (loc, wanted_code, truth_type, lhs, rhs); - } - - return 0; - } - - /* Handle the case of comparisons with constants. If there is something in - common between the masks, those bits of the constants must be the same. - If not, the condition is always false. Test for this to avoid generating - incorrect code below. */ - result = const_binop (BIT_AND_EXPR, ll_mask, rl_mask); - if (! integer_zerop (result) - && simple_cst_equal (const_binop (BIT_AND_EXPR, result, l_const), - const_binop (BIT_AND_EXPR, result, r_const)) != 1) - { - if (wanted_code == NE_EXPR) - { - warning (0, "% of unmatched not-equal tests is always 1"); - return constant_boolean_node (true, truth_type); - } - else - { - warning (0, "% of mutually exclusive equal-tests is always 0"); - return constant_boolean_node (false, truth_type); - } - } - - if (lnbitpos < 0) - return 0; - - /* Construct the expression we will return. First get the component - reference we will make. Unless the mask is all ones the width of - that field, perform the mask operation. Then compare with the - merged constant. */ - result = make_bit_field_ref (loc, ll_inner, ll_arg, - lntype, lnbitsize, lnbitpos, - ll_unsignedp || rl_unsignedp, ll_reversep); - - ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask); - if (! all_ones_mask_p (ll_mask, lnbitsize)) - result = build2_loc (loc, BIT_AND_EXPR, lntype, result, ll_mask); - - return build2_loc (loc, wanted_code, truth_type, result, - const_binop (BIT_IOR_EXPR, l_const, r_const)); + return 0; } /* T is an integer expression that is being multiplied, divided, or taken a diff --git a/gcc/fold-const.h b/gcc/fold-const.h index 3e3998b57b042..6372dee74e9eb 100644 --- a/gcc/fold-const.h +++ b/gcc/fold-const.h @@ -253,11 +253,19 @@ extern tree fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE extern tree_code minmax_from_comparison (tree_code, tree, tree, tree, tree); +extern tree make_bit_field_ref (location_t, tree, tree, tree, + HOST_WIDE_INT, poly_int64, int, int); + /* In gimple-fold.cc. */ extern void clear_type_padding_in_mask (tree, unsigned char *); extern bool clear_padding_type_may_have_padding_p (tree); extern bool arith_overflowed_p (enum tree_code, const_tree, const_tree, const_tree); +extern tree fold_truth_andor_maybe_separate (location_t, enum tree_code, tree, + enum tree_code, tree, tree, + enum tree_code, tree, tree, + tree *); + /* Class used to compare gimple operands. */ diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 942de7720fd2b..13ffd16e646eb 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see #include "varasm.h" #include "internal-fn.h" #include "gimple-range.h" +#include "tree-ssa-loop-niter.h" // stmt_dominates_stmt_p enum strlen_range_kind { /* Compute the exact constant string length. */ @@ -7384,7 +7385,1321 @@ maybe_fold_comparisons_from_match_pd (tree type, enum tree_code code, return NULL_TREE; } + +/* Return TRUE if expression STMT is suitable for replacement. ??? + Same as ssa_is_replaceable_p, except that we don't insist it has a + single use. */ +static bool +ssa_is_substitutable_p (gimple *stmt) +{ +#if 0 + use_operand_p use_p; +#endif + tree def; +#if 0 + gimple *use_stmt; +#endif + + /* Only consider modify stmts. */ + if (!is_gimple_assign (stmt)) + return false; + + /* If the statement may throw an exception, it cannot be replaced. */ + if (stmt_could_throw_p (cfun, stmt)) + return false; + + /* Punt if there is more than 1 def. */ + def = SINGLE_SSA_TREE_OPERAND (stmt, SSA_OP_DEF); + if (!def) + return false; + +#if 0 + /* Only consider definitions which have a single use. */ + if (!single_imm_use (def, &use_p, &use_stmt)) + return false; + + /* Used in this block, but at the TOP of the block, not the end. */ + if (gimple_code (use_stmt) == GIMPLE_PHI) + return false; +#endif + + /* There must be no VDEFs. */ + if (gimple_vdef (stmt)) + return false; + + /* Float expressions must go through memory if float-store is on. */ + if (flag_float_store + && FLOAT_TYPE_P (TREE_TYPE (def))) + return false; + + /* An assignment with a register variable on the RHS is not + replaceable. */ + if (gimple_assign_rhs_code (stmt) == VAR_DECL + && DECL_HARD_REGISTER (gimple_assign_rhs1 (stmt))) + return false; + + /* No function calls can be replaced. */ + if (is_gimple_call (stmt)) + return false; + + /* Leave any stmt with volatile operands alone as well. */ + if (gimple_has_volatile_ops (stmt)) + return false; + + return true; +} + +/* Follow substitutable SSA DEFs for *NAME, including type casts, + adjusting *NAME to the single rhs or the type cast operand along + the way. Return the target type of the earliest type cast + found. */ + +static tree +is_cast_p (tree *name) +{ + tree type = 0; + + while (TREE_CODE (*name) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (*name)) + { + gimple *def = SSA_NAME_DEF_STMT (*name); + + if (!ssa_is_substitutable_p (def)) + break; + + if (gimple_num_ops (def) != 2) + break; + + if (gimple_assign_single_p (def)) + { + if (gimple_assign_load_p (def)) + break; + *name = gimple_assign_rhs1 (def); + continue; + } + + if (!gimple_assign_cast_p (def)) + break; + + if (!type) + type = TREE_TYPE (*name); + *name = gimple_assign_rhs1 (def); + } + + return type; +} + +/* Follow substitutable SSA DEFs for *NAME. If we find it is assigned + a CODE binary expr, return the second operand, and set *NAME to the + first operand. */ + +static tree +is_binop_p (enum tree_code code, tree *name) +{ + while (TREE_CODE (*name) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (*name)) + { + gimple *def = SSA_NAME_DEF_STMT (*name); + + if (!ssa_is_substitutable_p (def)) + return 0; + + switch (gimple_num_ops (def)) + { + default: + return 0; + + case 2: + if (gimple_assign_single_p (def) && !gimple_assign_load_p (def)) + { + *name = gimple_assign_rhs1 (def); + continue; + } + return 0; + + case 3: + break; + } + + if (gimple_assign_rhs_code (def) != code) + return 0; + + *name = gimple_assign_rhs1 (def); + return gimple_assign_rhs2 (def); + } + + return 0; +} + +/* If *R_ARG is a constant zero, and L_ARG is a possibly masked + BIT_XOR_EXPR, return 1 and set *r_arg to l_arg. + Otherwise, return 0. + + The returned value should be passed to decode_field_reference for it + to handle l_arg, and then doubled for r_arg. */ + +static int +prepare_xor (tree l_arg, tree *r_arg) +{ + int ret = 0; + + if (!integer_zerop (*r_arg)) + return ret; + + tree exp = l_arg; + + if (tree and_mask = is_binop_p (BIT_AND_EXPR, &exp)) + { + if (TREE_CODE (and_mask) != INTEGER_CST) + return ret; + } + + if (is_binop_p (BIT_XOR_EXPR, &exp)) + { + *r_arg = l_arg; + return 1; + } + + return ret; +} + +/* If EXP is a SSA_NAME whose DEF is a load stmt, set *LOAD to it and + return its RHS, otherwise return EXP. */ + +static tree +follow_load (tree exp, gimple **load) +{ + if (TREE_CODE (exp) == SSA_NAME + && !SSA_NAME_IS_DEFAULT_DEF (exp)) + { + gimple *def = SSA_NAME_DEF_STMT (exp); + if (gimple_assign_load_p (def)) + { + *load = def; + exp = gimple_assign_rhs1 (def); + } + } + + return exp; +} + +/* Subroutine for fold_truth_andor_1: decode a field reference. + + If EXP is a comparison reference, we return the innermost reference. + + *PBITSIZE is set to the number of bits in the reference, *PBITPOS is + set to the starting bit number. + + If the innermost field can be completely contained in a mode-sized + unit, *PMODE is set to that mode. Otherwise, it is set to VOIDmode. + + *PVOLATILEP is set to 1 if the any expression encountered is volatile; + otherwise it is not changed. + + *PUNSIGNEDP is set to the signedness of the field. + + *PREVERSEP is set to the storage order of the field. + + *PMASK is set to the mask used. This is either contained in a + BIT_AND_EXPR or derived from the width of the field. + + *PAND_MASK is set to the mask found in a BIT_AND_EXPR, if any. + + XOR_WHICH is 1 or 2 if EXP was found to be a (possibly masked) + BIT_XOR_EXPR compared with zero. We're to take the first or second + operand thereof if so. It should be zero otherwise. + + *LOAD is set to the load stmt of the innermost reference, if any, + *and NULL otherwise. + + Return 0 if this is not a component reference or is one that we can't + do anything with. */ + +static tree +decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize, + HOST_WIDE_INT *pbitpos, machine_mode *pmode, + int *punsignedp, int *preversep, int *pvolatilep, + tree *pmask, tree *pand_mask, int xor_which, + gimple **load) +{ + tree exp = *exp_; + tree outer_type = 0; + tree and_mask = 0; + tree mask, inner, offset; + tree unsigned_type; + unsigned int precision; + HOST_WIDE_INT shiftrt = 0; + + *load = NULL; + + /* All the optimizations using this function assume integer fields. + There are problems with FP fields since the type_for_size call + below can fail for, e.g., XFmode. */ + if (! INTEGRAL_TYPE_P (TREE_TYPE (exp))) + return NULL_TREE; + + /* We are interested in the bare arrangement of bits, so strip everything + that doesn't affect the machine mode. However, record the type of the + outermost expression if it may matter below. */ + outer_type = is_cast_p (&exp); + + if ((and_mask = is_binop_p (BIT_AND_EXPR, &exp))) + { + if (TREE_CODE (and_mask) != INTEGER_CST) + return NULL_TREE; + } + + if (xor_which) + { + tree op2 = is_binop_p (BIT_XOR_EXPR, &exp); + gcc_checking_assert (op2); + if (xor_which > 1) + exp = op2; + } + + if (tree t = is_cast_p (&exp)) + if (!outer_type) + outer_type = t; + + if (tree shift = is_binop_p (RSHIFT_EXPR, &exp)) + { + if (TREE_CODE (shift) != INTEGER_CST || !tree_fits_shwi_p (shift)) + return NULL_TREE; + shiftrt = tree_to_shwi (shift); + if (shiftrt <= 0) + return NULL_TREE; + } + + if (tree t = is_cast_p (&exp)) + if (!outer_type) + outer_type = t; + + exp = follow_load (exp, load); + + poly_int64 poly_bitsize, poly_bitpos; + inner = get_inner_reference (exp, &poly_bitsize, &poly_bitpos, &offset, + pmode, punsignedp, preversep, pvolatilep); + + if ((inner == exp && and_mask == 0) + || !poly_bitsize.is_constant (pbitsize) + || !poly_bitpos.is_constant (pbitpos) + || *pbitsize <= shiftrt + || offset != 0 + || TREE_CODE (inner) == PLACEHOLDER_EXPR + /* Reject out-of-bound accesses (PR79731). */ + || (! AGGREGATE_TYPE_P (TREE_TYPE (inner)) + && compare_tree_int (TYPE_SIZE (TREE_TYPE (inner)), + *pbitpos + *pbitsize) < 0)) + return NULL_TREE; + + if (shiftrt) + { + if (!*preversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + *pbitpos += shiftrt; + *pbitsize -= shiftrt; + } + + if (outer_type && *pbitsize > TYPE_PRECISION (outer_type)) + { + HOST_WIDE_INT excess = *pbitsize - TYPE_PRECISION (outer_type); + if (*preversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + *pbitpos += excess; + *pbitsize -= excess; + } + + unsigned_type = lang_hooks.types.type_for_size (*pbitsize, 1); + if (unsigned_type == NULL_TREE) + return NULL_TREE; + + *exp_ = exp; + + /* If the number of bits in the reference is the same as the bitsize of + the outer type, then the outer type gives the signedness. Otherwise + (in case of a small bitfield) the signedness is unchanged. */ + if (outer_type && *pbitsize == TYPE_PRECISION (outer_type)) + *punsignedp = TYPE_UNSIGNED (outer_type); + + /* Compute the mask to access the bitfield. */ + precision = TYPE_PRECISION (unsigned_type); + + mask = build_int_cst_type (unsigned_type, -1); + + mask = int_const_binop (LSHIFT_EXPR, mask, size_int (precision - *pbitsize)); + mask = int_const_binop (RSHIFT_EXPR, mask, size_int (precision - *pbitsize)); + + /* Merge it with the mask we found in the BIT_AND_EXPR, if any. */ + if (and_mask != 0) + mask = fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type, + fold_convert_loc (loc, unsigned_type, and_mask), mask); + + *pmask = mask; + *pand_mask = and_mask; + return inner; +} + +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P + bit value. Arrange things so the extra bits will be set to zero if and + only if C is signed-extended to its full width. If MASK is nonzero, + it is an INTEGER_CST that should be AND'ed with the extra bits. */ + +static tree +unextend (tree c, int p, int unsignedp, tree mask) +{ + tree type = TREE_TYPE (c); + int modesize = GET_MODE_BITSIZE (SCALAR_INT_TYPE_MODE (type)); + tree temp; + + if (p == modesize || unsignedp) + return c; + + /* We work by getting just the sign bit into the low-order bit, then + into the high-order bit, then sign-extend. We then XOR that value + with C. */ + temp = build_int_cst (TREE_TYPE (c), + wi::extract_uhwi (wi::to_wide (c), p - 1, 1)); + + /* We must use a signed type in order to get an arithmetic right shift. + However, we must also avoid introducing accidental overflows, so that + a subsequent call to integer_zerop will work. Hence we must + do the type conversion here. At this point, the constant is either + zero or one, and the conversion to a signed type can never overflow. + We could get an overflow if this conversion is done anywhere else. */ + if (TYPE_UNSIGNED (type)) + temp = fold_convert (signed_type_for (type), temp); + + temp = int_const_binop (LSHIFT_EXPR, temp, size_int (modesize - 1)); + temp = int_const_binop (RSHIFT_EXPR, temp, size_int (modesize - p - 1)); + if (mask != 0) + temp = int_const_binop (BIT_AND_EXPR, temp, + fold_convert (TREE_TYPE (c), mask)); + /* If necessary, convert the type back to match the type of C. */ + if (TYPE_UNSIGNED (type)) + temp = fold_convert (type, temp); + + return fold_convert (type, int_const_binop (BIT_XOR_EXPR, c, temp)); +} + +/* Return the one bitpos within bit extents L or R that is at an + ALIGN-bit alignment boundary, or -1 if there is more than one such + boundary, if there isn't any, or if there is any such boundary + between the extents. L and R are given by bitpos and bitsize. If + it doesn't return -1, there are two consecutive ALIGN-bit words + that contain both extents, and at least one of the extents + straddles across the returned alignment boundary. */ + +static inline HOST_WIDE_INT +compute_split_boundary_from_align (HOST_WIDE_INT align, + HOST_WIDE_INT l_bitpos, + HOST_WIDE_INT l_bitsize, + HOST_WIDE_INT r_bitpos, + HOST_WIDE_INT r_bitsize) +{ + HOST_WIDE_INT amask = ~(align - 1); + + HOST_WIDE_INT first_bit = MIN (l_bitpos, r_bitpos); + HOST_WIDE_INT end_bit = MAX (l_bitpos + l_bitsize, r_bitpos + r_bitsize); + + HOST_WIDE_INT boundary = (end_bit - 1) & amask; + + /* Make sure we're crossing no more than one alignment boundary. + + ??? We don't have logic to recombine loads of two adjacent + fields that each crosses a different alignment boundary, so + as to load the middle word only once, if other words can't be + otherwise recombined. */ + if (boundary - first_bit > align) + return -1; + + HOST_WIDE_INT l_start_word = l_bitpos & amask; + HOST_WIDE_INT l_end_word = (l_bitpos + l_bitsize - 1) & amask; + + HOST_WIDE_INT r_start_word = r_bitpos & amask; + HOST_WIDE_INT r_end_word = (r_bitpos + r_bitsize - 1) & amask; + + /* If neither field straddles across an alignment boundary, it's no + use to even try to merge them. */ + if (l_start_word == l_end_word && r_start_word == r_end_word) + return -1; + + return boundary; +} + +/* Make a bit_field_ref. If POINT is NULL, return the BIT_FIELD_REF. + Otherwise, build and insert a load stmt before POINT, and return + the SSA_NAME. ??? Rewrite LOAD in terms of the bitfield? */ + +static tree +make_bit_field_load (location_t loc, tree inner, tree orig_inner, tree type, + HOST_WIDE_INT bitsize, poly_int64 bitpos, + int unsignedp, int reversep, gimple *point) +{ + tree ref = make_bit_field_ref (loc, unshare_expr (inner), + unshare_expr (orig_inner), + type, bitsize, bitpos, + unsignedp, reversep); + if (!point) + return ref; + + gimple_stmt_iterator gsi = gsi_for_stmt (point); + return force_gimple_operand_gsi (&gsi, ref, + true, NULL_TREE, + true, GSI_SAME_STMT); +} + +/* Initialize ln_arg[0] and ln_arg[1] to a pair of newly-created (at + LOC) loads from INNER (from ORIG_INNER), of modes MODE and MODE2, + respectively, starting at BIT_POS, using reversed endianness if + REVERSEP. Also initialize BITPOS (the starting position of each + part into INNER), BITSIZ (the bit count starting at BITPOS), + TOSHIFT[1] (the amount by which the part and its mask are to be + shifted right to bring its least-significant bit to bit zero) and + SHIFTED (the amount by which the part, by separate loading, has + already been shifted right, but that the mask needs shifting to + match). */ + +static inline void +build_split_load (tree /* out */ ln_arg[2], + HOST_WIDE_INT /* out */ bitpos[2], + HOST_WIDE_INT /* out */ bitsiz[2], + HOST_WIDE_INT /* in[0] out[0..1] */ toshift[2], + HOST_WIDE_INT /* out */ shifted[2], + location_t loc, tree inner, tree orig_inner, + scalar_int_mode mode, scalar_int_mode mode2, + HOST_WIDE_INT bit_pos, bool reversep, + gimple *point[2]) +{ + bitsiz[0] = GET_MODE_BITSIZE (mode); + bitsiz[1] = GET_MODE_BITSIZE (mode2); + + for (int i = 0; i < 2; i++) + { + tree type = lang_hooks.types.type_for_size (bitsiz[i], 1); + bitpos[i] = bit_pos; + ln_arg[i] = make_bit_field_load (loc, inner, orig_inner, + type, bitsiz[i], + bit_pos, 1, reversep, point[i]); + bit_pos += bitsiz[i]; + } + + toshift[1] = toshift[0]; + if (reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + shifted[0] = bitsiz[1]; + shifted[1] = 0; + toshift[0] = 0; + } + else + { + shifted[1] = bitsiz[0]; + shifted[0] = 0; + toshift[1] = 0; + } +} + +/* Make arrangements to split at bit BOUNDARY a single loaded word + (with REVERSEP bit order) LN_ARG[0], to be shifted right by + TOSHIFT[0] to bring the field of interest to the least-significant + bit. The expectation is that the same loaded word will be + propagated from part 0 to part 1, with just different shifting and + masking to extract both parts. MASK is not expected to do more + than masking out the bits that belong to the other part. See + build_split_load for more information on the other fields. */ + +static inline void +reuse_split_load (tree /* in[0] out[1] */ ln_arg[2], + HOST_WIDE_INT /* in[0] out[1] */ bitpos[2], + HOST_WIDE_INT /* in[0] out[1] */ bitsiz[2], + HOST_WIDE_INT /* in[0] out[0..1] */ toshift[2], + HOST_WIDE_INT /* out */ shifted[2], + tree /* out */ mask[2], + HOST_WIDE_INT boundary, bool reversep) +{ + ln_arg[1] = ln_arg[0]; + bitpos[1] = bitpos[0]; + bitsiz[1] = bitsiz[0]; + shifted[1] = shifted[0] = 0; + + tree basemask = build_int_cst_type (TREE_TYPE (ln_arg[0]), -1); + + if (reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + toshift[1] = toshift[0]; + toshift[0] = bitpos[0] + bitsiz[0] - boundary; + mask[0] = int_const_binop (LSHIFT_EXPR, basemask, + bitsize_int (toshift[0])); + mask[1] = int_const_binop (BIT_XOR_EXPR, basemask, mask[0]); + } + else + { + toshift[1] = boundary - bitpos[1]; + mask[1] = int_const_binop (LSHIFT_EXPR, basemask, + bitsize_int (toshift[1])); + mask[0] = int_const_binop (BIT_XOR_EXPR, basemask, mask[1]); + } +} + +/* Return nonzero if LSTMT and RSTMT, assumed to load from the same + words, can be safely merged into a single load. Return zero if + there are intervening stores, or if neither stmt dominates the + other. If they are mergeable, return -1 if the merged load should + be inserted before LSTMT to dominate both, otherwise return +1. */ + +static int +mergeable_loads_p (gimple *lstmt, gimple *rstmt) +{ + gimple *stmts[2] = { lstmt, rstmt }; + int ret; + + if (stmt_dominates_stmt_p (lstmt, rstmt)) + { + ret = -1; + stmts[0] = lstmt; + stmts[1] = rstmt; + } + else if (stmt_dominates_stmt_p (rstmt, lstmt)) + { + ret = 1; + stmts[1] = lstmt; + stmts[0] = rstmt; + } + else + return 0; + + basic_block bbs[2] = { gimple_bb (stmts[0]), gimple_bb (stmts[1]) }; + if (bbs[0] == bbs[1]) + { + for (gimple_stmt_iterator gsi = gsi_for_stmt (stmts[0]); + gsi_stmt (gsi) != stmts[1]; gsi_next (&gsi)) + if (gimple_vdef (gsi_stmt (gsi))) + return 0; + return ret; + } + + for (gimple_stmt_iterator gsi = gsi_for_stmt (stmts[0]); + !gsi_end_p (gsi); gsi_next (&gsi)) + if (gimple_vdef (gsi_stmt (gsi))) + return 0; + + for (gphi_iterator gpi = gsi_start_phis (bbs[1]); + !gsi_end_p (gpi); gsi_next (&gpi)) + if (gimple_vdef (gsi_stmt (gpi))) + return 0; + + for (gimple_stmt_iterator gsi = gsi_for_stmt (stmts[1]); + !gsi_end_p (gsi); gsi_prev (&gsi)) + if (gimple_vdef (gsi_stmt (gsi))) + return 0; + + for (basic_block bb = bbs[1]; bb != bbs[0]; ) + { + /* ??? For now, stop if any visited block has more than one + predecessor. */ + if (!single_pred_p (bb)) + return 0; + + bb = single_pred (bb); + + for (gphi_iterator gpi = gsi_start_phis (bbs[1]); + !gsi_end_p (gpi); gsi_next (&gpi)) + if (gimple_vdef (gsi_stmt (gpi))) + return 0; + + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); + !gsi_end_p (gsi); gsi_next (&gsi)) + if (gimple_vdef (gsi_stmt (gsi))) + return 0; + } + + return ret; +} + +/* Find ways of folding logical expressions of LHS and RHS: + Try to merge two comparisons to the same innermost item. + Look for range tests like "ch >= '0' && ch <= '9'". + Look for combinations of simple terms on machines with expensive branches + and evaluate the RHS unconditionally. + + For example, if we have p->a == 2 && p->b == 4 and we can make an + object large enough to span both A and B, we can do this with a comparison + against the object ANDed with the a mask. + + If we have p->a == q->a && p->b == q->b, we may be able to use bit masking + operations to do this with one comparison. + + We check for both normal comparisons and the BIT_AND_EXPRs made this by + function and the one above. + + CODE is the logical operation being done. It can be TRUTH_ANDIF_EXPR, + TRUTH_AND_EXPR, TRUTH_ORIF_EXPR, or TRUTH_OR_EXPR. + + TRUTH_TYPE is the type of the logical operand and LHS and RHS are its + two operands. + + SEPARATEP should be NULL if LHS and RHS are adjacent in + CODE-chained compares, and a non-NULL pointer to NULL_TREE + otherwise. If the "words" accessed by RHS are already accessed by + LHS, this won't matter, but if RHS accesses "words" that LHS + doesn't, then *SEPARATEP will be set to the compares that should + take RHS's place. By "words" we mean contiguous bits that do not + cross a an TYPE_ALIGN boundary of the accessed object's type. + + We return the simplified tree or 0 if no optimization is possible. */ + +tree +fold_truth_andor_maybe_separate (location_t loc, + enum tree_code code, tree truth_type, + enum tree_code lcode, tree ll_arg, tree lr_arg, + enum tree_code rcode, tree rl_arg, tree rr_arg, + tree *separatep) +{ + /* If this is the "or" of two comparisons, we can do something if + the comparisons are NE_EXPR. If this is the "and", we can do something + if the comparisons are EQ_EXPR. I.e., + (a->b == 2 && a->c == 4) can become (a->new == NEW). + + WANTED_CODE is this operation code. For single bit fields, we can + convert EQ_EXPR to NE_EXPR so we need not reject the "wrong" + comparison for one-bit fields. */ + + enum tree_code orig_code = code; + enum tree_code wanted_code; + tree ll_inner, lr_inner, rl_inner, rr_inner; + gimple *ll_load, *lr_load, *rl_load, *rr_load; + int l_mergeable = 0, r_mergeable = 0; + HOST_WIDE_INT ll_bitsize, ll_bitpos, lr_bitsize, lr_bitpos; + HOST_WIDE_INT rl_bitsize, rl_bitpos, rr_bitsize, rr_bitpos; + HOST_WIDE_INT xll_bitpos, xlr_bitpos, xrl_bitpos, xrr_bitpos; + HOST_WIDE_INT lnbitsize, lnbitpos, rnbitsize, rnbitpos; + int ll_unsignedp, lr_unsignedp, rl_unsignedp, rr_unsignedp; + int ll_reversep, lr_reversep, rl_reversep, rr_reversep; + machine_mode ll_mode, lr_mode, rl_mode, rr_mode; + scalar_int_mode lnmode, lnmode2, rnmode; + tree ll_mask, lr_mask, rl_mask, rr_mask; + tree ll_and_mask, lr_and_mask, rl_and_mask, rr_and_mask; + tree l_const, r_const; + tree lntype, rntype, result; + HOST_WIDE_INT first_bit, end_bit; + int volatilep; + bool l_split_load; + + gcc_checking_assert (!separatep || !*separatep); + + /* Start by getting the comparison codes. Fail if anything is volatile. + If one operand is a BIT_AND_EXPR with the constant one, treat it as if + it were surrounded with a NE_EXPR. */ + + if (TREE_SIDE_EFFECTS (ll_arg) || TREE_SIDE_EFFECTS (lr_arg) + || TREE_SIDE_EFFECTS (rl_arg) || TREE_SIDE_EFFECTS (rr_arg)) + return 0; + + if (TREE_CODE_CLASS (lcode) != tcc_comparison + || TREE_CODE_CLASS (rcode) != tcc_comparison) + return 0; + + code = ((code == TRUTH_AND_EXPR || code == TRUTH_ANDIF_EXPR) + ? TRUTH_AND_EXPR : TRUTH_OR_EXPR); + + bool lsignbit = false, rsignbit = false; + if ((lcode == LT_EXPR || lcode == GE_EXPR) + && integer_zerop (lr_arg) + && INTEGRAL_TYPE_P (TREE_TYPE (ll_arg)) + && !TYPE_UNSIGNED (TREE_TYPE (ll_arg))) + { + lsignbit = true; + lcode = (lcode == LT_EXPR ? NE_EXPR : EQ_EXPR); + } + if ((rcode == LT_EXPR || rcode == GE_EXPR) + && integer_zerop (rr_arg) + && INTEGRAL_TYPE_P (TREE_TYPE (rl_arg)) + && !TYPE_UNSIGNED (TREE_TYPE (rl_arg))) + { + rsignbit = true; + rcode = (rcode == LT_EXPR ? NE_EXPR : EQ_EXPR); + } + + /* See if the comparisons can be merged. Then get all the parameters for + each side. */ + + if ((lcode != EQ_EXPR && lcode != NE_EXPR) + || (rcode != EQ_EXPR && rcode != NE_EXPR)) + return 0; + + ll_reversep = lr_reversep = rl_reversep = rr_reversep = 0; + volatilep = 0; + int l_xor = prepare_xor (ll_arg, &lr_arg); + ll_inner = decode_field_reference (loc, &ll_arg, + &ll_bitsize, &ll_bitpos, &ll_mode, + &ll_unsignedp, &ll_reversep, &volatilep, + &ll_mask, &ll_and_mask, l_xor, + &ll_load); + lr_inner = decode_field_reference (loc, &lr_arg, + &lr_bitsize, &lr_bitpos, &lr_mode, + &lr_unsignedp, &lr_reversep, &volatilep, + &lr_mask, &lr_and_mask, 2 * l_xor, + &lr_load); + int r_xor = prepare_xor (rl_arg, &rr_arg); + rl_inner = decode_field_reference (loc, &rl_arg, + &rl_bitsize, &rl_bitpos, &rl_mode, + &rl_unsignedp, &rl_reversep, &volatilep, + &rl_mask, &rl_and_mask, r_xor, + &rl_load); + rr_inner = decode_field_reference (loc, &rr_arg, + &rr_bitsize, &rr_bitpos, &rr_mode, + &rr_unsignedp, &rr_reversep, &volatilep, + &rr_mask, &rr_and_mask, 2 * r_xor, + &rr_load); + + /* It must be true that the inner operation on the lhs of each + comparison must be the same if we are to be able to do anything. + Then see if we have constants. If not, the same must be true for + the rhs's. */ + if (volatilep + || ll_reversep != rl_reversep + || ll_inner == 0 || rl_inner == 0 + || ! operand_equal_p (ll_inner, rl_inner, 0) + || (ll_load && rl_load + && ! (l_mergeable = mergeable_loads_p (ll_load, rl_load)))) + return 0; + + if (TREE_CODE (lr_arg) == INTEGER_CST + && TREE_CODE (rr_arg) == INTEGER_CST) + { + l_const = lr_arg, r_const = rr_arg; + lr_reversep = ll_reversep; + } + else if (lr_reversep != rr_reversep + || lr_inner == 0 || rr_inner == 0 + || ! operand_equal_p (lr_inner, rr_inner, 0) + || (lr_load && rr_load + && ! (r_mergeable = mergeable_loads_p (lr_load, rr_load)))) + return 0; + else + l_const = r_const = 0; + + if (lsignbit) + { + tree mask = build_int_cst_type (TREE_TYPE (ll_arg), -1); + tree sign = int_const_binop (LSHIFT_EXPR, mask, + bitsize_int (ll_bitsize - 1)); + if (!ll_mask) + ll_mask = sign; + else + ll_mask = int_const_binop (BIT_AND_EXPR, ll_mask, sign); + if (!ll_and_mask) + ll_and_mask = sign; + else + ll_and_mask = int_const_binop (BIT_AND_EXPR, ll_and_mask, sign); + } + + if (rsignbit) + { + tree mask = build_int_cst_type (TREE_TYPE (rl_arg), -1); + tree sign = int_const_binop (LSHIFT_EXPR, mask, + bitsize_int (rl_bitsize - 1)); + if (!rl_mask) + rl_mask = sign; + else + rl_mask = int_const_binop (BIT_AND_EXPR, rl_mask, sign); + if (!rl_and_mask) + rl_and_mask = sign; + else + rl_and_mask = int_const_binop (BIT_AND_EXPR, rl_and_mask, sign); + } + + /* If either comparison code is not correct for our logical operation, + fail. However, we can convert a one-bit comparison against zero into + the opposite comparison against that bit being set in the field. */ + + wanted_code = (code == TRUTH_AND_EXPR ? EQ_EXPR : NE_EXPR); + if (lcode != wanted_code) + { + if (l_const && integer_zerop (l_const) && integer_pow2p (ll_mask)) + { + /* Make the left operand unsigned, since we are only interested + in the value of one bit. Otherwise we are doing the wrong + thing below. */ + ll_unsignedp = 1; + l_const = ll_mask; + } + else + return 0; + } + + /* This is analogous to the code for l_const above. */ + if (rcode != wanted_code) + { + if (r_const && integer_zerop (r_const) && integer_pow2p (rl_mask)) + { + rl_unsignedp = 1; + r_const = rl_mask; + } + else + return 0; + } + + /* This will be bumped to 2 if any of the field pairs crosses an + alignment boundary, so the merged compare has to be done in two + parts. */ + int parts = 1; + /* Set to true if the second combined compare should come first, + e.g., because the second original compare accesses a word that + the first one doesn't, and the combined compares access those in + cmp[0]. */ + bool first1 = false; + /* Set to true if the first original compare is not the one being + split. */ + bool maybe_separate = false; + + /* The following 2-dimensional arrays use the first index to + identify left(0)- vs right(1)-hand compare operands, and the + second one to identify merged compare parts. */ + /* The memory loads or constants to be compared. */ + tree ld_arg[2][2]; + /* The first bit of the corresponding inner object that the + corresponding LD_ARG covers. */ + HOST_WIDE_INT bitpos[2][2]; + /* The bit count starting at BITPOS that the corresponding LD_ARG + covers. */ + HOST_WIDE_INT bitsiz[2][2]; + /* The number of bits by which LD_ARG has already been shifted + right, WRT mask. */ + HOST_WIDE_INT shifted[2][2]; + /* The number of bits by which both LD_ARG and MASK need shifting to + bring its least-significant bit to bit zero. */ + HOST_WIDE_INT toshift[2][2]; + /* An additional mask to be applied to LD_ARG, to remove any bits + that may have been loaded for use in another compare, but that + don't belong in the corresponding compare. */ + tree xmask[2][2] = {}; + + /* The combined compare or compares. */ + tree cmp[2]; + + /* Consider we're comparing two non-contiguous fields of packed + structs, both aligned at 32-bit boundaries: + + ll_arg: an 8-bit field at offset 0 + lr_arg: a 16-bit field at offset 2 + + rl_arg: an 8-bit field at offset 1 + rr_arg: a 16-bit field at offset 3 + + We'll have r_split_load, because rr_arg straddles across an + alignment boundary. + + We'll want to have: + + bitpos = { { 0, 0 }, { 0, 32 } } + bitsiz = { { 32, 32 }, { 32, 8 } } + + And, for little-endian: + + shifted = { { 0, 0 }, { 0, 32 } } + toshift = { { 0, 24 }, { 0, 0 } } + + Or, for big-endian: + + shifted = { { 0, 0 }, { 8, 0 } } + toshift = { { 8, 0 }, { 0, 0 } } + */ + + /* See if we can find a mode that contains both fields being compared on + the left. If we can't, fail. Otherwise, update all constants and masks + to be relative to a field of that size. */ + first_bit = MIN (ll_bitpos, rl_bitpos); + end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); + if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, + TYPE_ALIGN (TREE_TYPE (ll_inner)), BITS_PER_WORD, + volatilep, &lnmode)) + { + /* Consider the possibility of recombining loads if any of the + fields straddles across an alignment boundary, so that either + part can be loaded along with the other field. */ + HOST_WIDE_INT align = TYPE_ALIGN (TREE_TYPE (ll_inner)); + HOST_WIDE_INT boundary = compute_split_boundary_from_align + (align, ll_bitpos, ll_bitsize, rl_bitpos, rl_bitsize); + + if (boundary < 0 + || !get_best_mode (boundary - first_bit, first_bit, 0, 0, + align, BITS_PER_WORD, volatilep, &lnmode) + || !get_best_mode (end_bit - boundary, boundary, 0, 0, + align, BITS_PER_WORD, volatilep, &lnmode2)) + return 0; + + l_split_load = true; + parts = 2; + if (ll_bitpos >= boundary) + maybe_separate = first1 = true; + else if (ll_bitpos + ll_bitsize <= boundary) + maybe_separate = true; + } + else + l_split_load = false; + + lnbitsize = GET_MODE_BITSIZE (lnmode); + lnbitpos = first_bit & ~ (lnbitsize - 1); + if (l_split_load) + lnbitsize += GET_MODE_BITSIZE (lnmode2); + lntype = lang_hooks.types.type_for_size (lnbitsize, 1); + if (!lntype) + { + gcc_checking_assert (l_split_load); + lntype = build_nonstandard_integer_type (lnbitsize, 1); + } + xll_bitpos = ll_bitpos - lnbitpos, xrl_bitpos = rl_bitpos - lnbitpos; + + if (ll_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + xll_bitpos = lnbitsize - xll_bitpos - ll_bitsize; + xrl_bitpos = lnbitsize - xrl_bitpos - rl_bitsize; + } + + ll_mask = int_const_binop (LSHIFT_EXPR, + fold_convert_loc (loc, lntype, ll_mask), + size_int (xll_bitpos)); + if (!ll_mask) + return 0; + rl_mask = int_const_binop (LSHIFT_EXPR, + fold_convert_loc (loc, lntype, rl_mask), + size_int (xrl_bitpos)); + if (!rl_mask) + return 0; + + if (l_const) + { + l_const = fold_convert_loc (loc, lntype, l_const); + l_const = unextend (l_const, ll_bitsize, ll_unsignedp, ll_and_mask); + l_const = int_const_binop (LSHIFT_EXPR, l_const, size_int (xll_bitpos)); + if (! integer_zerop (int_const_binop (BIT_AND_EXPR, l_const, + fold_build1_loc (loc, BIT_NOT_EXPR, + lntype, ll_mask)))) + { + warning (0, "comparison is always %d", wanted_code == NE_EXPR); + + return constant_boolean_node (wanted_code == NE_EXPR, truth_type); + } + } + if (r_const) + { + r_const = fold_convert_loc (loc, lntype, r_const); + r_const = unextend (r_const, rl_bitsize, rl_unsignedp, rl_and_mask); + r_const = int_const_binop (LSHIFT_EXPR, r_const, size_int (xrl_bitpos)); + if (! integer_zerop (int_const_binop (BIT_AND_EXPR, r_const, + fold_build1_loc (loc, BIT_NOT_EXPR, + lntype, rl_mask)))) + { + warning (0, "comparison is always %d", wanted_code == NE_EXPR); + + return constant_boolean_node (wanted_code == NE_EXPR, truth_type); + } + } + + /* If the right sides are not constant, do the same for it. Also, + disallow this optimization if a size, signedness or storage order + mismatch occurs between the left and right sides. */ + if (l_const == 0) + { + if (ll_bitsize != lr_bitsize || rl_bitsize != rr_bitsize + || ll_unsignedp != lr_unsignedp || rl_unsignedp != rr_unsignedp + || ll_reversep != lr_reversep + /* Make sure the two fields on the right + correspond to the left without being swapped. */ + || ll_bitpos - rl_bitpos != lr_bitpos - rr_bitpos + || lnbitpos < 0) + return 0; + + bool r_split_load; + scalar_int_mode rnmode2; + + first_bit = MIN (lr_bitpos, rr_bitpos); + end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize); + if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0, + TYPE_ALIGN (TREE_TYPE (lr_inner)), BITS_PER_WORD, + volatilep, &rnmode)) + { + /* Consider the possibility of recombining loads if any of the + fields straddles across an alignment boundary, so that either + part can be loaded along with the other field. */ + HOST_WIDE_INT align = TYPE_ALIGN (TREE_TYPE (lr_inner)); + HOST_WIDE_INT boundary = compute_split_boundary_from_align + (align, lr_bitpos, lr_bitsize, rr_bitpos, rr_bitsize); + + if (boundary < 0 + /* If we're to split both, make sure the split point is + the same. */ + || (l_split_load + && (boundary - lr_bitpos + != (lnbitpos + GET_MODE_BITSIZE (lnmode)) - ll_bitpos)) + || !get_best_mode (boundary - first_bit, first_bit, 0, 0, + align, BITS_PER_WORD, volatilep, &rnmode) + || !get_best_mode (end_bit - boundary, boundary, 0, 0, + align, BITS_PER_WORD, volatilep, &rnmode2)) + return 0; + + r_split_load = true; + parts = 2; + if (lr_bitpos >= boundary) + maybe_separate = first1 = true; + else if (lr_bitpos + lr_bitsize <= boundary) + maybe_separate = true; + } + else + r_split_load = false; + + rnbitsize = GET_MODE_BITSIZE (rnmode); + rnbitpos = first_bit & ~ (rnbitsize - 1); + if (r_split_load) + rnbitsize += GET_MODE_BITSIZE (rnmode2); + rntype = lang_hooks.types.type_for_size (rnbitsize, 1); + if (!rntype) + { + gcc_checking_assert (r_split_load); + rntype = build_nonstandard_integer_type (rnbitsize, 1); + } + xlr_bitpos = lr_bitpos - rnbitpos, xrr_bitpos = rr_bitpos - rnbitpos; + + if (lr_reversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN) + { + xlr_bitpos = rnbitsize - xlr_bitpos - lr_bitsize; + xrr_bitpos = rnbitsize - xrr_bitpos - rr_bitsize; + } + + lr_mask = int_const_binop (LSHIFT_EXPR, + fold_convert_loc (loc, rntype, lr_mask), + size_int (xlr_bitpos)); + rr_mask = int_const_binop (LSHIFT_EXPR, + fold_convert_loc (loc, rntype, rr_mask), + size_int (xrr_bitpos)); + + lr_mask = int_const_binop (BIT_IOR_EXPR, lr_mask, rr_mask); + + toshift[1][0] = MIN (xlr_bitpos, xrr_bitpos); + shifted[1][0] = 0; + + if (!r_split_load) + { + bitpos[1][0] = rnbitpos; + bitsiz[1][0] = rnbitsize; + ld_arg[1][0] = make_bit_field_load (loc, lr_inner, lr_arg, + rntype, rnbitsize, rnbitpos, + lr_unsignedp || rr_unsignedp, + lr_reversep, + r_mergeable == 0 + ? NULL + : r_mergeable < 0 + ? lr_load + : rr_load); + } + + if (parts > 1) + { + if (r_split_load) + { + gimple *point[2]; + if (r_mergeable == 0) + point[0] = point[1] = NULL; + else if (r_mergeable < 0) + { + point[0] = lr_load; + point[1] = rr_load; + } + else + { + point[0] = rr_load; + point[1] = lr_load; + } + build_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1], + shifted[1], loc, lr_inner, lr_arg, + rnmode, rnmode2, rnbitpos, lr_reversep, point); + } + else + reuse_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1], + shifted[1], xmask[1], + lnbitpos + GET_MODE_BITSIZE (lnmode) + - ll_bitpos + lr_bitpos, lr_reversep); + } + } + else + { + /* Handle the case of comparisons with constants. If there is + something in common between the masks, those bits of the + constants must be the same. If not, the condition is always + false. Test for this to avoid generating incorrect code + below. */ + result = int_const_binop (BIT_AND_EXPR, ll_mask, rl_mask); + if (! integer_zerop (result) + && simple_cst_equal (int_const_binop (BIT_AND_EXPR, + result, l_const), + int_const_binop (BIT_AND_EXPR, + result, r_const)) != 1) + { + if (wanted_code == NE_EXPR) + { + warning (0, + "% of unmatched not-equal tests" + " is always 1"); + return constant_boolean_node (true, truth_type); + } + else + { + warning (0, + "% of mutually exclusive equal-tests" + " is always 0"); + return constant_boolean_node (false, truth_type); + } + } + + if (lnbitpos < 0) + return 0; + + /* The constants are combined so as to line up with the loaded + field, so use the same parameters. */ + ld_arg[1][0] = int_const_binop (BIT_IOR_EXPR, l_const, r_const); + toshift[1][0] = MIN (xll_bitpos, xrl_bitpos); + shifted[1][0] = 0; + bitpos[1][0] = lnbitpos; + bitsiz[1][0] = lnbitsize; + + if (parts > 1) + reuse_split_load (ld_arg[1], bitpos[1], bitsiz[1], toshift[1], + shifted[1], xmask[1], + lnbitpos + GET_MODE_BITSIZE (lnmode), + lr_reversep); + + lr_mask = build_int_cst_type (TREE_TYPE (ld_arg[1][0]), -1); + + /* If the compiler thinks this is used uninitialized below, it's + because it can't realize that parts can only be 2 when + comparing wiht constants if l_split_load is also true. This + just silences the warning. */ + rnbitpos = 0; + } + + ll_mask = int_const_binop (BIT_IOR_EXPR, ll_mask, rl_mask); + toshift[0][0] = MIN (xll_bitpos, xrl_bitpos); + shifted[0][0] = 0; + + if (!l_split_load) + { + bitpos[0][0] = lnbitpos; + bitsiz[0][0] = lnbitsize; + ld_arg[0][0] = make_bit_field_load (loc, ll_inner, ll_arg, + lntype, lnbitsize, lnbitpos, + ll_unsignedp || rl_unsignedp, + ll_reversep, + l_mergeable == 0 + ? NULL + : l_mergeable < 0 + ? ll_load + : rl_load); + } + + if (parts > 1) + { + if (l_split_load) + { + gimple *point[2]; + if (l_mergeable == 0) + point[0] = point[1] = NULL; + else if (l_mergeable < 0) + { + point[0] = ll_load; + point[1] = rl_load; + } + else + { + point[0] = rl_load; + point[1] = ll_load; + } + build_split_load (ld_arg[0], bitpos[0], bitsiz[0], toshift[0], + shifted[0], loc, ll_inner, ll_arg, + lnmode, lnmode2, lnbitpos, ll_reversep, point); + } + else + reuse_split_load (ld_arg[0], bitpos[0], bitsiz[0], toshift[0], + shifted[0], xmask[0], + rnbitpos + GET_MODE_BITSIZE (rnmode) + - lr_bitpos + ll_bitpos, ll_reversep); + } + + for (int i = 0; i < parts; i++) + { + tree op[2] = { ld_arg[0][i], ld_arg[1][i] }; + tree mask[2] = { ll_mask, lr_mask }; + + for (int j = 0; j < 2; j++) + { + op[j] = unshare_expr (op[j]); + + /* Mask out the bits belonging to the other part. */ + if (xmask[j][i]) + mask[j] = int_const_binop (BIT_AND_EXPR, mask[j], xmask[j][i]); + + if (shifted[j][i]) + { + tree shiftsz = bitsize_int (shifted[j][i]); + mask[j] = int_const_binop (RSHIFT_EXPR, mask[j], shiftsz); + } + mask[j] = fold_convert_loc (loc, TREE_TYPE (op[j]), mask[j]); + } + + HOST_WIDE_INT shift = (toshift[0][i] - toshift[1][i]); + + if (shift) + { + int j; + if (shift > 0) + j = 0; + else + { + j = 1; + shift = -shift; + } + + tree shiftsz = bitsize_int (shift); + op[j] = fold_build2_loc (loc, RSHIFT_EXPR, TREE_TYPE (op[j]), + op[j], shiftsz); + mask[j] = int_const_binop (RSHIFT_EXPR, mask[j], shiftsz); + } + + /* Convert to the smaller type before masking out unwanted + bits. */ + tree type = TREE_TYPE (op[0]); + if (type != TREE_TYPE (op[1])) + { + int j = (TYPE_PRECISION (type) + < TYPE_PRECISION (TREE_TYPE (op[1]))); + if (!j) + type = TREE_TYPE (op[1]); + op[j] = fold_convert_loc (loc, type, op[j]); + mask[j] = fold_convert_loc (loc, type, mask[j]); + } + + for (int j = 0; j < 2; j++) + if (! integer_all_onesp (mask[j])) + op[j] = build2_loc (loc, BIT_AND_EXPR, type, + op[j], mask[j]); + + cmp[i] = build2_loc (loc, wanted_code, truth_type, op[0], op[1]); + } + + if (first1) + std::swap (cmp[0], cmp[1]); + + if (parts == 1) + result = cmp[0]; + else if (!separatep || !maybe_separate) + result = build2_loc (loc, orig_code, truth_type, cmp[0], cmp[1]); + else + { + result = cmp[0]; + *separatep = cmp[1]; + } + + return result; +} + /* Try to simplify the AND of two comparisons, specified by (OP1A CODE1 OP1B) and (OP2B CODE2 OP2B), respectively. If this can be simplified to a single expression (without requiring diff --git a/gcc/testsuite/gcc.dg/field-merge-1.c b/gcc/testsuite/gcc.dg/field-merge-1.c new file mode 100644 index 0000000000000..1818e104437e1 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-1.c @@ -0,0 +1,64 @@ +/* { dg-do run } */ +/* { dg-options "-O -save-temps -fdump-tree-optimized" } */ + +/* Check that field loads compared with constants are merged, even if + tested out of order, and when fields straddle across alignment + boundaries. */ + +struct TL { + unsigned char p; + unsigned int a; + unsigned char q; + unsigned int b; + unsigned char r; + unsigned int c; + unsigned char s; +} __attribute__ ((packed, aligned (4), scalar_storage_order ("little-endian"))); + +struct TB { + unsigned char p; + unsigned int a; + unsigned char q; + unsigned int b; + unsigned char r; + unsigned int c; + unsigned char s; +} __attribute__ ((packed, aligned (4), scalar_storage_order ("big-endian"))); + +#define vc 0xaa +#define vi 0x12345678 + +struct TL vL = { vc, vi, vc, vi, vc, vi, vc }; +struct TB vB = { vc, vi, vc, vi, vc, vi, vc }; + +void f (void) { + /* Which words of | vL | vB | */ + /* are accessed by |0123|0123| */ + if (0 /* the tests? | | | */ + || vL.p != vc /* |* | | */ + || vB.p != vc /* | |* | */ + || vL.s != vc /* | *| | */ + || vB.q != vc /* | | * | */ + || vL.a != vi /* |^* | | */ + || vB.b != vi /* | | ^* | */ + || vL.c != vi /* | *^| | */ + || vB.c != vi /* | | ^*| */ + || vL.b != vi /* | ^^ | | */ + || vL.q != vc /* | ^ | | */ + || vB.a != vi /* | |^^ | */ + || vB.r != vc /* | | ^ | */ + || vB.s != vc /* | | ^| */ + || vL.r != vc /* | ^ | | */ + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} + +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 8 "optimized" } } */ +/* { dg-final { scan-assembler-not "cmpb" { target { i*86-*-* || x86_64-*-* } } } } */ +/* { dg-final { scan-assembler-times "cmpl" 8 { target { i*86-*-* || x86_64-*-* } } } } */ +/* { dg-final { scan-assembler-times "cmpw" 8 { target { powerpc*-*-* || rs6000-*-* } } } } */ diff --git a/gcc/testsuite/gcc.dg/field-merge-2.c b/gcc/testsuite/gcc.dg/field-merge-2.c new file mode 100644 index 0000000000000..80c573b31ddc7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-2.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +struct TL { + unsigned short a; + unsigned short b; +} __attribute__ ((packed, aligned (8))); + +struct TB { + unsigned char p; + unsigned short a; + unsigned short b; +} __attribute__ ((packed, aligned (8))); + +#define vc 0xaa + +struct TL vL = { vc, vc }; +struct TB vB = { vc, vc, vc }; + +void f (void) { + if (0 + || vL.b != vB.b + || vL.a != vB.a + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-3.c b/gcc/testsuite/gcc.dg/field-merge-3.c new file mode 100644 index 0000000000000..f26e8a96ea04f --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-3.c @@ -0,0 +1,36 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +const int BIG_ENDIAN_P = (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__); + +struct T1 { + unsigned char p[2]; + unsigned short a; + unsigned int z; +} __attribute__((__aligned__(8))); + +struct T2 { + unsigned short p; + unsigned short a; + unsigned int z; +} __attribute__((__aligned__(8))); + +#define vc 0xaa +#define vi 0x12345678 + +struct T1 v1 = { { vc + !BIG_ENDIAN_P, vc + BIG_ENDIAN_P }, vc, vi }; +struct T2 v2 = { (vc << 8) | (vc - 1), vc, vi }; + +void f (void) { + if (0 + || v1.p[!BIG_ENDIAN_P] != v2.p >> 8 + || v1.a != v2.a + || ((v1.z ^ v2.z) & 0xff00ff00) != 0 + || ((v1.z ^ v2.z) & 0x00ff00ff) != 0) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-4.c b/gcc/testsuite/gcc.dg/field-merge-4.c new file mode 100644 index 0000000000000..c629069e52b2c --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-4.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +struct T1 { + unsigned int zn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(4))); + +struct T2 { + unsigned int zn; + unsigned char rn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(4))); + +#define vc 0xaa +#define vs 0xccdd +#define vi 0x12345678 + +struct T1 v1 = { -1, vc, 1, vs, vi }; +struct T2 v2 = { -1, 0, vc, 1, vs, vi }; + +void f (void) { + if (0 + || v1.p != v2.p + || v1.a != v2.a + || v1.z != v2.z + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-5.c b/gcc/testsuite/gcc.dg/field-merge-5.c new file mode 100644 index 0000000000000..1580b14bcc935 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-5.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ + +struct T1 { + unsigned int zn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(8))); + +struct T2 { + unsigned int zn; + unsigned char rn; + unsigned char p; + unsigned char qn; + unsigned short a; + unsigned int z; +} __attribute__((__packed__, __aligned__(8))); + +#define vc 0xaa +#define vs 0xccdd +#define vi 0x12345678 + +struct T1 v1 = { -1, vc, 1, vs, vi }; +struct T2 v2 = { -1, 0, vc, 1, vs, vi }; + +void f (void) { + if (0 + || v1.p != v2.p + || v1.a != v2.a + || v1.z != v2.z + ) + __builtin_abort (); +} + +int main () { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-6.c b/gcc/testsuite/gcc.dg/field-merge-6.c new file mode 100644 index 0000000000000..7fd48a138d14a --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-6.c @@ -0,0 +1,26 @@ +/* { dg-do run } */ +/* { dg-options "-O" } */ +/* { dg-shouldfail } */ + +/* Check that the third compare won't be pulled ahead of the second one and + prevent, which would prevent the NULL pointer dereference that should cause + the execution to fail. */ + +struct s { + char a, b; + int *p; +}; + +struct s a = { 0, 1, 0 }; +struct s b = { 0, 0, 0 }; + +int f () { + return (a.a != b.a + || *b.p != *a.p + || a.b != b.b); +} + +int main() { + f (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/field-merge-7.c b/gcc/testsuite/gcc.dg/field-merge-7.c new file mode 100644 index 0000000000000..728a29b6fafa9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/field-merge-7.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ifcombine-details" } */ + +/* Check that the third compare won't be combined with the first one. */ + +struct s { + char a, b; + int p; +}; + +struct s a = { 0, 0, 0 }; +struct s b = { 0, 0, 0 }; + +int f () { + return (a.a != b.a || (a.p != b.p && a.b != b.b)); +} + +int g () { + return (a.a == b.a && (a.p == b.p || a.b == b.b)); +} + +/* { dg-final { scan-tree-dump-not "optimizing" "ifcombine" } } */ +/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "ifcombine" } } */ diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index 6a3bc99190d9e..58222445daf97 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa.h" #include "attribs.h" #include "asan.h" +#include "bitmap.h" #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ @@ -49,6 +50,28 @@ along with GCC; see the file COPYING3. If not see false) >= 2) #endif +/* Return TRUE iff COND is NULL, or the condition in it is constant. */ + +static bool +constant_condition_p (gcond *cond) +{ + if (!cond) + return true; + + return (CONSTANT_CLASS_P (gimple_cond_lhs (cond)) + && CONSTANT_CLASS_P (gimple_cond_rhs (cond))); +} + +/* Return FALSE iff the condition in the COND stmt that ends COND_BB is not + constant. */ + +static bool +constant_condition_p (basic_block cond_bb) +{ + gcond *cond = safe_dyn_cast (*gsi_last_bb (cond_bb)); + return constant_condition_p (cond); +} + /* This pass combines COND_EXPRs to simplify control flow. It currently recognizes bit tests and comparisons in chains that represent logical and or logical or of two COND_EXPRs. @@ -110,6 +133,37 @@ recognize_if_then_else (basic_block cond_bb, return true; } +/* Same as recognize_if_then_else, but check that the condition is not + constant. It is not useful to combine constant conditions. */ + +static bool +recognize_if_then_else_nc (basic_block cond_bb, + basic_block *then_bb, basic_block *else_bb) +{ + return recognize_if_then_else (cond_bb, then_bb, else_bb) + && !constant_condition_p (cond_bb); +} + +/* Same as recognize_if_then_else, but don't associate the blocks with then and + else, check both possibilities. */ + +static bool +recognize_if_succs (basic_block cond_bb, + basic_block succ1, basic_block succ2) +{ + basic_block t, e; + + if (EDGE_COUNT (cond_bb->succs) != 2) + return false; + + /* Find both succs. */ + t = EDGE_SUCC (cond_bb, 0)->dest; + e = EDGE_SUCC (cond_bb, 1)->dest; + + return ((t == succ1 && e == succ2) + || (t == succ2 && e == succ1)); +} + /* Verify if the basic block BB does not have side-effects. Return true in this case, else false. */ @@ -129,7 +183,7 @@ bb_no_side_effects_p (basic_block bb) enum tree_code rhs_code; if (gimple_has_side_effects (stmt) || gimple_could_trap_p (stmt) - || gimple_vuse (stmt) + /* || gimple_vuse (stmt) */ /* We need to rewrite stmts with undefined overflow to use unsigned arithmetic but cannot do so for signed division. */ || ((ass = dyn_cast (stmt)) @@ -356,14 +410,28 @@ recognize_bits_test (gcond *cond, tree *name, tree *bits, bool inv) } -/* Update profile after code in outer_cond_bb was adjusted so - outer_cond_bb has no condition. */ +/* Update profile after code in either outer_cond_bb or inner_cond_bb was + adjusted so that it has no condition. */ static void update_profile_after_ifcombine (basic_block inner_cond_bb, basic_block outer_cond_bb) { - edge outer_to_inner = find_edge (outer_cond_bb, inner_cond_bb); + /* In the following we assume that inner_cond_bb has single predecessor. */ + gcc_assert (single_pred_p (inner_cond_bb)); + + basic_block outer_to_inner_bb = inner_cond_bb; + profile_probability prob = profile_probability::always (); + for (;;) + { + basic_block parent = single_pred (outer_to_inner_bb); + prob *= find_edge (parent, outer_to_inner_bb)->probability; + if (parent == outer_cond_bb) + break; + outer_to_inner_bb = parent; + } + + edge outer_to_inner = find_edge (outer_cond_bb, outer_to_inner_bb); edge outer2 = (EDGE_SUCC (outer_cond_bb, 0) == outer_to_inner ? EDGE_SUCC (outer_cond_bb, 1) : EDGE_SUCC (outer_cond_bb, 0)); @@ -374,40 +442,294 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, std::swap (inner_taken, inner_not_taken); gcc_assert (inner_taken->dest == outer2->dest); - /* In the following we assume that inner_cond_bb has single predecessor. */ - gcc_assert (single_pred_p (inner_cond_bb)); + if (outer_to_inner_bb == inner_cond_bb + && constant_condition_p (outer_cond_bb)) + { + /* Path outer_cond_bb->(outer2) needs to be merged into path + outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) + and probability of inner_not_taken updated. */ + + inner_cond_bb->count = outer_cond_bb->count; + + /* Handle special case where inner_taken probability is always. In this + case we know that the overall outcome will be always as well, but + combining probabilities will be conservative because it does not know + that outer2->probability is inverse of + outer_to_inner->probability. */ + if (inner_taken->probability == profile_probability::always ()) + ; + else + inner_taken->probability = outer2->probability + + outer_to_inner->probability * inner_taken->probability; + inner_not_taken->probability = profile_probability::always () + - inner_taken->probability; + + outer_to_inner->probability = profile_probability::always (); + outer2->probability = profile_probability::never (); + } + else if (constant_condition_p (inner_cond_bb)) + { + /* Path inner_cond_bb->(inner_taken) needs to be merged into path + outer_cond_bb->(outer2). We've accumulated the probabilities from + outer_cond_bb->(outer)->...->inner_cond_bb in prob, so we have to + adjust that by inner_taken, and make inner unconditional. */ + + prob *= inner_taken->probability; + outer2->probability += prob; + outer_to_inner->probability = profile_probability::always () + - outer2->probability; + + inner_taken->probability = profile_probability::never (); + inner_not_taken->probability = profile_probability::always (); + } + else + { + /* We've moved part of the inner cond to outer, but we don't know the + probabilities for each part, so estimate the effects by moving half of + the odds of inner_taken to outer. */ + + inner_taken->probability *= profile_probability::even (); + inner_not_taken->probability = profile_probability::always () + - inner_taken->probability; + + prob *= inner_taken->probability; + outer2->probability += prob; + outer_to_inner->probability = profile_probability::always () + - outer2->probability; + } +} + +/* Set NAME's bit in USED if OUTER dominates it. */ + +static void +ifcombine_mark_ssa_name (bitmap used, tree name, basic_block outer) +{ + if (SSA_NAME_IS_DEFAULT_DEF (name)) + return; + + gimple *def = SSA_NAME_DEF_STMT (name); + basic_block bb = gimple_bb (def); + if (!dominated_by_p (CDI_DOMINATORS, bb, outer)) + return; + + bitmap_set_bit (used, SSA_NAME_VERSION (name)); +} + +/* Data structure passed to ifcombine_mark_ssa_name. */ +struct ifcombine_mark_ssa_name_t +{ + /* SSA_NAMEs that have been referenced. */ + bitmap used; + /* Dominating block of DEFs that might need moving. */ + basic_block outer; +}; + +/* Mark in DATA->used any SSA_NAMEs used in *t. */ + +static tree +ifcombine_mark_ssa_name_walk (tree *t, int *, void *data_) +{ + ifcombine_mark_ssa_name_t *data = (ifcombine_mark_ssa_name_t *)data_; + + if (*t && TREE_CODE (*t) == SSA_NAME) + ifcombine_mark_ssa_name (data->used, *t, data->outer); + + return NULL; +} + +/* Replace the conditions in INNER_COND and OUTER_COND with COND and COND2. + COND and COND2 are computed for insertion at INNER_COND, with OUTER_COND + replaced with a constant, but if there are intervening blocks, it's best to + adjust COND for insertion at OUTER_COND, placing COND2 at INNER_COND. */ + +static tree +ifcombine_replace_cond (gcond *inner_cond, bool inner_inv, + gcond *outer_cond, bool outer_inv, + tree cond, bool must_canon, + tree cond2) +{ + tree ret = cond; + if (cond2) + ret = fold_build2 (TRUTH_AND_EXPR, TREE_TYPE (ret), ret, cond2); + + /* Split cond into cond2 if they're contiguous. ??? We might be able to + handle ORIF as well, inverting both conditions, but it's not clear that + this would be enough, and it never comes up. */ + if (!cond2 + && TREE_CODE (cond) == TRUTH_ANDIF_EXPR + && single_pred (gimple_bb (inner_cond)) == gimple_bb (outer_cond)) + { + cond2 = TREE_OPERAND (cond, 1); + cond = TREE_OPERAND (cond, 0); + } + + bool outer_p = cond2 || (single_pred (gimple_bb (inner_cond)) + != gimple_bb (outer_cond)); + bool result_inv = outer_p ? outer_inv : inner_inv; + + if (result_inv) + cond = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (cond), cond); + + if (tree tcanon = canonicalize_cond_expr_cond (cond)) + cond = tcanon; + else if (must_canon) + return NULL_TREE; + + if (outer_p) + { + { + auto_bitmap used; + basic_block outer_bb = gimple_bb (outer_cond); - /* Path outer_cond_bb->(outer2) needs to be merged into path - outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken) - and probability of inner_not_taken updated. */ + /* Mark SSA DEFs that are referenced by cond and may thus need to be + moved to outer. */ + { + ifcombine_mark_ssa_name_t data = { used, outer_bb }; + walk_tree (&cond, ifcombine_mark_ssa_name_walk, &data, NULL); + } + + if (!bitmap_empty_p (used)) + { + /* Iterate up from inner_cond, moving DEFs identified as used by + cond, and marking USEs in the DEFs for moving as well. */ + gimple_stmt_iterator gsins = gsi_for_stmt (outer_cond); + for (basic_block bb = gimple_bb (inner_cond); + bb != outer_bb; bb = single_pred (bb)) + { + for (gimple_stmt_iterator gsitr = gsi_last_bb (bb); + !gsi_end_p (gsitr); gsi_prev (&gsitr)) + { + gimple *stmt = gsi_stmt (gsitr); + bool move = false; + tree t; + ssa_op_iter it; + + FOR_EACH_SSA_TREE_OPERAND (t, stmt, it, SSA_OP_DEF) + if (bitmap_bit_p (used, SSA_NAME_VERSION (t))) + { + move = true; + break; + } + + if (!move) + continue; + + /* Mark uses in STMT before moving it. */ + FOR_EACH_SSA_TREE_OPERAND (t, stmt, it, SSA_OP_USE) + ifcombine_mark_ssa_name (used, t, outer_bb); + + gsi_move_before (&gsitr, &gsins, GSI_NEW_STMT); + } + + /* Surprisingly, there may be PHI nodes in single-predecessor + bocks, as in pr50682.C. Fortunately, since they can't + involve back edges, there won't be references to parallel + nodes that we'd have to pay special attention to to keep + them parallel. We can't move the PHI nodes, but we can turn + them into assignments. */ + for (gphi_iterator gsi = gsi_start_phis (bb); + !gsi_end_p (gsi);) + { + gphi *phi = gsi.phi (); + + gcc_assert (gimple_phi_num_args (phi) == 1); + tree def = gimple_phi_result (phi); + + if (!bitmap_bit_p (used, SSA_NAME_VERSION (def))) + { + gsi_next (&gsi); + continue; + } + + /* Mark uses in STMT before moving it. */ + use_operand_p use_p; + ssa_op_iter it; + FOR_EACH_PHI_ARG (use_p, phi, it, SSA_OP_USE) + ifcombine_mark_ssa_name (used, USE_FROM_PTR (use_p), + outer_bb); + + tree use = gimple_phi_arg_def (phi, 0); + location_t loc = gimple_phi_arg_location (phi, 0); + + remove_phi_node (&gsi, false); + + gassign *a = gimple_build_assign (def, use); + gimple_set_location (a, loc); + gsi_insert_before (&gsins, a, GSI_NEW_STMT); + } + } + } + } - inner_cond_bb->count = outer_cond_bb->count; + if (!is_gimple_condexpr_for_cond (cond)) + { + gimple_stmt_iterator gsi = gsi_for_stmt (outer_cond); + cond = force_gimple_operand_gsi_1 (&gsi, cond, + is_gimple_condexpr_for_cond, + NULL, true, GSI_SAME_STMT); + } - /* Handle special case where inner_taken probability is always. In this case - we know that the overall outcome will be always as well, but combining - probabilities will be conservative because it does not know that - outer2->probability is inverse of outer_to_inner->probability. */ - if (inner_taken->probability == profile_probability::always ()) - ; + /* Leave CFG optimization to cfg_cleanup. */ + gimple_cond_set_condition_from_tree (outer_cond, cond); + update_stmt (outer_cond); + + if (cond2) + { + if (inner_inv) + cond2 = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (cond2), cond2); + + if (tree tcanon = canonicalize_cond_expr_cond (cond2)) + cond2 = tcanon; + if (!is_gimple_condexpr_for_cond (cond2)) + { + gimple_stmt_iterator gsi = gsi_for_stmt (inner_cond); + cond2 = force_gimple_operand_gsi_1 (&gsi, cond2, + is_gimple_condexpr_for_cond, + NULL, true, GSI_SAME_STMT); + } + gimple_cond_set_condition_from_tree (inner_cond, cond2); + } + else + gimple_cond_set_condition_from_tree (inner_cond, + inner_inv + ? boolean_false_node + : boolean_true_node); + update_stmt (inner_cond); + } else - inner_taken->probability = outer2->probability + outer_to_inner->probability - * inner_taken->probability; - inner_not_taken->probability = profile_probability::always () - - inner_taken->probability; + { + if (!is_gimple_condexpr_for_cond (cond)) + { + gimple_stmt_iterator gsi = gsi_for_stmt (inner_cond); + cond = force_gimple_operand_gsi_1 (&gsi, cond, + is_gimple_condexpr_for_cond, + NULL, true, GSI_SAME_STMT); + } + gimple_cond_set_condition_from_tree (inner_cond, cond); + update_stmt (inner_cond); + + /* Leave CFG optimization to cfg_cleanup. */ + gimple_cond_set_condition_from_tree (outer_cond, + outer_inv + ? boolean_false_node + : boolean_true_node); + update_stmt (outer_cond); + } + + update_profile_after_ifcombine (gimple_bb (inner_cond), + gimple_bb (outer_cond)); - outer_to_inner->probability = profile_probability::always (); - outer2->probability = profile_probability::never (); + return ret; } /* If-convert on a and pattern with a common else block. The inner if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB. - inner_inv, outer_inv and result_inv indicate whether the conditions - are inverted. + inner_inv, outer_inv indicate whether the conditions are inverted. Returns true if the edges to the common else basic-block were merged. */ static bool ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, - basic_block outer_cond_bb, bool outer_inv, bool result_inv) + basic_block outer_cond_bb, bool outer_inv) { gimple_stmt_iterator gsi; tree name1, name2, bit1, bit2, bits1, bits2; @@ -446,26 +768,13 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, t2 = fold_build2 (BIT_AND_EXPR, TREE_TYPE (name1), name1, t); t2 = force_gimple_operand_gsi (&gsi, t2, true, NULL_TREE, true, GSI_SAME_STMT); - t = fold_build2 (result_inv ? NE_EXPR : EQ_EXPR, - boolean_type_node, t2, t); - t = canonicalize_cond_expr_cond (t); - if (!t) - return false; - if (!is_gimple_condexpr_for_cond (t)) - { - gsi = gsi_for_stmt (inner_cond); - t = force_gimple_operand_gsi_1 (&gsi, t, is_gimple_condexpr_for_cond, - NULL, true, GSI_SAME_STMT); - } - gimple_cond_set_condition_from_tree (inner_cond, t); - update_stmt (inner_cond); - /* Leave CFG optimization to cfg_cleanup. */ - gimple_cond_set_condition_from_tree (outer_cond, - outer_inv ? boolean_false_node : boolean_true_node); - update_stmt (outer_cond); + t = fold_build2 (EQ_EXPR, boolean_type_node, t2, t); - update_profile_after_ifcombine (inner_cond_bb, outer_cond_bb); + if (!ifcombine_replace_cond (inner_cond, inner_inv, + outer_cond, outer_inv, + t, true, NULL_TREE)) + return false; if (dump_file) { @@ -485,9 +794,8 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, In that case remove the outer test and change the inner one to test for name & (bits1 | bits2) != 0. */ else if (recognize_bits_test (inner_cond, &name1, &bits1, !inner_inv) - && recognize_bits_test (outer_cond, &name2, &bits2, !outer_inv)) + && recognize_bits_test (outer_cond, &name2, &bits2, !outer_inv)) { - gimple_stmt_iterator gsi; tree t; if ((TREE_CODE (name1) == SSA_NAME @@ -530,33 +838,14 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, bits1 = fold_convert (TREE_TYPE (bits2), bits1); } - /* Do it. */ - gsi = gsi_for_stmt (inner_cond); t = fold_build2 (BIT_IOR_EXPR, TREE_TYPE (name1), bits1, bits2); - t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, - true, GSI_SAME_STMT); t = fold_build2 (BIT_AND_EXPR, TREE_TYPE (name1), name1, t); - t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, - true, GSI_SAME_STMT); - t = fold_build2 (result_inv ? NE_EXPR : EQ_EXPR, boolean_type_node, t, + t = fold_build2 (EQ_EXPR, boolean_type_node, t, build_int_cst (TREE_TYPE (t), 0)); - t = canonicalize_cond_expr_cond (t); - if (!t) + if (!ifcombine_replace_cond (inner_cond, inner_inv, + outer_cond, outer_inv, + t, false, NULL_TREE)) return false; - if (!is_gimple_condexpr_for_cond (t)) - { - gsi = gsi_for_stmt (inner_cond); - t = force_gimple_operand_gsi_1 (&gsi, t, is_gimple_condexpr_for_cond, - NULL, true, GSI_SAME_STMT); - } - gimple_cond_set_condition_from_tree (inner_cond, t); - update_stmt (inner_cond); - - /* Leave CFG optimization to cfg_cleanup. */ - gimple_cond_set_condition_from_tree (outer_cond, - outer_inv ? boolean_false_node : boolean_true_node); - update_stmt (outer_cond); - update_profile_after_ifcombine (inner_cond_bb, outer_cond_bb); if (dump_file) { @@ -576,7 +865,7 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, else if (TREE_CODE_CLASS (gimple_cond_code (inner_cond)) == tcc_comparison && TREE_CODE_CLASS (gimple_cond_code (outer_cond)) == tcc_comparison) { - tree t; + tree t, ts = NULL_TREE; enum tree_code inner_cond_code = gimple_cond_code (inner_cond); enum tree_code outer_cond_code = gimple_cond_code (outer_cond); @@ -599,10 +888,20 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, outer_cond_code, gimple_cond_lhs (outer_cond), gimple_cond_rhs (outer_cond), - gimple_bb (outer_cond)))) + gimple_bb (outer_cond))) + && !(t = (fold_truth_andor_maybe_separate + (UNKNOWN_LOCATION, TRUTH_ANDIF_EXPR, + boolean_type_node, + outer_cond_code, + gimple_cond_lhs (outer_cond), + gimple_cond_rhs (outer_cond), + inner_cond_code, + gimple_cond_lhs (inner_cond), + gimple_cond_rhs (inner_cond), + &ts)))) { + { tree t1, t2; - gimple_stmt_iterator gsi; bool logical_op_non_short_circuit = LOGICAL_OP_NON_SHORT_CIRCUIT; if (param_logical_op_non_short_circuit != -1) logical_op_non_short_circuit @@ -624,34 +923,13 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, gimple_cond_rhs (outer_cond)); t = fold_build2_loc (gimple_location (inner_cond), TRUTH_AND_EXPR, boolean_type_node, t1, t2); - if (result_inv) - { - t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t); - result_inv = false; - } - gsi = gsi_for_stmt (inner_cond); - t = force_gimple_operand_gsi_1 (&gsi, t, is_gimple_condexpr_for_cond, - NULL, true, GSI_SAME_STMT); + } } - if (result_inv) - t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t); - t = canonicalize_cond_expr_cond (t); - if (!t) - return false; - if (!is_gimple_condexpr_for_cond (t)) - { - gsi = gsi_for_stmt (inner_cond); - t = force_gimple_operand_gsi_1 (&gsi, t, is_gimple_condexpr_for_cond, - NULL, true, GSI_SAME_STMT); - } - gimple_cond_set_condition_from_tree (inner_cond, t); - update_stmt (inner_cond); - /* Leave CFG optimization to cfg_cleanup. */ - gimple_cond_set_condition_from_tree (outer_cond, - outer_inv ? boolean_false_node : boolean_true_node); - update_stmt (outer_cond); - update_profile_after_ifcombine (inner_cond_bb, outer_cond_bb); + if (!(t = ifcombine_replace_cond (inner_cond, inner_inv, + outer_cond, outer_inv, + t, false, ts))) + return false; if (dump_file) { @@ -669,19 +947,21 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, /* Helper function for tree_ssa_ifcombine_bb. Recognize a CFG pattern and dispatch to the appropriate if-conversion helper for a particular set of INNER_COND_BB, OUTER_COND_BB, THEN_BB and ELSE_BB. - PHI_PRED_BB should be one of INNER_COND_BB, THEN_BB or ELSE_BB. */ + PHI_PRED_BB should be one of INNER_COND_BB, THEN_BB or ELSE_BB. + OUTER_SUCC_BB is the successor of OUTER_COND_BB on the path towards + INNER_COND_BB. */ static bool tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, basic_block then_bb, basic_block else_bb, - basic_block phi_pred_bb) + basic_block phi_pred_bb, basic_block outer_succ_bb) { /* The && form is characterized by a common else_bb with the two edges leading to it mergable. The latter is guaranteed by matching PHI arguments in the else_bb and the inner cond_bb having no side-effects. */ if (phi_pred_bb != else_bb - && recognize_if_then_else (outer_cond_bb, &inner_cond_bb, &else_bb) + && recognize_if_then_else_nc (outer_cond_bb, &outer_succ_bb, &else_bb) && same_phi_args_p (outer_cond_bb, phi_pred_bb, else_bb)) { /* We have @@ -693,13 +973,12 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, false, - false); + return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, false); } /* And a version where the outer condition is negated. */ if (phi_pred_bb != else_bb - && recognize_if_then_else (outer_cond_bb, &else_bb, &inner_cond_bb) + && recognize_if_then_else_nc (outer_cond_bb, &else_bb, &outer_succ_bb) && same_phi_args_p (outer_cond_bb, phi_pred_bb, else_bb)) { /* We have @@ -711,8 +990,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, true, - false); + return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, true); } /* The || form is characterized by a common then_bb with the @@ -720,7 +998,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, by matching PHI arguments in the then_bb and the inner cond_bb having no side-effects. */ if (phi_pred_bb != then_bb - && recognize_if_then_else (outer_cond_bb, &then_bb, &inner_cond_bb) + && recognize_if_then_else_nc (outer_cond_bb, &then_bb, &outer_succ_bb) && same_phi_args_p (outer_cond_bb, phi_pred_bb, then_bb)) { /* We have @@ -731,13 +1009,12 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, true, - true); + return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, true); } /* And a version where the outer condition is negated. */ if (phi_pred_bb != then_bb - && recognize_if_then_else (outer_cond_bb, &inner_cond_bb, &then_bb) + && recognize_if_then_else_nc (outer_cond_bb, &outer_succ_bb, &then_bb) && same_phi_args_p (outer_cond_bb, phi_pred_bb, then_bb)) { /* We have @@ -748,8 +1025,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, false, - true); + return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, false); } return false; @@ -759,13 +1035,13 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, if-conversion helper. We start with BB as the innermost worker basic-block. Returns true if a transformation was done. */ -static bool +static basic_block tree_ssa_ifcombine_bb (basic_block inner_cond_bb) { basic_block then_bb = NULL, else_bb = NULL; - if (!recognize_if_then_else (inner_cond_bb, &then_bb, &else_bb)) - return false; + if (!recognize_if_then_else_nc (inner_cond_bb, &then_bb, &else_bb)) + return NULL; /* Recognize && and || of two conditions with a common then/else block which entry edges we can merge. That is: @@ -775,14 +1051,41 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb) if (a && b) ; This requires a single predecessor of the inner cond_bb. */ - if (single_pred_p (inner_cond_bb) - && bb_no_side_effects_p (inner_cond_bb)) + for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL; + single_pred_p (bb) && bb_no_side_effects_p (bb) + && (!exit_bb || same_phi_args_p (bb, inner_cond_bb, exit_bb)); + bb = outer_cond_bb) { - basic_block outer_cond_bb = single_pred (inner_cond_bb); + outer_cond_bb = single_pred (bb); + + /* Skip blocks without conditions. */ + if (single_succ_p (outer_cond_bb)) + continue; + + /* When considering noncontiguous conditions, make sure that all + non-final conditions lead to the same successor of the final + condition, when not taking the path to inner_bb, so that we can + combine C into A, both in A && (B && C), and in A || (B || C), but + neither in A && (B || C), nor A || (B && C). Say, if C goes to + THEN_BB or ELSE_BB, then B must go to either of these, say X, besides + C (whether C is then or else), and A must go to X and B (whether then + or else). + + We test for this, while allowing intervening nonconditional blocks, by + first taking note of which of the successors of the inner conditional + block is the exit path taken by the first considered outer conditional + block. + + Having ve identified and saved the exit block in exit_bb at the end of + the loop, here we test that subsequent conditional blocks under + consideration also use the exit block as a successor, besides the + block that leads to inner_cond_bb. */ + if (exit_bb && !recognize_if_succs (outer_cond_bb, bb, exit_bb)) + break; if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, - then_bb, else_bb, inner_cond_bb)) - return true; + then_bb, else_bb, inner_cond_bb, bb)) + return outer_cond_bb; if (forwarder_block_to (else_bb, then_bb)) { @@ -793,8 +1096,8 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb) For same_phi_args_p we look at equality of arguments between edge from outer_cond_bb and the forwarder block. */ if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, else_bb, - then_bb, else_bb)) - return true; + then_bb, else_bb, bb)) + return outer_cond_bb; } else if (forwarder_block_to (then_bb, else_bb)) { @@ -805,12 +1108,25 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb) For same_phi_args_p we look at equality of arguments between edge from outer_cond_bb and the forwarder block. */ if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, else_bb, - then_bb, then_bb)) - return true; + then_bb, then_bb, bb)) + return outer_cond_bb; + } + + /* Record the exit path taken by the outer condition. In the loop + condition test, we'll check whether INNER_COND_BB and the current + OUTER_COND_BB (then BB) share the same PHI args at EXIT_BB. */ + if (!exit_bb) + { + if (recognize_if_succs (outer_cond_bb, then_bb, bb)) + exit_bb = then_bb; + else if (recognize_if_succs (outer_cond_bb, bb, else_bb)) + exit_bb = else_bb; + else + break; } } - return false; + return NULL; } /* Main entry for the tree if-conversion pass. */ @@ -866,7 +1182,7 @@ pass_tree_ifcombine::execute (function *fun) basic_block bb = bbs[i]; if (safe_is_a (*gsi_last_bb (bb))) - if (tree_ssa_ifcombine_bb (bb)) + if (basic_block outer_bb = tree_ssa_ifcombine_bb (bb)) { /* Clear range info from all stmts in BB which is now executed conditional on a always true/false condition. */ @@ -885,6 +1201,12 @@ pass_tree_ifcombine::execute (function *fun) rewrite_to_defined_overflow (&gsi); } cfg_changed |= true; + /* Go back to outer_bb, in case it could be further optimized, but + only at -O2+, since it could get quadratic. */ + do + i++; + while (optimize >= 2 && bbs[i] != outer_bb); + continue; } } From patchwork Fri Oct 25 11:51:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Oliva X-Patchwork-Id: 2002320 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=adacore.com header.i=@adacore.com header.a=rsa-sha256 header.s=google header.b=emoc2jmE; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XZlml6ldbz1xwF for ; Sat, 26 Oct 2024 01:39:51 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EBF593858405 for ; Fri, 25 Oct 2024 14:39:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-pf1-x42d.google.com (mail-pf1-x42d.google.com [IPv6:2607:f8b0:4864:20::42d]) by sourceware.org (Postfix) with ESMTPS id BE3523858D33 for ; Fri, 25 Oct 2024 14:39:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BE3523858D33 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=adacore.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=adacore.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BE3523858D33 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::42d ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729867152; cv=none; b=Ij42ksZwfcPaY/W+Knpu71eAD3ghmRP6mFTqa0s4N7LNmKprylvQfJYIiyfpQQDgZkTjSiM8oV0y2vXAO107TVDga8DCmSivG+aO2uBSly1MSZhLvZpiq2dwjzAR7FjpbSluLRqGx0/Wt8jBFR+y1y47ip3XO+LoqfESv0j5uGE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729867152; c=relaxed/simple; bh=YFnEDrndROefegGxh8RtlMYNKBpqFM91jUOTM3mixtk=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=DzAuC5moQUZXEqoDvWMqmAYWsvm+TQLrk9MLlE+P2hqW2zBj+JZdRf7au88JBVinlA9e5eY70kCTRkilx7ijJJ4zMKR/NqQkwcdfa5rMg0ET5bNoBj3JkRSqywY452S2wNT7gc1cBDouo/c9CHrcifeWXAZ7HIB9fZZnljR3gmc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pf1-x42d.google.com with SMTP id d2e1a72fcca58-71e49ef3b2bso1511615b3a.2 for ; Fri, 25 Oct 2024 07:39:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=adacore.com; s=google; t=1729867144; x=1730471944; darn=gcc.gnu.org; h=mime-version:user-agent:message-id:in-reply-to:date:references :organization:subject:cc:to:from:from:to:cc:subject:date:message-id :reply-to; bh=3IyZt5mJMjdiEnuh4EeKQiE7Yrk8biPBjxrmTO15hcc=; b=emoc2jmEyoZJK9JSlmuiCcVOZpZKFrFkJgZufVgGzpQvtxNwlk4e1wIibk4AlSCiXt QceStmrgR92+3KA4WpYfibE6tkWRuGm9EFn1KFEyhYD9Oej9jMUcJ+0beO9kqq8oB0yr Wd+HbMwdTF4ATamCW3j5kkSYyoVzLlGgeHeM67+xmqhia6fvE9wni69rhFkNTmR+cGzf fbJLkuoBp2xIQm/g7rtIhMFsSinGxhRApvAc1ypO8OKabzck4oBBFpfotNO60VH2Gc74 pmYg59DtrCKnMlqRJqpD5GozfxY/1XKuw89eXPUrCooSNwmc1nPuDsBQhbJBKCeRVmTd K/9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729867144; x=1730471944; h=mime-version:user-agent:message-id:in-reply-to:date:references :organization:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3IyZt5mJMjdiEnuh4EeKQiE7Yrk8biPBjxrmTO15hcc=; b=sdPIyLuoqbeSa2khofWAqiaCiMw3QOO/UlXH/TBreIa0dgTFeu+fHzJBNesH6RVLHb 9gKsTeps3IKJ0fSPnGBc3GMIxeuj+8BNNlpYZB91Xj/x78pIOXRZG3+gy1QDvLVKGsDO bMeId3v0FxOJIYmVnx0NCssVN2aptLM+gQZuE8ic68ir6BEuzdy0VZPCSyNX1EQx5vl+ SDhVsHWBu2YiX6rfwBSfzZUA9qW6tdGTkG2so11sW8KDeyeGjpfsq4awVZBYIXQua3Js AgABQ+ym8cR5SlJFaLlutluuqpzevDly+QEYAjdMODeKCnT1Zg2kD/hWQanSy/KOW7dD wl/Q== X-Gm-Message-State: AOJu0Yzd4FtWZWfxxBR0GeHH8972NMDGLx7UJdYxW4ULK5AdKBMyIVFL T+ODcnI/pqxRciBwuyMpGYgtg4mOWFPbU39kSJBs1tXjIvSgvCTC2lKriLRvTzuoP8GRDlWJj5+ jAw== X-Google-Smtp-Source: AGHT+IEG/Gswr/tcWDli3UNDkp0Rg71AdGqCrXZDC8xVR4156Y8ubgbSQt+muNuDfMdAgNgTVjq6fA== X-Received: by 2002:a05:6a20:e617:b0:1d9:2be8:6e0a with SMTP id adf61e73a8af0-1d978bd5cc4mr12617775637.46.1729867144400; Fri, 25 Oct 2024 07:39:04 -0700 (PDT) Received: from free.home ([2804:14c:4d1:44a5::1002]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-7205791d9basm1123267b3a.43.2024.10.25.07.39.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 25 Oct 2024 07:39:04 -0700 (PDT) Received: from livre (livre.home [172.31.160.2]) by free.home (8.15.2/8.15.2) with ESMTPS id 49PBpE4Y1633791 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Fri, 25 Oct 2024 08:51:14 -0300 From: Alexandre Oliva To: Richard Biener Cc: GCC Patches Subject: [PATCH #2/7] drop redundant ifcombine_ifandif parm (was: Re: [PATCH] fold fold_truth_andor field merging into ifcombine) Organization: Free thinker, does not speak for AdaCore References: Date: Fri, 25 Oct 2024 08:51:14 -0300 In-Reply-To: (Alexandre Oliva's message of "Fri, 25 Oct 2024 08:07:13 -0300") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, WEIRD_QUOTING autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org In preparation to changes that may modify both inner and outer conditions in ifcombine, drop the redundant parameter result_inv, that is always identical to inner_inv. for gcc/ChangeLog * tree-ssa-ifcombine.cc (ifcombine_ifandif): Drop redundant result_inv parm. Adjust all callers. --- gcc/tree-ssa-ifcombine.cc | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc index ed20a231951a3..0a2ba970548c8 100644 --- a/gcc/tree-ssa-ifcombine.cc +++ b/gcc/tree-ssa-ifcombine.cc @@ -401,14 +401,14 @@ update_profile_after_ifcombine (basic_block inner_cond_bb, /* If-convert on a and pattern with a common else block. The inner if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB. - inner_inv, outer_inv and result_inv indicate whether the conditions - are inverted. + inner_inv, outer_inv indicate whether the conditions are inverted. Returns true if the edges to the common else basic-block were merged. */ static bool ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv, - basic_block outer_cond_bb, bool outer_inv, bool result_inv) + basic_block outer_cond_bb, bool outer_inv) { + bool result_inv = inner_inv; gimple_stmt_iterator gsi; tree name1, name2, bit1, bit2, bits1, bits2; @@ -693,8 +693,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, false, - false); + return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, false); } /* And a version where the outer condition is negated. */ @@ -711,8 +710,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, true, - false); + return ifcombine_ifandif (inner_cond_bb, false, outer_cond_bb, true); } /* The || form is characterized by a common then_bb with the @@ -731,8 +729,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, true, - true); + return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, true); } /* And a version where the outer condition is negated. */ @@ -748,8 +745,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb, ... */ - return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, false, - true); + return ifcombine_ifandif (inner_cond_bb, true, outer_cond_bb, false); } return false;