From patchwork Wed Jul 17 11:23:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1961592
Date: Wed, 17 Jul 2024 13:23:24 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/104515 - store motion and clobbers
List-Id: Gcc-patches mailing list
Message-Id: <20240717112356.13611385C6E1@sourceware.org>

The following addresses an old regression dating back to when
end-of-object/storage clobbers were introduced.  In particular, when
there's an end-of-object clobber in a loop but no corresponding
begin-of-object clobber, we can still perform store motion of
may-aliased refs if we re-issue the end-of-object/storage clobber on
the exits and elide it from the loop.  This should be the safest way
to deal with this considering stack-slot sharing, and it should not
cause missed dead store eliminations given DSE can now follow multiple
paths in case there are multiple exits.

Note that when the clobber is re-materialized on one exit but not on
another, we err on the side of removing the clobber on such a path.
This should be OK (removing clobbers is always OK).

Note there's no corresponding code to handle begin-of-object/storage
during the hoisting part of loads that are part of a store motion
optimization, so this only enables stored-only store motion or cases
without such a clobber inside the loop.

Bootstrapped and tested on x86_64-unknown-linux-gnu, I'll push when
the pre-commit CI is happy.

Richard.

	PR tree-optimization/104515
	* tree-ssa-loop-im.cc (execute_sm_exit): Add clobbers_to_prune
	parameter and handle re-materializing of clobbers.
	(sm_seq_valid_bb): end-of-storage/object clobbers are OK inside
	an ordered sequence of stores.
	(sm_seq_push_down): Refuse to push down clobbers.
	(hoist_memory_references): Prune clobbers from the loop body
	we re-materialized on an exit.

	* g++.dg/opt/pr104515.C: New testcase.
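For illustration only, the effect described above can be approximated
at the source level.  The sketch below is not what the pass emits: LIM
works on GIMPLE, and the re-issued end-of-object clobber has no
source-level spelling.  pop_back_many mirrors the loop shape of the new
testcase; pop_back_many_sm is a hypothetical hand-written equivalent,
assuming the store to v.end is sunk to the single loop exit and the
per-iteration clobber is elided from the loop body.

```cpp
#include <cassert>

using T = int;
struct Vec {
  T* end;
};

// Loop shape from the testcase: v.end is loaded and stored on every
// iteration, and the pseudo-destructor call is what produces the
// end-of-object clobber inside the loop.
void pop_back_many(Vec& v, unsigned n)
{
  for (unsigned i = 0; i < n; ++i) {
    --v.end;
    v.end->~T();
  }
}

// Hypothetical name, illustration only: a hand-written approximation of
// the loop after store motion.  The value of v.end lives in a local
// during the loop and is stored back once on the exit.
void pop_back_many_sm(Vec& v, unsigned n)
{
  T* end = v.end;          // load hoisted in front of the loop
  for (unsigned i = 0; i < n; ++i) {
    --end;                 // clobber elided from the loop body
  }
  v.end = end;             // store sunk to the loop exit
  // The compiler would re-issue the clobber here, on the exit; that
  // step cannot be expressed in source and is left out of this sketch.
}
```

Both functions leave v.end at the same address for the same inputs; the
second form only shows why the clobber inside the loop no longer has to
block moving the store.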
---
 gcc/testsuite/g++.dg/opt/pr104515.C | 18 ++++++
 gcc/tree-ssa-loop-im.cc             | 86 ++++++++++++++++++++++++-----
 2 files changed, 89 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr104515.C

diff --git a/gcc/testsuite/g++.dg/opt/pr104515.C b/gcc/testsuite/g++.dg/opt/pr104515.C
new file mode 100644
index 00000000000..f5455a45aa6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr104515.C
@@ -0,0 +1,18 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2 -fdump-tree-lim2-details" }
+
+using T = int;
+struct Vec {
+  T* end;
+};
+void pop_back_many(Vec& v, unsigned n)
+{
+  for (unsigned i = 0; i < n; ++i) {
+    --v.end;
+    //  The end-of-object clobber prevented store motion of v
+    v.end->~T();
+  }
+}
+
+// { dg-final { scan-tree-dump "Executing store motion of v" "lim2" } }
+// { dg-final { scan-tree-dump "Re-issueing dependent" "lim2" } }
diff --git a/gcc/tree-ssa-loop-im.cc b/gcc/tree-ssa-loop-im.cc
index 61c6339bc35..c53efbb8d59 100644
--- a/gcc/tree-ssa-loop-im.cc
+++ b/gcc/tree-ssa-loop-im.cc
@@ -2368,7 +2368,8 @@ struct seq_entry
 static void
 execute_sm_exit (class loop *loop, edge ex, vec<seq_entry> &seq,
 		 hash_map<im_mem_ref *, sm_aux *> &aux_map, sm_kind kind,
-		 edge &append_cond_position, edge &last_cond_fallthru)
+		 edge &append_cond_position, edge &last_cond_fallthru,
+		 bitmap clobbers_to_prune)
 {
   /* Sink the stores to exit from the loop.  */
   for (unsigned i = seq.length (); i > 0; --i)
@@ -2377,15 +2378,35 @@ execute_sm_exit (class loop *loop, edge ex, vec<seq_entry> &seq,
       if (seq[i-1].second == sm_other)
 	{
 	  gcc_assert (kind == sm_ord && seq[i-1].from != NULL_TREE);
-	  if (dump_file && (dump_flags & TDF_DETAILS))
+	  gassign *store;
+	  if (ref->mem.ref == error_mark_node)
 	    {
-	      fprintf (dump_file, "Re-issueing dependent store of ");
-	      print_generic_expr (dump_file, ref->mem.ref);
-	      fprintf (dump_file, " from loop %d on exit %d -> %d\n",
-		       loop->num, ex->src->index, ex->dest->index);
+	      tree lhs = gimple_assign_lhs (ref->accesses_in_loop[0].stmt);
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file, "Re-issueing dependent ");
+		  print_generic_expr (dump_file, unshare_expr (seq[i-1].from));
+		  fprintf (dump_file, " of ");
+		  print_generic_expr (dump_file, lhs);
+		  fprintf (dump_file, " from loop %d on exit %d -> %d\n",
+			   loop->num, ex->src->index, ex->dest->index);
+		}
+	      store = gimple_build_assign (unshare_expr (lhs),
+					   unshare_expr (seq[i-1].from));
+	      bitmap_set_bit (clobbers_to_prune, seq[i-1].first);
+	    }
+	  else
+	    {
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file, "Re-issueing dependent store of ");
+		  print_generic_expr (dump_file, ref->mem.ref);
+		  fprintf (dump_file, " from loop %d on exit %d -> %d\n",
+			   loop->num, ex->src->index, ex->dest->index);
+		}
+	      store = gimple_build_assign (unshare_expr (ref->mem.ref),
+					   seq[i-1].from);
 	    }
-	  gassign *store = gimple_build_assign (unshare_expr (ref->mem.ref),
-						seq[i-1].from);
 	  gsi_insert_on_edge (ex, store);
 	}
       else
@@ -2426,6 +2447,12 @@ sm_seq_push_down (vec<seq_entry> &seq, unsigned ptr, unsigned *at)
 	break;
       /* We may not ignore self-dependences here.  */
       if (new_cand.first == against.first
+	  /* ???  We could actually handle clobbers here, but not easily
+	     with LIMs dependence analysis.  */
+	  || (memory_accesses.refs_list[new_cand.first]->mem.ref
+	      == error_mark_node)
+	  || (memory_accesses.refs_list[against.first]->mem.ref
+	      == error_mark_node)
 	  || !refs_independent_p (memory_accesses.refs_list[new_cand.first],
 				  memory_accesses.refs_list[against.first],
 				  false))
@@ -2656,13 +2683,17 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree vdef,
 	  return 1;
 	}
       lim_aux_data *data = get_lim_data (def);
-      gcc_assert (data);
+      im_mem_ref *ref = memory_accesses.refs_list[data->ref];
       if (data->ref == UNANALYZABLE_MEM_ID)
 	return -1;
       /* Stop at memory references which we can't move.  */
-      else if (memory_accesses.refs_list[data->ref]->mem.ref == error_mark_node
-	       || TREE_THIS_VOLATILE
-		    (memory_accesses.refs_list[data->ref]->mem.ref))
+      else if ((ref->mem.ref == error_mark_node
+		/* We can move end-of-storage/object down.  */
+		&& !gimple_clobber_p (ref->accesses_in_loop[0].stmt,
+				      CLOBBER_STORAGE_END)
+		&& !gimple_clobber_p (ref->accesses_in_loop[0].stmt,
+				      CLOBBER_OBJECT_END))
+	       || TREE_THIS_VOLATILE (ref->mem.ref))
 	{
 	  /* Mark refs_not_in_seq as unsupported.  */
 	  bitmap_ior_into (refs_not_supported, refs_not_in_seq);
@@ -2818,6 +2849,7 @@ hoist_memory_references (class loop *loop, bitmap mem_refs,
 
       /* Materialize ordered store sequences on exits.  */
       edge e;
+      auto_bitmap clobbers_to_prune;
       FOR_EACH_VEC_ELT (exits, i, e)
 	{
 	  edge append_cond_position = NULL;
@@ -2836,10 +2868,21 @@ hoist_memory_references (class loop *loop, bitmap mem_refs,
 					 append_cond_position,
 					 last_cond_fallthru));
 	  execute_sm_exit (loop, insert_e, seq, aux_map, sm_ord,
-			   append_cond_position, last_cond_fallthru);
+			   append_cond_position, last_cond_fallthru,
+			   clobbers_to_prune);
 	  gsi_commit_one_edge_insert (insert_e, NULL);
 	}
 
+      /* Remove clobbers inside the loop we re-materialized on exits.  */
+      EXECUTE_IF_SET_IN_BITMAP (clobbers_to_prune, 0, i, bi)
+	{
+	  gimple *stmt = memory_accesses.refs_list[i]->accesses_in_loop[0].stmt;
+	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+	  unlink_stmt_vdef (stmt);
+	  release_defs (stmt);
+	  gsi_remove (&gsi, true);
+	}
+
       for (hash_map<im_mem_ref *, sm_aux *>::iterator iter = aux_map.begin ();
 	   iter != aux_map.end (); ++iter)
 	delete (*iter).second;
@@ -2990,6 +3033,7 @@ hoist_memory_references (class loop *loop, bitmap mem_refs,
     }
 
   /* Materialize ordered store sequences on exits.  */
+  auto_bitmap clobbers_to_prune;
   FOR_EACH_VEC_ELT (exits, i, e)
     {
       edge append_cond_position = NULL;
@@ -2998,17 +3042,29 @@ hoist_memory_references (class loop *loop, bitmap mem_refs,
 	{
 	  gcc_assert (sms[i].first == e);
 	  execute_sm_exit (loop, e, sms[i].second, aux_map, sm_ord,
-			   append_cond_position, last_cond_fallthru);
+			   append_cond_position, last_cond_fallthru,
+			   clobbers_to_prune);
 	  sms[i].second.release ();
 	}
       if (!unord_refs.is_empty ())
 	execute_sm_exit (loop, e, unord_refs, aux_map, sm_unord,
-			 append_cond_position, last_cond_fallthru);
+			 append_cond_position, last_cond_fallthru,
+			 clobbers_to_prune);
       /* Commit edge inserts here to preserve the order of stores
 	 when an exit exits multiple loops.  */
       gsi_commit_one_edge_insert (e, NULL);
     }
 
+  /* Remove clobbers inside the loop we re-materialized on exits.  */
+  EXECUTE_IF_SET_IN_BITMAP (clobbers_to_prune, 0, i, bi)
+    {
+      gimple *stmt = memory_accesses.refs_list[i]->accesses_in_loop[0].stmt;
+      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+      unlink_stmt_vdef (stmt);
+      release_defs (stmt);
+      gsi_remove (&gsi, true);
+    }
+
   for (hash_map<im_mem_ref *, sm_aux *>::iterator iter = aux_map.begin ();
        iter != aux_map.end (); ++iter)
     delete (*iter).second;