From patchwork Mon Nov 11 12:53:36 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 2009808
Date: Mon, 11 Nov 2024 13:53:36 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH][v2] tree-optimization/117484 - issue with SLP discovery of permuted .MASK_LOAD
List-Id: Gcc-patches mailing list
Message-Id: <20241111125403.303903858C56@sourceware.org>

When we do SLP discovery of a .MASK_LOAD for a dataref group with gaps the discovery
for the mask will have gaps as well and this was unexpected in a few
places.  The following re-organizes things slightly to accommodate this.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

        PR tree-optimization/117484
        * tree-vect-slp.cc (vect_build_slp_tree_2): Handle gaps in
        mask discovery.  Fix condition to release the load permutation.
        (vect_lower_load_permutations): Assert we get no load
        permutation for the unpermuted node.
        * tree-vect-slp-patterns.cc (linear_loads_p): Properly identify
        loads (without permutation).
        (compatible_complex_nodes_p): Likewise.

        * gcc.dg/vect/pr117484-1.c: New testcase.
        * gcc.dg/vect/pr117484-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/pr117484-1.c | 13 +++++++++++++
 gcc/testsuite/gcc.dg/vect/pr117484-2.c | 16 ++++++++++++++++
 gcc/tree-vect-slp-patterns.cc          | 14 ++++++++++----
 gcc/tree-vect-slp.cc                   | 22 +++++++++++++---------
 4 files changed, 52 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr117484-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr117484-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr117484-1.c b/gcc/testsuite/gcc.dg/vect/pr117484-1.c
new file mode 100644
index 00000000000..453556c50f9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr117484-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+extern int a;
+extern short b[];
+extern signed char c[], d[];
+int main()
+{
+  for (long j = 3; j < 1024; j += 3)
+    if (c[j] ? b[j] : 0) {
+      b[j] = d[j - 2];
+      a = d[j];
+    }
+}
diff --git a/gcc/testsuite/gcc.dg/vect/pr117484-2.c b/gcc/testsuite/gcc.dg/vect/pr117484-2.c
new file mode 100644
index 00000000000..baffe7597ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr117484-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+int a;
+extern int d[];
+extern int b[];
+extern _Bool c[];
+extern char h[];
+int main()
+{
+  for (int i = 0; i < 1024; i += 4)
+    if (h[i] || c[i])
+      {
+        a = d[i];
+        b[i] = d[i - 3];
+      }
+}
diff --git a/gcc/tree-vect-slp-patterns.cc b/gcc/tree-vect-slp-patterns.cc
index 8adae8a6ec0..d62682be43c 100644
--- a/gcc/tree-vect-slp-patterns.cc
+++ b/gcc/tree-vect-slp-patterns.cc
@@ -221,9 +221,15 @@ linear_loads_p (slp_tree_to_load_perm_map_t *perm_cache, slp_tree root)
   perm_cache->put (root, retval);
 
   /* If it's a load node, then just read the load permute.  */
-  if (SLP_TREE_LOAD_PERMUTATION (root).exists ())
+  if (SLP_TREE_DEF_TYPE (root) == vect_internal_def
+      && SLP_TREE_CODE (root) != VEC_PERM_EXPR
+      && STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (root))
+      && DR_IS_READ (STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (root))))
     {
-      retval = is_linear_load_p (SLP_TREE_LOAD_PERMUTATION (root));
+      if (SLP_TREE_LOAD_PERMUTATION (root).exists ())
+        retval = is_linear_load_p (SLP_TREE_LOAD_PERMUTATION (root));
+      else
+        retval = PERM_EVENODD;
       perm_cache->put (root, retval);
       return retval;
     }
@@ -798,8 +804,8 @@ compatible_complex_nodes_p (slp_compat_nodes_map_t *compat_cache,
         return false;
     }
 
-  if (!SLP_TREE_LOAD_PERMUTATION (a).exists ()
-      || !SLP_TREE_LOAD_PERMUTATION (b).exists ())
+  if (!STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (a))
+      || !STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (b)))
     {
       for (unsigned i = 0; i < gimple_num_args (a_stmt); i++)
         {
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index ffe9e718575..8e4ad05e2a4 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2004,14 +2004,15 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
             = STMT_VINFO_GROUPED_ACCESS (stmt_info)
               ? DR_GROUP_FIRST_ELEMENT (stmt_info) : stmt_info;
           bool any_permute = false;
-          bool any_null = false;
           FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), j, load_info)
             {
               int load_place;
               if (! load_info)
                 {
-                  load_place = j;
-                  any_null = true;
+                  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+                    load_place = j;
+                  else
+                    load_place = 0;
                 }
               else if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
                 load_place = vect_get_place_in_interleaving_chain
@@ -2022,11 +2023,6 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
               any_permute |= load_place != j;
               load_permutation.quick_push (load_place);
             }
-          if (any_null)
-            {
-              gcc_assert (!any_permute);
-              load_permutation.release ();
-            }
 
           if (gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt))
             {
@@ -2081,6 +2077,8 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
                  followed by 'node' being the desired final permutation.  */
               if (unperm_load)
                 {
+                  gcc_assert
+                    (!SLP_TREE_LOAD_PERMUTATION (unperm_load).exists ());
                   lane_permutation_t lperm;
                   lperm.create (group_size);
                   for (unsigned j = 0; j < load_permutation.length (); ++j)
@@ -2101,6 +2099,10 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
                 }
               else
                 {
+                  if (!any_permute
+                      && STMT_VINFO_GROUPED_ACCESS (stmt_info)
+                      && group_size == DR_GROUP_SIZE (first_stmt_info))
+                    load_permutation.release ();
                   SLP_TREE_LOAD_PERMUTATION (node) = load_permutation;
                   return node;
                 }
@@ -2675,7 +2677,8 @@ out:
               tree op0;
               tree uniform_val = op0 = oprnd_info->ops[0];
               for (j = 1; j < oprnd_info->ops.length (); ++j)
-                if (!operand_equal_p (uniform_val, oprnd_info->ops[j]))
+                if (oprnd_info->ops[j]
+                    && !operand_equal_p (uniform_val, oprnd_info->ops[j]))
                   {
                     uniform_val = NULL_TREE;
                     break;
@@ -4510,6 +4513,7 @@ vect_lower_load_permutations (loop_vec_info loop_vinfo,
                                              group_lanes,
                                              &max_nunits, matches, &limit,
                                              &tree_size, bst_map);
+          gcc_assert (!SLP_TREE_LOAD_PERMUTATION (l0).exists ());
 
           if (ld_lanes_lanes != 0)
             {