From patchwork Tue May 21 12:44:34 2024
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1937399
Date: Tue, 21 May 2024 14:44:34 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 1/4] Avoid requiring VEC_PERM representatives
Message-Id: <20240521124438.CDADB13A21@imap1.dmz-prg2.suse.org>

The following plugs one hole where we require a VEC_PERM node representative unnecessarily.
This is for vect_check_store_rhs, which looks at the RHS and checks
whether a constant can be native encoded.  The fix is to additionally
guard that with vect_constant_def and to make vect_is_simple_use
forgiving of a missing SLP_TREE_REPRESENTATIVE when the child is a
VEC_PERM node, initializing the scalar def to error_mark_node.

	* tree-vect-stmts.cc (vect_check_store_rhs): Look at *rhs
	only when it's a vect_constant_def.
	(vect_is_simple_use): When we have no representative for an
	internal node, fill in *op with error_mark_node.
---
 gcc/tree-vect-stmts.cc | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 672959501bb..4219ad832db 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2553,7 +2553,8 @@ vect_check_store_rhs (vec_info *vinfo, stmt_vec_info stmt_info,
 
   /* In the case this is a store from a constant make sure
      native_encode_expr can handle it.  */
-  if (CONSTANT_CLASS_P (*rhs) && native_encode_expr (*rhs, NULL, 64) == 0)
+  if (rhs_dt == vect_constant_def
+      && CONSTANT_CLASS_P (*rhs) && native_encode_expr (*rhs, NULL, 64) == 0)
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -14002,8 +14003,26 @@ vect_is_simple_use (vec_info *vinfo, stmt_vec_info stmt, slp_tree slp_node,
       *vectype = SLP_TREE_VECTYPE (child);
       if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
         {
-          *op = gimple_get_lhs (SLP_TREE_REPRESENTATIVE (child)->stmt);
-          return vect_is_simple_use (*op, vinfo, dt, def_stmt_info_out);
+          /* ???  VEC_PERM nodes might be intermediate and their lane value
+             have no representative (nor do we build a VEC_PERM stmt for
+             the actual operation).  Note for two-operator nodes we set
+             a representative but leave scalar stmts empty as we'd only
+             have one for a subset of lanes.  Ideally no caller would
+             require *op for internal defs.  */
+          if (SLP_TREE_REPRESENTATIVE (child))
+            {
+              *op = gimple_get_lhs (SLP_TREE_REPRESENTATIVE (child)->stmt);
+              return vect_is_simple_use (*op, vinfo, dt, def_stmt_info_out);
+            }
+          else
+            {
+              gcc_assert (SLP_TREE_CODE (child) == VEC_PERM_EXPR);
+              *op = error_mark_node;
+              *dt = vect_internal_def;
+              if (def_stmt_info_out)
+                *def_stmt_info_out = NULL;
+              return true;
+            }
         }
       else
         {

From patchwork Tue May 21 12:44:44 2024
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1937400
Date: Tue, 21 May 2024 14:44:44 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 2/4] Avoid SLP_REPRESENTATIVE access for VEC_PERM in SLP scheduling
Message-Id: <20240521124449.4A5F513A21@imap1.dmz-prg2.suse.org>

SLP permute nodes can now end up without an SLP_TREE_REPRESENTATIVE;
the following avoids touching it in this case in vect_schedule_slp_node.

	* tree-vect-slp.cc (vect_schedule_slp_node): Avoid looking at
	SLP_TREE_REPRESENTATIVE for VEC_PERM nodes.
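
For illustration only, here is a minimal stand-alone sketch of why the
reordered conditions are enough to avoid touching the representative:
the node-code test comes first, so short-circuit evaluation never
dereferences a missing representative for a permute node.  The types
and the helper name (node_t, stmt_t, needs_dataref_placement) are
hypothetical stand-ins and not GCC's own.

```cpp
#include <cassert>

// Hypothetical stand-ins for slp_tree / stmt_vec_info; not the GCC types.
enum class node_code { vec_perm, other };

struct stmt_t
{
  bool has_data_ref;
};

struct node_t
{
  node_code code;
  const stmt_t *representative;  // may be null for permute nodes
};

// The node-code test is evaluated first, so the representative is only
// dereferenced for non-permute nodes (assumed to always have one here).
bool
needs_dataref_placement (const node_t &node)
{
  return node.code != node_code::vec_perm
         && node.representative->has_data_ref;
}
```

With the operands in the opposite order, the same call on a permute
node without a representative would dereference a null pointer.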
---
 gcc/tree-vect-slp.cc | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index f34ed54a70b..43f2c153bf0 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -9301,13 +9301,8 @@ vect_schedule_slp_node (vec_info *vinfo,
   gcc_assert (SLP_TREE_NUMBER_OF_VEC_STMTS (node) != 0);
   SLP_TREE_VEC_DEFS (node).create (SLP_TREE_NUMBER_OF_VEC_STMTS (node));
 
-  if (dump_enabled_p ())
-    dump_printf_loc (MSG_NOTE, vect_location,
-                     "------>vectorizing SLP node starting from: %G",
-                     stmt_info->stmt);
-
-  if (STMT_VINFO_DATA_REF (stmt_info)
-      && SLP_TREE_CODE (node) != VEC_PERM_EXPR)
+  if (SLP_TREE_CODE (node) != VEC_PERM_EXPR
+      && STMT_VINFO_DATA_REF (stmt_info))
     {
       /* Vectorized loads go before the first scalar load to make it
          ready early, vectorized stores go before the last scalar
@@ -9319,10 +9314,10 @@ vect_schedule_slp_node (vec_info *vinfo,
           last_stmt_info = vect_find_last_scalar_stmt_in_slp (node);
           si = gsi_for_stmt (last_stmt_info->stmt);
         }
-  else if ((STMT_VINFO_TYPE (stmt_info) == cycle_phi_info_type
-            || STMT_VINFO_TYPE (stmt_info) == induc_vec_info_type
-            || STMT_VINFO_TYPE (stmt_info) == phi_info_type)
-           && SLP_TREE_CODE (node) != VEC_PERM_EXPR)
+  else if (SLP_TREE_CODE (node) != VEC_PERM_EXPR
+           && (STMT_VINFO_TYPE (stmt_info) == cycle_phi_info_type
+               || STMT_VINFO_TYPE (stmt_info) == induc_vec_info_type
+               || STMT_VINFO_TYPE (stmt_info) == phi_info_type))
     {
       /* For PHI node vectorization we do not use the insertion iterator.  */
       si = gsi_none ();
@@ -9456,6 +9451,9 @@ vect_schedule_slp_node (vec_info *vinfo,
   /* Handle purely internal nodes.  */
   if (SLP_TREE_CODE (node) == VEC_PERM_EXPR)
     {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_NOTE, vect_location,
+                         "------>vectorizing SLP permutation node\n");
       /* ??? the transform kind is stored to STMT_VINFO_TYPE which might
          be shared with different SLP nodes (but usually it's the same
          operation apart from the case the stmt is only there for denoting
@@ -9474,7 +9472,13 @@ vect_schedule_slp_node (vec_info *vinfo,
         }
     }
   else
-    vect_transform_stmt (vinfo, stmt_info, &si, node, instance);
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_NOTE, vect_location,
+                         "------>vectorizing SLP node starting from: %G",
+                         stmt_info->stmt);
+      vect_transform_stmt (vinfo, stmt_info, &si, node, instance);
+    }
 }
 
 /* Replace scalar calls from SLP node NODE with setting of their lhs to zero.

From patchwork Tue May 21 12:45:17 2024
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1937401
Date: Tue, 21 May 2024 14:45:17 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: richard.sandiford@arm.com, tamar.christina@arm.com
Subject: [PATCH 3/4] Avoid splitting store dataref groups during SLP discovery
Message-Id: <20240521124517.777CB13A21@imap1.dmz-prg2.suse.org>

The following avoids splitting store dataref groups during SLP
discovery but instead forces (possibly single-lane) consecutive
lane SLP discovery for all lanes of the group, creating VEC_PERM
SLP nodes merging them so the store will always cover the whole group.

With this, for example,

int x[1024], y[1024], z[1024], w[1024];
void foo (void)
{
  for (int i = 0; i < 256; i++)
    {
      x[4*i+0] = y[2*i+0];
      x[4*i+1] = y[2*i+1];
      x[4*i+2] = z[i];
      x[4*i+3] = w[i];
    }
}

which previously used hybrid SLP, can now be fully SLPed, and the
generated SSE code looks better (but of course you never know, I
didn't actually benchmark).  We of course need a VF of four here.

.L2:
        movdqa  z(%rax), %xmm0
        movdqa  w(%rax), %xmm4
        movdqa  y(%rax,%rax), %xmm2
        movdqa  y+16(%rax,%rax), %xmm1
        movdqa  %xmm0, %xmm3
        punpckhdq       %xmm4, %xmm0
        punpckldq       %xmm4, %xmm3
        movdqa  %xmm2, %xmm4
        shufps  $238, %xmm3, %xmm2
        movaps  %xmm2, x+16(,%rax,4)
        movdqa  %xmm1, %xmm2
        shufps  $68, %xmm3, %xmm4
        shufps  $68, %xmm0, %xmm2
        movaps  %xmm4, x(,%rax,4)
        shufps  $238, %xmm0, %xmm1
        movaps  %xmm2, x+32(,%rax,4)
        movaps  %xmm1, x+48(,%rax,4)
        addq    $16, %rax
        cmpq    $1024, %rax
        jne     .L2

The extra permute nodes merging distinct branches of the SLP tree
might be unexpected for some code, esp. since SLP_TREE_REPRESENTATIVE
cannot be meaningfully set and we cannot populate SLP_TREE_SCALAR_STMTS
or SLP_TREE_SCALAR_OPS consistently as we can have a mix of both.

The patch forms the sub-trees from consecutive lanes, but that's in
principle not necessary; an even/odd split, for example, now results
in N single-lane sub-trees.  That's left for future improvements.

The interesting part is how VLA vector ISAs handle merging of two
vectors that's not a trivial even/odd merge.  The strategy for
building the permute tree might need adjustments for that (in the
end, splitting each branch to single lanes and then doing even/odd
merging would be the brute-force fallback).  Not sure how much we can
or should rely on the SLP optimize pass to handle this.

	* tree-vect-slp.cc (vect_build_slp_instance): Do not split store
	dataref groups on loop SLP discovery failure but create a single
	SLP instance for the stores, branching to SLP sub-trees that are
	merged with a series of VEC_PERM nodes.
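
To illustrate the "series of VEC_PERM nodes" from the ChangeLog entry
above, here is a toy model of reducing one many-input permute to
permutes with at most two inputs by repeatedly merging the two inputs
with the fewest lanes.  It is a sketch under simplifying assumptions,
not the code in the patch: toy_node, toy_perm, resolve and
reduce_to_two_inputs are hypothetical names, a merged input is modelled
as plain lane concatenation rather than as a new permute node, and the
candidate selection is a straightforward "two smallest, earliest first"
scan.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy model only: these types and functions are hypothetical and are not
// part of GCC.
using lane_ref = std::pair<unsigned, unsigned>;  // (child index, lane)

struct toy_node
{
  std::vector<int> lanes;  // ids of the scalar lanes the node provides
};

struct toy_perm
{
  std::vector<toy_node> children;
  std::vector<lane_ref> lane_perm;  // one entry per output lane
};

// Return the lane ids the permute produces, in output order.
std::vector<int>
resolve (const toy_perm &perm)
{
  std::vector<int> out;
  for (const lane_ref &p : perm.lane_perm)
    out.push_back (perm.children[p.first].lanes[p.second]);
  return out;
}

// Merge children pairwise until at most two inputs remain.
void
reduce_to_two_inputs (toy_perm &perm)
{
  auto lanes = [&] (unsigned i) { return perm.children[i].lanes.size (); };
  while (perm.children.size () > 2)
    {
      const unsigned n = static_cast<unsigned> (perm.children.size ());
      // Earliest child with the fewest lanes ...
      unsigned first = 0;
      for (unsigned i = 1; i < n; ++i)
        if (lanes (i) < lanes (first))
          first = i;
      // ... and the earliest such child among the rest.
      unsigned second = first == 0 ? 1 : 0;
      for (unsigned i = 0; i < n; ++i)
        if (i != first && lanes (i) < lanes (second))
          second = i;
      const unsigned ai = std::min (first, second);
      const unsigned bi = std::max (first, second);

      // Concatenate b's lanes after a's and keep the result in a's slot.
      const unsigned a_lanes = static_cast<unsigned> (lanes (ai));
      perm.children[ai].lanes.insert (perm.children[ai].lanes.end (),
                                      perm.children[bi].lanes.begin (),
                                      perm.children[bi].lanes.end ());

      // Redirect references to b and shift references to later children.
      for (lane_ref &p : perm.lane_perm)
        {
          if (p.first == bi)
            {
              p.first = ai;
              p.second += a_lanes;
            }
          else if (p.first > bi)
            --p.first;
        }
      perm.children.erase (perm.children.begin () + bi);
    }
}
```

For the example above, the inputs would be a two-lane node for the y
loads and single-lane nodes for z and w; the two single-lane inputs
are merged first, which leaves a permute with two two-lane inputs that
still produces the same lanes in the same order.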
---
 gcc/tree-vect-slp.cc | 240 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 214 insertions(+), 26 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 43f2c153bf0..873748b0a72 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -3468,12 +3468,7 @@ vect_build_slp_instance (vec_info *vinfo,
           return true;
         }
     }
-  else
-    {
-      /* Failed to SLP.  */
-      /* Free the allocated memory.  */
-      scalar_stmts.release ();
-    }
+  /* Failed to SLP.  */
 
   stmt_vec_info stmt_info = stmt_info_;
   /* Try to break the group up into pieces.  */
@@ -3491,6 +3486,9 @@ vect_build_slp_instance (vec_info *vinfo,
       if (is_a <bb_vec_info> (vinfo)
           && (i > 1 && i < group_size))
         {
+          /* Free the allocated memory.  */
+          scalar_stmts.release ();
+
           tree scalar_type
             = TREE_TYPE (DR_REF (STMT_VINFO_DATA_REF (stmt_info)));
           tree vectype = get_vectype_for_scalar_type (vinfo, scalar_type,
@@ -3535,38 +3533,228 @@ vect_build_slp_instance (vec_info *vinfo,
             }
         }
 
-      /* For loop vectorization split into arbitrary pieces of size > 1.  */
-      if (is_a <loop_vec_info> (vinfo)
-          && (i > 1 && i < group_size)
-          && !vect_slp_prefer_store_lanes_p (vinfo, stmt_info, group_size, i))
+      /* For loop vectorization split the RHS into arbitrary pieces of
+         size >= 1.  */
+      else if (is_a <loop_vec_info> (vinfo)
+               && (i > 0 && i < group_size)
+               && !vect_slp_prefer_store_lanes_p (vinfo,
+                                                  stmt_info, group_size, i))
         {
-          unsigned group1_size = i;
-
           if (dump_enabled_p ())
             dump_printf_loc (MSG_NOTE, vect_location,
                              "Splitting SLP group at stmt %u\n", i);
 
-          stmt_vec_info rest = vect_split_slp_store_group (stmt_info,
-                                                           group1_size);
-          /* Loop vectorization cannot handle gaps in stores, make sure
-             the split group appears as strided.  */
-          STMT_VINFO_STRIDED_P (rest) = 1;
-          DR_GROUP_GAP (rest) = 0;
-          STMT_VINFO_STRIDED_P (stmt_info) = 1;
-          DR_GROUP_GAP (stmt_info) = 0;
+          /* Analyze the stored values and pinch them together with
+             a permute node so we can preserve the whole store group.  */
+          auto_vec<slp_tree> rhs_nodes;
+
+          /* Calculate the unrolling factor based on the smallest type.  */
+          poly_uint64 unrolling_factor = 1;
+
+          unsigned int start = 0, end = i;
+          while (start < group_size)
+            {
+              gcc_assert (end - start >= 1);
+              vec<stmt_vec_info> substmts;
+              substmts.create (end - start);
+              for (unsigned j = start; j < end; ++j)
+                substmts.quick_push (scalar_stmts[j]);
+              max_nunits = 1;
+              node = vect_build_slp_tree (vinfo, substmts, end - start,
+                                          &max_nunits,
+                                          matches, limit, &tree_size, bst_map);
+              if (node)
+                {
+                  /* ???  Possibly not safe, but not sure how to check
+                     and fail SLP build?  */
+                  unrolling_factor
+                    = force_common_multiple (unrolling_factor,
+                                             calculate_unrolling_factor
+                                               (max_nunits, end - start));
+                  rhs_nodes.safe_push (node);
+                  start = end;
+                  end = group_size;
+                }
+              else
+                {
+                  substmts.release ();
+                  if (end - start == 1)
+                    {
+                      /* Single-lane discovery failed.  Free resources.  */
+                      for (auto node : rhs_nodes)
+                        vect_free_slp_tree (node);
+                      scalar_stmts.release ();
+                      if (dump_enabled_p ())
+                        dump_printf_loc (MSG_NOTE, vect_location,
+                                         "SLP discovery failed\n");
+                      return false;
+                    }
+
+                  /* ???  It really happens that we soft-fail SLP
+                     build at a mismatch but the matching part hard-fails
+                     later.  As we know we arrived here with a group
+                     larger than one try a group of size one!  */
+                  if (!matches[0])
+                    end = start + 1;
+                  else
+                    for (unsigned j = start; j < end; j++)
+                      if (!matches[j - start])
+                        {
+                          end = j;
+                          break;
+                        }
+                }
+            }
+
+          /* Now we assume we can build the root SLP node from all
+             stores.  */
+          node = vect_create_new_slp_node (scalar_stmts, 1);
+          SLP_TREE_VECTYPE (node) = SLP_TREE_VECTYPE (rhs_nodes[0]);
+          /* And a permute merging all RHS SLP trees.  */
+          slp_tree perm = vect_create_new_slp_node (rhs_nodes.length (),
+                                                    VEC_PERM_EXPR);
+          SLP_TREE_CHILDREN (node).quick_push (perm);
+          SLP_TREE_LANE_PERMUTATION (perm).create (group_size);
+          SLP_TREE_VECTYPE (perm) = SLP_TREE_VECTYPE (node);
+          SLP_TREE_LANES (perm) = group_size;
+          /* ???  We should set this NULL but that's not expected.  */
+          SLP_TREE_REPRESENTATIVE (perm)
+            = SLP_TREE_REPRESENTATIVE (SLP_TREE_CHILDREN (rhs_nodes[0])[0]);
+          for (unsigned j = 0; j < rhs_nodes.length (); ++j)
+            {
+              SLP_TREE_CHILDREN (perm)
+                .quick_push (SLP_TREE_CHILDREN (rhs_nodes[j])[0]);
+              for (unsigned k = 0;
+                   k < SLP_TREE_SCALAR_STMTS (rhs_nodes[j]).length (); ++k)
+                {
+                  /* ???  We should populate SLP_TREE_SCALAR_STMTS
+                     or SLP_TREE_SCALAR_OPS but then we might have
+                     a mix of both in our children.  */
+                  SLP_TREE_LANE_PERMUTATION (perm)
+                    .quick_push (std::make_pair (j, k));
+                }
+            }
+
+          /* Now we have a single permute node but we cannot code-generate
+             the case with more than two inputs.
+             Perform pairwise reduction, reducing the two inputs
+             with the least number of lanes to one and then repeat until
+             we end up with two inputs.  That scheme makes sure we end
+             up with permutes satisfying the restriction of requiring
+             at most two vector inputs to produce a single vector output.  */
+          while (SLP_TREE_CHILDREN (perm).length () > 2)
+            {
+              /* Pick the two nodes with the least number of lanes,
+                 prefer the earliest candidate and maintain ai < bi.  */
+              int ai = -1;
+              int bi = -1;
+              for (unsigned ci = 0;
+                   ci < SLP_TREE_CHILDREN (perm).length (); ++ci)
+                {
+                  if (ai == -1)
+                    ai = ci;
+                  else if (bi == -1)
+                    bi = ci;
+                  else if ((SLP_TREE_LANES (SLP_TREE_CHILDREN (perm)[ci])
+                            < SLP_TREE_LANES (SLP_TREE_CHILDREN (perm)[ai]))
+                           || (SLP_TREE_LANES (SLP_TREE_CHILDREN (perm)[ci])
+                               < SLP_TREE_LANES (SLP_TREE_CHILDREN (perm)[bi])))
+                    {
+                      if (SLP_TREE_LANES (SLP_TREE_CHILDREN (perm)[ai])
+                          <= SLP_TREE_LANES (SLP_TREE_CHILDREN (perm)[bi]))
+                        bi = ci;
+                      else
+                        {
+                          ai = bi;
+                          bi = ci;
+                        }
+                    }
+                }
+
+              /* Produce a merge of nodes ai and bi.  */
+              slp_tree a = SLP_TREE_CHILDREN (perm)[ai];
+              slp_tree b = SLP_TREE_CHILDREN (perm)[bi];
+              unsigned n = SLP_TREE_LANES (a) + SLP_TREE_LANES (b);
+              slp_tree permab = vect_create_new_slp_node (2, VEC_PERM_EXPR);
+              SLP_TREE_LANES (permab) = n;
+              SLP_TREE_LANE_PERMUTATION (permab).create (n);
+              SLP_TREE_VECTYPE (permab) = SLP_TREE_VECTYPE (perm); /* ??? */
+              /* ???  We should set this NULL but that's not expected.  */
+              SLP_TREE_REPRESENTATIVE (permab) = SLP_TREE_REPRESENTATIVE (perm);
+              SLP_TREE_CHILDREN (permab).quick_push (a);
+              for (unsigned k = 0; k < SLP_TREE_LANES (a); ++k)
+                SLP_TREE_LANE_PERMUTATION (permab)
+                  .quick_push (std::make_pair (0, k));
+              SLP_TREE_CHILDREN (permab).quick_push (b);
+              for (unsigned k = 0; k < SLP_TREE_LANES (b); ++k)
+                SLP_TREE_LANE_PERMUTATION (permab)
+                  .quick_push (std::make_pair (1, k));
+
+              /* Put the merged node into 'perm', in place of a  */
+              SLP_TREE_CHILDREN (perm)[ai] = permab;
+              /* Adjust the references to b in the permutation
+                 of perm and to the later children which we'll
+                 remove.  */
+              for (unsigned k = 0; k < SLP_TREE_LANES (perm); ++k)
+                {
+                  std::pair<unsigned, unsigned> &p
+                    = SLP_TREE_LANE_PERMUTATION (perm)[k];
+                  if (p.first == (unsigned) bi)
+                    {
+                      p.first = ai;
+                      p.second += SLP_TREE_LANES (a);
+                    }
+                  else if (p.first > (unsigned) bi)
+                    p.first--;
+                }
+              SLP_TREE_CHILDREN (perm).ordered_remove (bi);
+            }
+
+          /* Create a new SLP instance.  */
+          slp_instance new_instance = XNEW (class _slp_instance);
+          SLP_INSTANCE_TREE (new_instance) = node;
+          SLP_INSTANCE_UNROLLING_FACTOR (new_instance) = unrolling_factor;
+          SLP_INSTANCE_LOADS (new_instance) = vNULL;
+          SLP_INSTANCE_ROOT_STMTS (new_instance) = root_stmt_infos;
+          SLP_INSTANCE_REMAIN_DEFS (new_instance) = remain;
+          SLP_INSTANCE_KIND (new_instance) = kind;
+          new_instance->reduc_phis = NULL;
+          new_instance->cost_vec = vNULL;
+          new_instance->subgraph_entries = vNULL;
+
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "SLP size %u vs.
limit %u.\n", + tree_size, max_tree_size); - bool res = vect_analyze_slp_instance (vinfo, bst_map, stmt_info, - kind, max_tree_size, limit); - if (i + 1 < group_size) - res |= vect_analyze_slp_instance (vinfo, bst_map, - rest, kind, max_tree_size, limit); + vinfo->slp_instances.safe_push (new_instance); + + /* ??? We've replaced the old SLP_INSTANCE_GROUP_SIZE with + the number of scalar stmts in the root in a few places. + Verify that assumption holds. */ + gcc_assert (SLP_TREE_SCALAR_STMTS (SLP_INSTANCE_TREE (new_instance)) + .length () == group_size); - return res; + if (dump_enabled_p ()) + { + dump_printf_loc (MSG_NOTE, vect_location, + "Final SLP tree for instance %p:\n", + (void *) new_instance); + vect_print_slp_graph (MSG_NOTE, vect_location, + SLP_INSTANCE_TREE (new_instance)); + } + return true; } + else + /* Free the allocated memory. */ + scalar_stmts.release (); /* Even though the first vector did not all match, we might be able to SLP (some) of the remainder. FORNOW ignore this possibility. */ } + else + /* Free the allocated memory. */ + scalar_stmts.release (); /* Failed to SLP. 
*/ if (dump_enabled_p ()) From patchwork Tue May 21 12:48:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 1937404 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=hK3lLFJp; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=RNxrSWIE; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=hK3lLFJp; dkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=RNxrSWIE; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VkDl370b7z20dK for ; Tue, 21 May 2024 22:48:47 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0520F384B112 for ; Tue, 21 May 2024 12:48:46 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2a07:de40:b251:101:10:150:64:1]) by sourceware.org (Postfix) with ESMTPS id 2386C3858CD1 for ; Tue, 21 May 2024 12:48:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter 
v1.4.2 sourceware.org 2386C3858CD1 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2386C3858CD1 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a07:de40:b251:101:10:150:64:1 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1716295707; cv=none; b=p0vVir6UvDRd1xYTIWNqgYafMUjpGGVzZarw/5/DQ4iZNVoy1tZLQ60HLz33tPv9GRoiZr1XKQsQS2gTKO7k325OSkca8TTbe5T3EcKBnSMmvoB0DubYZvBttuC2Fg3Z5+gW38ix1Jq0rkBbKdfltJScT0pfPAFMmj/cPRt/Pzo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1716295707; c=relaxed/simple; bh=qOJ6Vle6zSoszBpvqMmOKqVXtnYKRt+/vd+jasYVFCA=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:MIME-Version:Message-Id; b=KJauWr4iICT/u4KXJH7s3kXF1EZFejIsG0Jfy7GVKdxNT3DVhshOhdAvw/ijzZ0ir7a+oH6q8ohbCRAz3liJ3gdoUl+pdZCSYVhpa/x7hFF2PhdLYVXKwRyLU2ee4YlDSssocNkqLyC1YJ2AfWXEhQTaF/iVi/nikGsONslZhaE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 191B5222F7; Tue, 21 May 2024 12:48:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1716295703; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lmecT0mEFJA9gqFqutDYmOmYBo4dilsV5EIUZpSIBrk=; b=hK3lLFJpC47z89EXHzYQvMluEZCZZ5E4UH5sCxpNbEshDBGzKnoouMx0ZRqXrNgZCqEHdx U0Uyzg9xxhDWp44PV7aGoEx/ehN7Lgk8ECKFFlJrKJv+Idf6hwRNyXMtVke0zSFgLEY6vu /Fw+G5LBMTud6ELuG9MOC3pFBTlSTrQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1716295703; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lmecT0mEFJA9gqFqutDYmOmYBo4dilsV5EIUZpSIBrk=; b=RNxrSWIEVOLYx+bOaW4qd6/+4d8BCaoc9DI6fiyv6+bFryR2FT5ucFR/uxjk8o3wp/auuR VPbMbMXaNU0bxeDA== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1716295703; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lmecT0mEFJA9gqFqutDYmOmYBo4dilsV5EIUZpSIBrk=; b=hK3lLFJpC47z89EXHzYQvMluEZCZZ5E4UH5sCxpNbEshDBGzKnoouMx0ZRqXrNgZCqEHdx U0Uyzg9xxhDWp44PV7aGoEx/ehN7Lgk8ECKFFlJrKJv+Idf6hwRNyXMtVke0zSFgLEY6vu /Fw+G5LBMTud6ELuG9MOC3pFBTlSTrQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1716295703; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=lmecT0mEFJA9gqFqutDYmOmYBo4dilsV5EIUZpSIBrk=; b=RNxrSWIEVOLYx+bOaW4qd6/+4d8BCaoc9DI6fiyv6+bFryR2FT5ucFR/uxjk8o3wp/auuR VPbMbMXaNU0bxeDA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id E5A0313A1E; Tue, 21 May 2024 12:48:22 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id ZDhnNhaYTGayNgAAD6G6ig (envelope-from ); Tue, 21 May 2024 12:48:22 +0000 Date: Tue, 21 May 2024 14:48:14 +0200 (CEST) From: Richard Biener To: gcc-patches@gcc.gnu.org cc: tamar.christina@arm.com, richard.sandiford@arm.com Subject: [PATCH 4/4] Testsuite updates MIME-Version: 1.0 Message-Id: <20240521124822.E5A0313A1E@imap1.dmz-prg2.suse.org> X-Spam-Score: -4.30 X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; 
BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_ALL(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; MISSING_XM_UA(0.00)[]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; FROM_EQ_ENVFROM(0.00)[]; TO_DN_NONE(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo] X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org The gcc.dg/vect/slp-12a.c case is interesting as we currently split the 8 store group into lanes 0-5 which we SLP with an unroll factor of two (on x86-64 with SSE) and the remaining two lanes are using interleaving vectorization with a final unroll factor of four. Thus we're using hybrid SLP within a single store group. After the change we discover the same 0-5 lane SLP part as well as two single-lane parts feeding the full store group. But that results in a load permutation that isn't supported (I have WIP patchs to rectify that). So we end up cancelling SLP and vectorizing the whole loop with interleaving which is IMO good and results in better code. This is similar for gcc.target/i386/pr52252-atom.c where interleaving generates much better code than hybrid SLP. I'm unsure how to update the testcase though. gcc.dg/vect/slp-21.c runs into similar situations. 
Note that when we discard an instance while analyzing SLP operations
we currently force the full loop to have no SLP because hybrid
detection is broken.  It's probably not worth fixing this at this
moment.

For gcc.dg/vect/pr97428.c we are not splitting the 16 store group
into two but instead merge the two 8 lane loads into one before doing
the store and thus have only a single SLP instance.  A similar
situation happens in gcc.dg/vect/slp-11c.c but the branches feeding
the single SLP store only have a single lane.  Likewise for
gcc.dg/vect/vect-complex-5.c and gcc.dg/vect/vect-gather-2.c.

gcc.dg/vect/slp-cond-1.c has an additional SLP vectorization
with an SLP store group of size two but two single-lane branches.

gcc.target/i386/pr98928.c ICEs in SLP permute optimization
because we don't expect a constant and internal branch to be
merged with a permute node in
vect_optimize_slp_pass::change_vec_perm_layout:4859 (the only
permutes merging two SLP nodes are two-operator nodes right now).
This still requires fixing.

The whole series has been bootstrapped and tested on
x86_64-unknown-linux-gnu with the gcc.target/i386/pr98928.c FAIL
unfixed.

Comments welcome (and hello ARM CI), RISC-V and other arch
testing appreciated.  Unless there are comments to the contrary
I plan to push patches 1 and 2 tomorrow.

Thanks,
Richard.

        * gcc.dg/vect/pr97428.c: Expect a single store SLP group.
        * gcc.dg/vect/slp-11c.c: Likewise.
        * gcc.dg/vect/vect-complex-5.c: Likewise.
        * gcc.dg/vect/slp-12a.c: Do not expect SLP.
        * gcc.dg/vect/slp-21.c: Likewise.
        * gcc.dg/vect/slp-cond-1.c: Expect one more SLP.
        * gcc.dg/vect/vect-gather-2.c: Expect SLP to be used.
        * gcc.target/i386/pr52252-atom.c: XFAIL test for palignr.
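
For readers who want to check the unroll factors quoted for
gcc.dg/vect/slp-12a.c, here is a small sketch of the arithmetic.  It is
not the vectorizer code: it assumes plain unsigned values instead of
poly_uint64, assumes 4 lanes per vector (32-bit elements in a 128-bit
SSE register), and the names unrolling_factor and combine_factors are
made up for illustration.  They stand in for calculate_unrolling_factor
and force_common_multiple as used in patch 3.

```cpp
#include <cassert>
#include <numeric>

// Simplified stand-in for calculate_unrolling_factor: the smallest number
// of scalar iterations so that group_size lanes fill whole vectors of
// nunits lanes, that is lcm (nunits, group_size) / group_size.
unsigned unrolling_factor(unsigned nunits, unsigned group_size) {
  unsigned multiple = std::lcm(nunits, group_size);
  assert(multiple % group_size == 0);  // the division below is exact
  return multiple / group_size;
}

// Simplified stand-in for force_common_multiple on plain integers: the
// instance has to satisfy the unroll factor of every RHS piece.
unsigned combine_factors(unsigned a, unsigned b) {
  return std::lcm(a, b);
}
```

Under those assumptions unrolling_factor (4, 6) gives 2 for the six-lane
piece and unrolling_factor (4, 1) gives 4 for a single-lane piece, so
combining them with combine_factors yields 4 for the whole store group.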
---
 gcc/testsuite/gcc.dg/vect/pr97428.c          |  2 +-
 gcc/testsuite/gcc.dg/vect/slp-11c.c          |  5 +++--
 gcc/testsuite/gcc.dg/vect/slp-12a.c          |  6 +++++-
 gcc/testsuite/gcc.dg/vect/slp-21.c           | 19 +++++--------------
 gcc/testsuite/gcc.dg/vect/slp-cond-1.c       |  2 +-
 gcc/testsuite/gcc.dg/vect/vect-complex-5.c   |  2 +-
 gcc/testsuite/gcc.dg/vect/vect-gather-2.c    |  1 -
 gcc/testsuite/gcc.target/i386/pr52252-atom.c |  3 ++-
 8 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr97428.c b/gcc/testsuite/gcc.dg/vect/pr97428.c
index 60dd984cfd3..3cc9976c00c 100644
--- a/gcc/testsuite/gcc.dg/vect/pr97428.c
+++ b/gcc/testsuite/gcc.dg/vect/pr97428.c
@@ -44,5 +44,5 @@ void foo_i2(dcmlx4_t dst[], const dcmlx_t src[], int n)
 /* { dg-final { scan-tree-dump "Detected interleaving store of size 16" "vect" } } */
 /* We're not able to peel & apply re-aligning to make accesses well-aligned for !vect_hw_misalign,
    but we could by peeling the stores for alignment and applying re-aligning loads.  */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { xfail { ! vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { ! vect_hw_misalign } } } } */
 /* { dg-final { scan-tree-dump-not "gap of 6 elements" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-11c.c b/gcc/testsuite/gcc.dg/vect/slp-11c.c
index 0f680cd4e60..169b0d10eec 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-11c.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-11c.c
@@ -13,7 +13,8 @@ main1 ()
   unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
   float out[N*8];
 
-  /* Different operations - not SLPable.  */
+  /* Different operations - we SLP the store and split the group to two
+     single-lane branches.  */
   for (i = 0; i < N*4; i++)
     {
       out[i*2] = ((float) in[i*2] * 2 + 6) ;
@@ -44,4 +45,4 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { { vect_uintfloat_cvt && vect_strided2 } && vect_int_mult } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { { vect_uintfloat_cvt && vect_strided2 } && vect_int_mult } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-12a.c b/gcc/testsuite/gcc.dg/vect/slp-12a.c
index 973de6ada21..2f98dc9da0b 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-12a.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-12a.c
@@ -40,6 +40,10 @@ main1 ()
       out[i*8 + 3] = b3 - 1;
       out[i*8 + 4] = b4 - 8;
       out[i*8 + 5] = b5 - 7;
+      /* Due to the use in the ia[i] store we keep the feeding expression
+         in the form ((in[i*8 + 6] + 11) * 3 - 3) while other expressions
+         got associated as for example (in[i*5 + 5] * 4 + 33).  That
+         causes SLP discovery to fail.  */
       out[i*8 + 6] = b6 - 3;
       out[i*8 + 7] = b7 - 7;
 
@@ -76,5 +80,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_strided8 && vect_int_mult } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { vect_strided8 && vect_int_mult } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { { vect_strided8 && {! vect_load_lanes } } && vect_int_mult } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { vect_strided8 && {! vect_load_lanes } } && vect_int_mult } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { ! { vect_strided8 && vect_int_mult } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-21.c b/gcc/testsuite/gcc.dg/vect/slp-21.c
index 58751688414..dc153a53b47 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-21.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-21.c
@@ -12,6 +12,7 @@ main1 ()
   unsigned short out[N*8], out2[N*8], b0, b1, b2, b3, b4, a0, a1, a2, a3, b5;
   unsigned short in[N*8];
 
+#pragma GCC novector
   for (i = 0; i < N*8; i++)
     {
       in[i] = i;
@@ -202,18 +203,8 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target { vect_strided4 || vect_extract_even_odd } } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! { vect_strided4 || vect_extract_even_odd } } } } } */
-/* Some targets can vectorize the second of the three main loops using
-   hybrid SLP.  For 128-bit vectors, the required 4->3 permutations are:
-
-   { 0, 1, 2, 4, 5, 6, 8, 9 }
-   { 2, 4, 5, 6, 8, 9, 10, 12 }
-   { 5, 6, 8, 9, 10, 12, 13, 14 }
-
-   Not all vect_perm targets support that, and it's a bit too specific to have
-   its own effective-target selector, so we just test targets directly.  */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target { powerpc64*-*-* s390*-*-* loongarch*-*-* } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { vect_strided4 && { ! { powerpc64*-*-* s390*-*-* loongarch*-*-* } } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { ! { vect_strided4 } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { vect_strided4 || vect_extract_even_odd } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { vect_strided4 || vect_extract_even_odd } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 6 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-cond-1.c b/gcc/testsuite/gcc.dg/vect/slp-cond-1.c
index 450c7141c96..16ab0cc7605 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-cond-1.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-cond-1.c
@@ -125,4 +125,4 @@ main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-complex-5.c b/gcc/testsuite/gcc.dg/vect/vect-complex-5.c
index addcf60438c..ac562dc475c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-complex-5.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-complex-5.c
@@ -41,4 +41,4 @@ main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target vect_load_lanes } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { ! vect_load_lanes } xfail { ! vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_load_lanes } xfail { ! vect_hw_misalign } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-2.c b/gcc/testsuite/gcc.dg/vect/vect-gather-2.c
index 4c23b808333..10e64e64d47 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-gather-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-gather-2.c
@@ -36,6 +36,5 @@ f3 (int *restrict y, int *restrict x, int *restrict indices)
     }
 }
 
-/* { dg-final { scan-tree-dump-not "Loop contains only SLP stmts" vect } } */
 /* { dg-final { scan-tree-dump "different gather base" vect { target { ! vect_gather_load_ifn } } } } */
 /* { dg-final { scan-tree-dump "different gather scale" vect { target { ! vect_gather_load_ifn } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr52252-atom.c b/gcc/testsuite/gcc.target/i386/pr52252-atom.c
index 11f94411575..02736d56d31 100644
--- a/gcc/testsuite/gcc.target/i386/pr52252-atom.c
+++ b/gcc/testsuite/gcc.target/i386/pr52252-atom.c
@@ -25,4 +25,5 @@ matrix_mul (byte *in, byte *out, int size)
     }
 }
 
-/* { dg-final { scan-assembler "palignr" } } */
+/* We are no longer using hybrid SLP.  */
+/* { dg-final { scan-assembler "palignr" { xfail *-*-* } } } */
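
As a reading aid for the "two single-lane branches" cases expected by
the tests above, the following is a stand-alone sketch of the pairwise
reduction that patch 3 performs on the children of the merging permute
node.  It models each child only by its lane count and leaves out the
lane permutation remapping entirely; merge_pairwise and the plain
std::vector representation are assumptions made for illustration, not
vectorizer API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Each entry of `lanes` is the lane count of one child of the permute.
// Children are merged pairwise, in place, until at most two remain.
// Returns the (ai, bi) child indices merged at each step.
std::vector<std::pair<int, int>> merge_pairwise(std::vector<unsigned> &lanes) {
  std::vector<std::pair<int, int>> merges;
  while (lanes.size() > 2) {
    // Same selection loop as in the patch: look for the two children with
    // the fewest lanes, prefer earlier candidates and keep ai < bi.
    int ai = -1, bi = -1;
    for (std::size_t ci = 0; ci < lanes.size(); ++ci) {
      if (ai == -1)
        ai = static_cast<int>(ci);
      else if (bi == -1)
        bi = static_cast<int>(ci);
      else if (lanes[ci] < lanes[ai] || lanes[ci] < lanes[bi]) {
        if (lanes[ai] <= lanes[bi])
          bi = static_cast<int>(ci);
        else {
          ai = bi;
          bi = static_cast<int>(ci);
        }
      }
    }
    assert(ai >= 0 && ai < bi);
    merges.emplace_back(ai, bi);
    // The merged node takes the place of a, b is removed.
    lanes[ai] += lanes[bi];
    lanes.erase(lanes.begin() + bi);
  }
  return merges;
}
```

For the slp-12a.c shape described in the cover text (one six-lane piece
and two single-lane pieces, so lanes = {6, 1, 1}) this merges the two
single-lane children first and leaves a two-input permute over children
with 6 and 2 lanes.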