From patchwork Fri Sep 20 11:23:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1987887
Date: Fri, 20 Sep 2024 13:23:59 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fall back to elementwise access for too spaced SLP single element interleaving
Message-Id: <20240920112404.6D47013AE1@imap1.dmz-prg2.suse.org>
List-Id: Gcc-patches mailing list

gcc.dg/vect/vect-pr111779.c is a case where non-SLP manages to vectorize
using VMAT_ELEMENTWISE but SLP currently refuses because doing a regular
access with permutes would cause excess vector loads with at most one
element used.  The following makes us fall back to elementwise accesses
for that, too.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

	* tree-vect-stmts.cc (get_group_load_store_type): Fall back
	to VMAT_ELEMENTWISE when single element interleaving of
	a too large group.
	(vectorizable_load): Do not try to verify load permutations
	when using VMAT_ELEMENTWISE for single-lane SLP and fix code
	generation for this case.
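For illustration only, a loop of the kind described above might look like the following sketch.  It is not the contents of vect-pr111779.c; the struct layout, the function names and the 4-lane vector width are all assumptions.  With 16 ints per record and one of them read, the group size exceeds the number of vector subparts, so a regular vector load would fetch a full vector to use a single element.  The second function models elementwise access by hand: each lane is filled by its own scalar load.

```c
#include <stddef.h>

/* Hypothetical record: the loops below read only `key`, so each group of
   16 ints contributes a single element (single-element interleaving).  */
struct record
{
  int key;
  int pad[15];
};

/* Scalar loop of the shape discussed above.  */
int
sum_keys (const struct record *r, size_t n)
{
  int sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += r[i].key;
  return sum;
}

/* Hand-written model of elementwise access with an assumed 4-lane vector:
   every lane is filled by its own scalar load instead of by a full-width
   vector load that would mostly fetch unused padding.  */
int
sum_keys_elementwise (const struct record *r, size_t n)
{
  int lanes[4] = { 0, 0, 0, 0 };
  size_t i = 0;
  for (; i + 4 <= n; i += 4)
    for (size_t l = 0; l < 4; ++l)
      lanes[l] += r[i + l].key;	/* One scalar load per lane.  */
  int sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
  for (; i < n; ++i)		/* Scalar epilogue.  */
    sum += r[i].key;
  return sum;
}
```

Both functions compute the same sum; the second only makes the per-lane loads explicit.  Whether the vectorizer actually picks VMAT_ELEMENTWISE for such a loop depends on the target and cost model.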
---
 gcc/tree-vect-stmts.cc | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 33cdccae784..45003f762dd 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2190,11 +2190,12 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
 		  && single_element_p
 		  && maybe_gt (group_size, TYPE_VECTOR_SUBPARTS (vectype)))
 		{
+		  *memory_access_type = VMAT_ELEMENTWISE;
 		  if (dump_enabled_p ())
 		    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 				     "single-element interleaving not supported "
-				     "for not adjacent vector loads\n");
-		  return false;
+				     "for not adjacent vector loads, using "
+				     "elementwise access\n");
 		}
 	    }
 	}
@@ -10039,7 +10040,23 @@ vectorizable_load (vec_info *vinfo,
   else
     group_size = 1;
 
-  if (slp && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ())
+  vect_memory_access_type memory_access_type;
+  enum dr_alignment_support alignment_support_scheme;
+  int misalignment;
+  poly_int64 poffset;
+  internal_fn lanes_ifn;
+  if (!get_load_store_type (vinfo, stmt_info, vectype, slp_node, mask, VLS_LOAD,
+			    ncopies, &memory_access_type, &poffset,
+			    &alignment_support_scheme, &misalignment, &gs_info,
+			    &lanes_ifn))
+    return false;
+
+  /* ???  The following checks should really be part of
+     get_group_load_store_type.  */
+  if (slp
+      && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()
+      && !(memory_access_type == VMAT_ELEMENTWISE
+	   && SLP_TREE_LANES (slp_node) == 1))
     {
       slp_perm = true;
 
@@ -10079,17 +10096,6 @@ vectorizable_load (vec_info *vinfo,
 	}
     }
 
-  vect_memory_access_type memory_access_type;
-  enum dr_alignment_support alignment_support_scheme;
-  int misalignment;
-  poly_int64 poffset;
-  internal_fn lanes_ifn;
-  if (!get_load_store_type (vinfo, stmt_info, vectype, slp_node, mask, VLS_LOAD,
-			    ncopies, &memory_access_type, &poffset,
-			    &alignment_support_scheme, &misalignment, &gs_info,
-			    &lanes_ifn))
-    return false;
-
   if (slp_node
       && slp_node->ldst_lanes
       && memory_access_type != VMAT_LOAD_STORE_LANES)
@@ -10292,7 +10298,8 @@ vectorizable_load (vec_info *vinfo,
       first_dr_info = dr_info;
     }
 
-  if (slp && grouped_load)
+  if (slp && grouped_load
+      && memory_access_type == VMAT_STRIDED_SLP)
     {
       group_size = DR_GROUP_SIZE (first_stmt_info);
       ref_type = get_group_alias_ptr_type (first_stmt_info);