From patchwork Fri Jun 21 12:00:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 1950739
Date: Fri, 21 Jun 2024 14:00:22 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: richard.sandiford@arm.com
Subject: [PATCH] tree-optimization/115528 - fix vect alignment analysis for outer loop vect
Message-Id: <20240621120023.0B35D13AAA@imap1.dmz-prg2.suse.org>
List-Id: Gcc-patches mailing list

For outer loop vectorization of a data reference in the inner loop we
have to look at both steps to see if they preserve alignment.

What is special about this testcase is that the outer loop step is one
element while the inner loop step is four, and that we now use SLP with
a vectorization factor of one.
But the issue looks latent to me.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

	PR tree-optimization/115528
	* tree-vect-data-refs.cc (vect_compute_data_ref_alignment):
	Make sure to look at both the inner and outer loop step
	behavior.

	* gfortran.dg/vect/pr115528.f: New testcase.
---
 gcc/testsuite/gfortran.dg/vect/pr115528.f | 27 +++++++++++
 gcc/tree-vect-data-refs.cc                | 57 ++++++++++++-----------
 2 files changed, 56 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/vect/pr115528.f

diff --git a/gcc/testsuite/gfortran.dg/vect/pr115528.f b/gcc/testsuite/gfortran.dg/vect/pr115528.f
new file mode 100644
index 00000000000..764a4b92b3e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/vect/pr115528.f
@@ -0,0 +1,27 @@
+! { dg-additional-options "-fno-inline" }
+
+      subroutine init(COEF1,FORM1,AA)
+      double precision COEF1,X
+      double complex FORM1
+      double precision AA(4,4)
+      COEF1=0
+      FORM1=0
+      AA=0
+      end
+      subroutine curr(HADCUR)
+      double precision COEF1
+      double complex HADCUR(4),FORM1
+      double precision AA(4,4)
+      call init(COEF1,FORM1,AA)
+      do i = 1,4
+         do j = 1,4
+            HADCUR(I)=
+     $         HADCUR(I)+CMPLX(COEF1)*FORM1*AA(I,J)
+         end do
+      end do
+      end
+      program test
+      double complex HADCUR(4)
+      hadcur=0
+      call curr(hadcur)
+      end
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index ae237407672..959e127c385 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -1356,42 +1356,43 @@ vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info,
       step_preserves_misalignment_p = true;
     }
 
-  /* In case the dataref is in an inner-loop of the loop that is being
-     vectorized (LOOP), we use the base and misalignment information
-     relative to the outer-loop (LOOP).  This is ok only if the misalignment
-     stays the same throughout the execution of the inner-loop, which is why
-     we have to check that the stride of the dataref in the inner-loop evenly
-     divides by the vector alignment.  */
-  else if (nested_in_vect_loop_p (loop, stmt_info))
-    {
-      step_preserves_misalignment_p
-        = (DR_STEP_ALIGNMENT (dr_info->dr) % vect_align_c) == 0;
-
-      if (dump_enabled_p ())
-        {
-          if (step_preserves_misalignment_p)
-            dump_printf_loc (MSG_NOTE, vect_location,
-                             "inner step divides the vector alignment.\n");
-          else
-            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                             "inner step doesn't divide the vector"
-                             " alignment.\n");
-        }
-    }
-
-  /* Similarly we can only use base and misalignment information relative to
-     an innermost loop if the misalignment stays the same throughout the
-     execution of the loop.  As above, this is the case if the stride of
-     the dataref evenly divides by the alignment.  */
   else
     {
+      /* We can only use base and misalignment information relative to
+         an innermost loop if the misalignment stays the same throughout the
+         execution of the loop.  As above, this is the case if the stride of
+         the dataref evenly divides by the alignment.  */
       poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
       step_preserves_misalignment_p
-        = multiple_p (DR_STEP_ALIGNMENT (dr_info->dr) * vf, vect_align_c);
+        = multiple_p (drb->step_alignment * vf, vect_align_c);
 
       if (!step_preserves_misalignment_p && dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                          "step doesn't divide the vector alignment.\n");
+
+      /* In case the dataref is in an inner-loop of the loop that is being
+         vectorized (LOOP), we use the base and misalignment information
+         relative to the outer-loop (LOOP).  This is ok only if the
+         misalignment stays the same throughout the execution of the
+         inner-loop, which is why we have to check that the stride of the
+         dataref in the inner-loop evenly divides by the vector alignment.  */
+      if (step_preserves_misalignment_p
+          && nested_in_vect_loop_p (loop, stmt_info))
+        {
+          step_preserves_misalignment_p
+            = (DR_STEP_ALIGNMENT (dr_info->dr) % vect_align_c) == 0;
+
+          if (dump_enabled_p ())
+            {
+              if (step_preserves_misalignment_p)
+                dump_printf_loc (MSG_NOTE, vect_location,
+                                 "inner step divides the vector alignment.\n");
+              else
+                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                 "inner step doesn't divide the vector"
+                                 " alignment.\n");
+            }
+        }
     }
 
   unsigned int base_alignment = drb->base_alignment;
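
For illustration only, the change in the check can be modelled outside of GCC with plain integers. This is a minimal sketch, not the vectorizer code: the struct, its field names and both function names are hypothetical, and the real code operates on poly_uint64 values via multiple_p rather than on the % operator. The byte values in the note below the code (8-byte outer step for one double precision element, 32-byte inner step for four elements, 16-byte vector alignment) are assumptions chosen to mirror the testcase, not numbers taken from a dump.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the step information kept per data reference.
// The struct and its field names are hypothetical.
struct StepAlignments
{
  std::uint64_t outer;  // step alignment w.r.t. the vectorized loop, in bytes
  std::uint64_t inner;  // step alignment w.r.t. the inner loop, in bytes
  bool nested;          // true if the reference sits in an inner loop
};

// Model of the check before the patch: a nested reference only had its
// inner step tested, so the step of the vectorized loop was ignored.
inline bool
preserves_misalignment_old (const StepAlignments &s, std::uint64_t vf,
                            std::uint64_t vect_align)
{
  if (s.nested)
    return s.inner % vect_align == 0;
  return (s.outer * vf) % vect_align == 0;
}

// Model of the check after the patch: the step of the vectorized loop is
// always tested, and a nested reference must also pass the inner-step test.
inline bool
preserves_misalignment_new (const StepAlignments &s, std::uint64_t vf,
                            std::uint64_t vect_align)
{
  bool ok = (s.outer * vf) % vect_align == 0;
  if (ok && s.nested)
    ok = s.inner % vect_align == 0;
  return ok;
}
```

Under those assumed values (outer = 8, inner = 32, nested = true, vf = 1, vect_align = 16) the old predicate accepts the reference because only the inner step is examined, while the new predicate rejects it because the outer step of 8 bytes is not a multiple of the 16-byte alignment.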