From patchwork Thu Nov 7 12:09:01 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 1191104
To: gcc-patches
Cc: marxin@gcc.gnu.org, Richard Biener
From: "Andre Vieira (lists)"
Subject: [PATCH][vect] PR 92351: When peeling for alignment make alignment of epilogues unknown
Message-ID: <4c528df2-e0bd-e022-54f6-85212b47c519@arm.com>
Date: Thu, 7 Nov 2019 12:09:01 +0000

Hi,

PR92351 reports a bug in which a wrongly aligned load is generated for the
epilogue of a main loop that we peeled for alignment. There is no way to
guarantee that the epilogue's data accesses are aligned when the main loop is
peeled for alignment, because we cannot tell at compile time whether the main
loop will be entered.

I also had to split vect-peel-2.c, because it scans the dump for the number of
unaligned accesses that were vectorized. With this change that number depends
on whether we vectorize the epilogue, which will also contain unaligned
accesses.
Since not every target is able to vectorize the epilogue, I decided to disable
epilogue vectorization in the version that scans the dumps and to add a
version that attempts epilogue vectorization but does not scan the dumps.

Bootstrapped and regression tested on x86_64 and aarch64.

Is this OK for trunk?

In the future I would like to look at allowing misalignment analysis for cases
in which both the number of iterations and the number of iterations to peel
are known at compile time. In that case we should never be skipping the main
loop, since we should not be generating a main loop that would be skipped.

gcc/ChangeLog:
2019-11-07  Andre Vieira

	* tree-vect-data-refs.c (vect_compute_data_ref_alignment): When we are
	peeling the main loop for alignment, make sure to set the misalignment
	of the epilogue's data references to DR_MISALIGNMENT_UNKNOWN.

gcc/testsuite/ChangeLog:
2019-11-07  Andre Vieira

	* gcc.dg/vect/vect-peel-2.c: Disable epilogue vectorization and split
	the source of this test to...
	* gcc.dg/vect/vect-peel-2-src.c: ... This.
	* gcc.dg/vect/vect-peel-2-epilogues.c: New test.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-peel-2-epilogues.c b/gcc/testsuite/gcc.dg/vect/vect-peel-2-epilogues.c
new file mode 100644
index 0000000000000000000000000000000000000000..c06fa442fafa36855d285d2336e0d69ee9bffe03
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-peel-2-epilogues.c
@@ -0,0 +1,3 @@
+/* { dg-require-effective-target vect_int } */
+
+#include "vect-peel-2-src.c"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-peel-2-src.c b/gcc/testsuite/gcc.dg/vect/vect-peel-2-src.c
new file mode 100644
index 0000000000000000000000000000000000000000..f6fc134c8705567a628dcd62c053ad6f2ca2904d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-peel-2-src.c
@@ -0,0 +1,48 @@
+#include
+#include "tree-vect.h"
+
+#define N 128
+
+/* unaligned store. */
+
+int ib[N+7];
+
+__attribute__ ((noinline))
+int main1 ()
+{
+  int i;
+  int ia[N+1];
+
+  /* The store is aligned and the loads are misaligned with the same
+     misalignment. Cost model is disabled. If misaligned stores are supported,
+     we peel according to the loads to align them. */
+  for (i = 0; i <= N; i++)
+    {
+      ia[i] = ib[i+2] + ib[i+6];
+    }
+
+  /* check results: */
+  for (i = 1; i <= N; i++)
+    {
+      if (ia[i] != ib[i+2] + ib[i+6])
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i <= N+6; i++)
+    {
+      asm volatile ("" : "+r" (i));
+      ib[i] = i;
+    }
+
+  return main1 ();
+}
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-peel-2.c b/gcc/testsuite/gcc.dg/vect/vect-peel-2.c
index b6061c3b8553b67ecdf56367b2f4128d7c0bd342..65e70bd44170c63ce3bc25c6a7ecf426ddcd39b1 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-peel-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-peel-2.c
@@ -1,52 +1,8 @@
 /* { dg-require-effective-target vect_int } */
+/* Disabling epilogues until we find a better way to deal with scans. */
+/* { dg-additional-options "--param vect-epilogues-nomask=0" } */
 
-#include
-#include "tree-vect.h"
-
-#define N 128
-
-/* unaligned store. */
-
-int ib[N+7];
-
-__attribute__ ((noinline))
-int main1 ()
-{
-  int i;
-  int ia[N+1];
-
-  /* The store is aligned and the loads are misaligned with the same
-     misalignment. Cost model is disabled. If misaligned stores are supported,
-     we peel according to the loads to align them. */
-  for (i = 0; i <= N; i++)
-    {
-      ia[i] = ib[i+2] + ib[i+6];
-    }
-
-  /* check results: */
-  for (i = 1; i <= N; i++)
-    {
-      if (ia[i] != ib[i+2] + ib[i+6])
-        abort ();
-    }
-
-  return 0;
-}
-
-int main (void)
-{
-  int i;
-
-  check_vect ();
-
-  for (i = 0; i <= N+6; i++)
-    {
-      asm volatile ("" : "+r" (i));
-      ib[i] = i;
-    }
-
-  return main1 ();
-}
+#include "vect-peel-2-src.c"
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { { vect_element_align } && { vect_aligned_arrays } } } } } */
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 9dd18d265361ba5635de685f9d898e355999bf4c..fc75ac4f112486934a007c90cc6b646b6115857b 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -938,6 +938,18 @@ vect_compute_data_ref_alignment (dr_vec_info *dr_info)
     = exact_div (vect_calculate_target_alignment (dr_info), BITS_PER_UNIT);
   DR_TARGET_ALIGNMENT (dr_info) = vector_alignment;
 
+  /* If the main loop has peeled for alignment we have no way of knowing
+     whether the data accesses in the epilogues are aligned. We can't at
+     compile time answer the question whether we have entered the main loop or
+     not. Fixes PR 92351. */
+  if (loop_vinfo)
+    {
+      loop_vec_info orig_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo);
+      if (orig_loop_vinfo
+          && LOOP_VINFO_PEELING_FOR_ALIGNMENT (orig_loop_vinfo) != 0)
+        return;
+    }
+
   unsigned HOST_WIDE_INT vect_align_c;
   if (!vector_alignment.is_constant (&vect_align_c))
     return;