From patchwork Fri Jul 19 21:52:19 2024
X-Patchwork-Submitter: Thomas Schwinge
X-Patchwork-Id: 1962649
From: Thomas Schwinge
To: gcc-patches@gcc.gnu.org
Cc: Andrew Stubbs
Subject: [OG14] Revert "[og10] vect: Add target hook to prefer gather/scatter instructions" (was: [PATCH] [og10] vect: Add target hook to prefer gather/scatter instructions)
In-Reply-To: <20210113234842.71133-2-julian@codesourcery.com>
References: <20210113234842.71133-2-julian@codesourcery.com>
Date: Fri, 19 Jul 2024 23:52:19 +0200
Message-ID: <87o76t2goc.fsf@euler.schwinge.ddns.net>

Hi!

On 2021-01-13T15:48:42-0800, Julian Brown wrote:
> For AMD GCN, the instructions available for loading/storing vectors are
> always scatter/gather operations (i.e. there are separate addresses for
> each vector lane), so the current heuristic to avoid gather/scatter
> operations with too many elements in get_group_load_store_type is
> counterproductive. Avoiding such operations in that function can
> subsequently lead to a missed vectorization opportunity whereby later
> analyses in the vectorizer try to use a very wide array type which is
> not available on this target, and thus it bails out.
>
> The attached patch adds a target hook to override the "single_element_p"
> heuristic in the function as a target hook, and activates it for GCN. This
> allows much better code to be generated for affected loops.
>
> Tested with offloading to AMD GCN. I will apply to the og10 branch
> shortly.

Testing current OG14 commit 735bbbfc6eaf58522c3ebb0946b66f33958ea134 for
'--target=amdgcn-amdhsa' (I've tested '-march=gfx908', '-march=gfx1100'),
this change has been identified as causing ~100 instances of execution
test PASS -> FAIL, that is, wrong-code generation.

It's possible that we've had the same misbehavior on OG13 and earlier as
well, but nobody ever tested that. And/or that, at some point in time,
the original patch fell out of sync and wasn't updated for relevant
upstream vectorizer changes. Until someone gets to analyze that (and
upstream these changes), we shall revert this commit on OG14.

Pushed to devel/omp/gcc-14 branch commit
8678fc697046fba1014f1db6321ee670538b0881
'Revert "[og10] vect: Add target hook to prefer gather/scatter instructions"',
see attached.

List of GCC 14.1 vs OG14 regressions (...
avoided by this revert commit):

'-march=gfx1100' only:

    PASS: g++.dg/vect/pr97255.cc -std=c++14 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/vect/pr97255.cc -std=c++14 execution test
    PASS: g++.dg/vect/pr97255.cc -std=c++17 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/vect/pr97255.cc -std=c++17 execution test
    PASS: g++.dg/vect/pr97255.cc -std=c++20 (test for excess errors)
    [-PASS:-]{+FAIL:+} g++.dg/vect/pr97255.cc -std=c++20 execution test
    UNSUPPORTED: g++.dg/vect/pr97255.cc -std=c++98
GCN Kernel Aborted

    @@ -101950,11 +101950,11 @@ PASS: gcc.dg/torture/pr52028.c -O0 execution test
    PASS: gcc.dg/torture/pr52028.c -O1 (test for excess errors)
    PASS: gcc.dg/torture/pr52028.c -O1 execution test
    PASS: gcc.dg/torture/pr52028.c -O2 (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr52028.c -O2 execution test
    PASS: gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr52028.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
    PASS: gcc.dg/torture/pr52028.c -O3 -g (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr52028.c -O3 -g execution test
    PASS: gcc.dg/torture/pr52028.c -Os (test for excess errors)
    PASS: gcc.dg/torture/pr52028.c -Os execution test
GCN Kernel Aborted

    @@ -102160,11 +102160,11 @@ PASS: gcc.dg/torture/pr53366-1.c -O0 execution test
    PASS: gcc.dg/torture/pr53366-1.c -O1 (test for excess errors)
    PASS: gcc.dg/torture/pr53366-1.c -O1 execution test
    PASS: gcc.dg/torture/pr53366-1.c -O2 (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr53366-1.c -O2 execution test
    PASS: gcc.dg/torture/pr53366-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr53366-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
    PASS: gcc.dg/torture/pr53366-1.c -O3 -g (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr53366-1.c -O3 -g execution test
    PASS: gcc.dg/torture/pr53366-1.c -Os (test for excess errors)
    PASS: gcc.dg/torture/pr53366-1.c -Os execution test
GCN Kernel Aborted

    PASS: gcc.dg/torture/pr93868.c -O0 (test for excess errors)
    PASS: gcc.dg/torture/pr93868.c -O0 execution test
    PASS: gcc.dg/torture/pr93868.c -O1 (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr93868.c -O1 execution test
    PASS: gcc.dg/torture/pr93868.c -O2 (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr93868.c -O2 execution test
    PASS: gcc.dg/torture/pr93868.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr93868.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
    PASS: gcc.dg/torture/pr93868.c -O3 -g (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/torture/pr93868.c -O3 -g execution test
    PASS: gcc.dg/torture/pr93868.c -Os (test for excess errors)
    PASS: gcc.dg/torture/pr93868.c -Os execution test
GCN Kernel Aborted

    PASS: gcc.target/gcn/complex.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.target/gcn/complex.c execution test
GCN Kernel Aborted

'gcc.dg/vect/': generally, both '-march=gfx908', '-march=gfx1100':

    PASS: gcc.dg/vect/pr45752.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr45752.c execution test
    PASS: gcc.dg/vect/pr45752.c scan-tree-dump-times vect "gaps requires scalar epilogue loop" 0
    PASS: gcc.dg/vect/pr45752.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr45752.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2

    PASS: gcc.dg/vect/pr66636.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr66636.c execution test

    PASS: gcc.dg/vect/pr78558.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr78558.c execution test

    PASS: gcc.dg/vect/slp-12a.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-12a.c execution test
    PASS: gcc.dg/vect/slp-12a.c scan-tree-dump-times vect "vectorized 1 loops" 1
    PASS: gcc.dg/vect/slp-12a.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
'-march=gfx908' only.

    PASS: gcc.dg/vect/slp-12a.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-12a.c execution test
    PASS: gcc.dg/vect/slp-12a.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-12a.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
'-march=gfx1100' only.

    PASS: gcc.dg/vect/slp-19c.c (test for excess errors)
    PASS: gcc.dg/vect/slp-19c.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-21.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-21.c execution test
    PASS: gcc.dg/vect/slp-21.c scan-tree-dump-times vect "vectorized 4 loops" 1
    FAIL: gcc.dg/vect/slp-21.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2

    PASS: gcc.dg/vect/slp-perm-12.c (test for excess errors)
    PASS: gcc.dg/vect/slp-perm-12.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-12.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
'-march=gfx908' only.

    PASS: gcc.dg/vect/slp-perm-12.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-12.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-12.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
'-march=gfx1100' only.

    PASS: gcc.dg/vect/slp-perm-4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-4.c execution test
    PASS: gcc.dg/vect/slp-perm-4.c scan-tree-dump-times vect "gaps requires scalar epilogue loop" 0
    PASS: gcc.dg/vect/slp-perm-4.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-4.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/vect-avg-16.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-avg-16.c execution test
    PASS: gcc.dg/vect/vect-avg-16.c scan-tree-dump vect "vect_recog_average_pattern: detected"

    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap7-big-array.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u8-i8-gap7.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap7.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c execution test
    PASS: gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c scan-tree-dump-times vect "vectorized 2 loops" 1

    PASS: gcc.dg/vect/vect-strided-u8-i8-gap4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap4.c execution test
    PASS: gcc.dg/vect/vect-strided-u8-i8-gap4.c scan-tree-dump-times vect "vectorized 2 loops" 1

    PASS: gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c execution test
    PASS: gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-u8-i8-gap7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap7.c execution test
    PASS: gcc.dg/vect/vect-strided-u8-i8-gap7.c scan-tree-dump-times vect "vectorized 1 loops" 1

'gcc.dg/vect/': not '-march=gfx908'; '-march=gfx1100' only:

    PASS: gcc.dg/vect/no-scevccp-outer-18.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/no-scevccp-outer-18.c execution test
    PASS: gcc.dg/vect/no-scevccp-outer-18.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1

    PASS: gcc.dg/vect/pr101445.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr101445.c execution test

    PASS: gcc.dg/vect/pr37027.c (test for excess errors)
    PASS: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0
    PASS: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/pr37539.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr37539.c execution test
    PASS: gcc.dg/vect/pr37539.c scan-tree-dump-times vect "vectorized 1 loops" 2

    PASS: gcc.dg/vect/pr56826.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr56826.c execution test

    PASS: gcc.dg/vect/pr59354.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr59354.c execution test
    PASS: gcc.dg/vect/pr59354.c scan-tree-dump vect "vectorized 1 loop"

    PASS: gcc.dg/vect/pr61680.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr61680.c execution test

    PASS: gcc.dg/vect/pr64252.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr64252.c execution test

    PASS: gcc.dg/vect/pr66253.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr66253.c execution test
    PASS: gcc.dg/vect/pr66253.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/pr67790.c (test for excess errors)
    PASS: gcc.dg/vect/pr67790.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr67790.c scan-tree-dump vect "vectorizing stmts using SLP"
    PASS: gcc.dg/vect/pr67790.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0

    PASS: gcc.dg/vect/pr68445.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr68445.c scan-tree-dump vect "vectorizing stmts using SLP"

    PASS: gcc.dg/vect/pr71259.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr71259.c execution test

    PASS: gcc.dg/vect/pr81410.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr81410.c execution test
    PASS: gcc.dg/vect/pr81410.c scan-tree-dump vect "vectorized 1 loops"

    PASS: gcc.dg/vect/pr82108.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr82108.c execution test
    PASS: gcc.dg/vect/pr82108.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/pr87288-1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr87288-1.c execution test
    PASS: gcc.dg/vect/pr87288-1.c scan-tree-dump-times vect "LOOP VECTORIZED" 1

    PASS: gcc.dg/vect/pr87288-2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr87288-2.c execution test
    PASS: gcc.dg/vect/pr87288-2.c scan-tree-dump vect "LOOP VECTORIZED"

    PASS: gcc.dg/vect/pr87288-3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr87288-3.c execution test
    PASS: gcc.dg/vect/pr87288-3.c scan-tree-dump vect "LOOP VECTORIZED"

    PASS: gcc.dg/vect/pr92420.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr92420.c execution test

    PASS: gcc.dg/vect/pr96783-2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr96783-2.c execution test

    PASS: gcc.dg/vect/pr97832-1.c (test for excess errors)
    PASS: gcc.dg/vect/pr97832-1.c scan-tree-dump vect "Loop contains only SLP stmts"
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr97832-1.c scan-tree-dump vect "vectorizing stmts using SLP"

    PASS: gcc.dg/vect/pr97832-2.c (test for excess errors)
    PASS: gcc.dg/vect/pr97832-2.c scan-tree-dump vect "Loop contains only SLP stmts"
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr97832-2.c scan-tree-dump vect "vectorizing stmts using SLP"

    PASS: gcc.dg/vect/pr97832-3.c (test for excess errors)
    PASS: gcc.dg/vect/pr97832-3.c scan-tree-dump vect "Loop contains only SLP stmts"
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr97832-3.c scan-tree-dump vect "vectorizing stmts using SLP"

    PASS: gcc.dg/vect/pr97832-4.c (test for excess errors)
    PASS: gcc.dg/vect/pr97832-4.c scan-tree-dump vect "Loop contains only SLP stmts"
    [-PASS:-]{+FAIL:+} gcc.dg/vect/pr97832-4.c scan-tree-dump vect "vectorizing stmts using SLP"

    PASS: gcc.dg/vect/slp-11a.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-11a.c execution test
    PASS: gcc.dg/vect/slp-11a.c scan-tree-dump-times vect "vectorized 1 loops" 1
    PASS: gcc.dg/vect/slp-11a.c scan-tree-dump-times vect "vectorizing stmts using SLP" 0

    PASS: gcc.dg/vect/slp-11b.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-11b.c execution test
    PASS: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-11c.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-11c.c execution test
    PASS: gcc.dg/vect/slp-11c.c scan-tree-dump-times vect "vectorized 1 loops" 1
    PASS: gcc.dg/vect/slp-11c.c scan-tree-dump-times vect "vectorizing stmts using SLP" 0

    PASS: gcc.dg/vect/slp-23.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-23.c execution test
    PASS: gcc.dg/vect/slp-23.c scan-tree-dump-times vect "vectorized 2 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-23.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2

    PASS: gcc.dg/vect/slp-42.c (test for excess errors)
    PASS: gcc.dg/vect/slp-42.c scan-tree-dump vect "vectorized 1 loops"
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-42.c scan-tree-dump vect "vectorizing stmts using SLP"

    PASS: gcc.dg/vect/slp-46.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-46.c execution test
    FAIL: gcc.dg/vect/slp-46.c scan-tree-dump-times vect "vectorizing stmts using SLP" 4

    PASS: gcc.dg/vect/slp-47.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-47.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-47.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2

    PASS: gcc.dg/vect/slp-48.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-48.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-48.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2

    PASS: gcc.dg/vect/slp-perm-1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-1.c execution test
    PASS: gcc.dg/vect/slp-perm-1.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/slp-perm-10.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-10.c execution test
    PASS: gcc.dg/vect/slp-perm-10.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-10.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-perm-11.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-11.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-11.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-perm-2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-2.c execution test
    PASS: gcc.dg/vect/slp-perm-2.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-2.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-perm-3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-3.c execution test
    PASS: gcc.dg/vect/slp-perm-3.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-perm-5.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-5.c execution test
    PASS: gcc.dg/vect/slp-perm-5.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/slp-perm-7.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-7.c execution test
    PASS: gcc.dg/vect/slp-perm-7.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/slp-perm-8.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-8.c execution test
    PASS: gcc.dg/vect/slp-perm-8.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/slp-perm-9.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-9.c execution test
    PASS: gcc.dg/vect/slp-perm-9.c scan-tree-dump-not vect "permutation requires at least three vectors"
    PASS: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-reduc-1.c (test for excess errors)
    PASS: gcc.dg/vect/slp-reduc-1.c execution test
    PASS: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0
    PASS: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    @@ -119631,23 +119631,23 @@ PASS: gcc.dg/vect/slp-reduc-2.c (test for excess errors)
    PASS: gcc.dg/vect/slp-reduc-2.c execution test
    PASS: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0
    PASS: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-reduc-3.c (test for excess errors)
    PASS: gcc.dg/vect/slp-reduc-3.c execution test
    PASS: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0
    XFAIL: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vect_recog_dot_prod_pattern: detected" 1
    PASS: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/slp-reduc-4.c (test for excess errors)
    PASS: gcc.dg/vect/slp-reduc-4.c execution test
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-reduc-4.c scan-tree-dump vect "vectorizing stmts using SLP"
    PASS: gcc.dg/vect/slp-reduc-4.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0
    PASS: gcc.dg/vect/slp-reduc-4.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/slp-reduc-5.c (test for excess errors)
    PASS: gcc.dg/vect/slp-reduc-5.c execution test
    PASS: gcc.dg/vect/slp-reduc-5.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0
    PASS: gcc.dg/vect/slp-reduc-5.c scan-tree-dump-times vect "vectorized 1 loops" 2
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-reduc-5.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    @@ -119657,7 +119657,7 @@ PASS: gcc.dg/vect/slp-reduc-7.c (test for excess errors)
    PASS: gcc.dg/vect/slp-reduc-7.c execution test
    PASS: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "VEC_PERM_EXPR" 0
    PASS: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorized 1 loops" 1
    [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

    PASS: gcc.dg/vect/tsvc/vect-tsvc-s127.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/tsvc/vect-tsvc-s127.c execution test
    PASS: gcc.dg/vect/tsvc/vect-tsvc-s127.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-119.c (test for excess errors)
    PASS: gcc.dg/vect/vect-119.c scan-tree-dump-not optimized "Invalid sum"
    [-FAIL:-]{+PASS:+} gcc.dg/vect/vect-119.c scan-tree-dump-times vect "Detected interleaving load of size 2" 1

    PASS: gcc.dg/vect/vect-cselim-1.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-cselim-1.c execution test
    PASS: gcc.dg/vect/vect-cselim-1.c scan-tree-dump-times vect "vectorized 2 loops" 1

    PASS: gcc.dg/vect/vect-fmax-3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-fmax-3.c execution test
    PASS: gcc.dg/vect/vect-fmax-3.c scan-tree-dump vect "Detected reduction"

    PASS: gcc.dg/vect/vect-pr114375.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-pr114375.c execution test

    PASS: gcc.dg/vect/vect-strided-a-mult.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-mult.c execution test
    PASS: gcc.dg/vect/vect-strided-a-mult.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u16-i2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u16-i2.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u16-i2.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u16-i4.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u16-i4.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u16-i4.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u16-mult.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u16-mult.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u16-mult.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u32-mult.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u32-mult.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u32-mult.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u8-i2-gap.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u8-i2-gap.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect "vectorized 2 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap2-big-array.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap2.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-a-u8-i8-gap2.c execution test
    PASS: gcc.dg/vect/vect-strided-a-u8-i8-gap2.c scan-tree-dump-times vect "vectorized 1 loops" 1

    PASS: gcc.dg/vect/vect-strided-float.c (test for excess errors)
    [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-float.c execution test
    XFAIL: gcc.dg/vect/vect-strided-float.c scan-tree-dump-times vect "vectorized 0 loops" 1

    PASS: gcc.dg/vect/vect-strided-mult-char-ls.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-mult-char-ls.c execution test PASS: gcc.dg/vect/vect-strided-mult-char-ls.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-mult.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-mult.c execution test PASS: gcc.dg/vect/vect-strided-mult.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-same-dr.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-same-dr.c execution test PASS: gcc.dg/vect/vect-strided-same-dr.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-store-a-u8-i2.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-store-a-u8-i2.c execution test PASS: gcc.dg/vect/vect-strided-store-a-u8-i2.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-store-u16-i4.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-store-u16-i4.c execution test PASS: gcc.dg/vect/vect-strided-store-u16-i4.c scan-tree-dump-times vect "vectorized 1 loops" 2 PASS: gcc.dg/vect/vect-strided-store-u32-i2.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-store-u32-i2.c execution test XFAIL: gcc.dg/vect/vect-strided-store-u32-i2.c scan-tree-dump-times vect "vectorized 0 loops" 1 PASS: gcc.dg/vect/vect-strided-store-u32-i2.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u16-i2.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u16-i2.c execution test PASS: gcc.dg/vect/vect-strided-u16-i2.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u16-i3.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u16-i3.c execution test PASS: gcc.dg/vect/vect-strided-u16-i3.c scan-tree-dump-times vect "vectorized 4 loops" 1 PASS: gcc.dg/vect/vect-strided-u16-i4.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u16-i4.c 
execution test PASS: gcc.dg/vect/vect-strided-u16-i4.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u32-i4.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u32-i4.c execution test PASS: gcc.dg/vect/vect-strided-u32-i4.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u32-i8.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u32-i8.c execution test PASS: gcc.dg/vect/vect-strided-u32-i8.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u32-mult.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u32-mult.c execution test PASS: gcc.dg/vect/vect-strided-u32-mult.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u8-i2-gap.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i2-gap.c execution test PASS: gcc.dg/vect/vect-strided-u8-i2-gap.c scan-tree-dump-times vect "vectorized 2 loops" 1 PASS: gcc.dg/vect/vect-strided-u8-i2.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i2.c execution test PASS: gcc.dg/vect/vect-strided-u8-i2.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c execution test PASS: gcc.dg/vect/vect-strided-u8-i8-gap2-big-array.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u8-i8-gap2.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap2.c execution test PASS: gcc.dg/vect/vect-strided-u8-i8-gap2.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c (test for excess errors) [-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c execution test PASS: gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c scan-tree-dump-times vect "vectorized 1 loops" 1 PASS: 
gcc.dg/vect/vect-strided-u8-i8-gap4.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8-gap4.c execution test
PASS: gcc.dg/vect/vect-strided-u8-i8-gap4.c scan-tree-dump-times vect "vectorized 2 loops" 1
PASS: gcc.dg/vect/vect-strided-u8-i8.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/vect/vect-strided-u8-i8.c execution test
PASS: gcc.dg/vect/vect-strided-u8-i8.c scan-tree-dump-times vect "vectorized 1 loops" 1
PASS: gcc.dg/vect/vect-vfa-03.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/vect/vect-vfa-03.c execution test
XFAIL: gcc.dg/vect/vect-vfa-03.c scan-tree-dump-times vect "vectorized 1 loops" 0
PASS: gcc.dg/vect/vect-vfa-03.c scan-tree-dump-times vect "vectorized 1 loops" 1

Miscellaneous GCN target Fortran regressions: generally, both '-march=gfx908', '-march=gfx1100':

@@ -811,9 +811,9 @@ PASS: gfortran.dg/c-interop/fc-descriptor-7.f90 -O1 execution test
PASS: gfortran.dg/c-interop/fc-descriptor-7.f90 -O2 (test for excess errors)
PASS: gfortran.dg/c-interop/fc-descriptor-7.f90 -O2 execution test
PASS: gfortran.dg/c-interop/fc-descriptor-7.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/c-interop/fc-descriptor-7.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: gfortran.dg/c-interop/fc-descriptor-7.f90 -O3 -g (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/c-interop/fc-descriptor-7.f90 -O3 -g execution test
PASS: gfortran.dg/c-interop/fc-descriptor-7.f90 -Os (test for excess errors)
PASS: gfortran.dg/c-interop/fc-descriptor-7.f90 -Os execution test
@@ -1013,9 +1013,9 @@ PASS: gfortran.dg/c-interop/ff-descriptor-7.f90 -O1 execution test
PASS: gfortran.dg/c-interop/ff-descriptor-7.f90 -O2 (test for excess errors)
PASS: gfortran.dg/c-interop/ff-descriptor-7.f90 -O2 execution test
PASS: gfortran.dg/c-interop/ff-descriptor-7.f90 -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/c-interop/ff-descriptor-7.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: gfortran.dg/c-interop/ff-descriptor-7.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/c-interop/ff-descriptor-7.f90 -O3 -g execution test PASS: gfortran.dg/c-interop/ff-descriptor-7.f90 -Os (test for excess errors) PASS: gfortran.dg/c-interop/ff-descriptor-7.f90 -Os execution test @@ -26750,9 +26751,9 @@ PASS: gfortran.dg/finalize_15.f90 -O1 execution test PASS: gfortran.dg/finalize_15.f90 -O2 (test for excess errors) PASS: gfortran.dg/finalize_15.f90 -O2 execution test PASS: gfortran.dg/finalize_15.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/finalize_15.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: gfortran.dg/finalize_15.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/finalize_15.f90 -O3 -g execution test PASS: gfortran.dg/finalize_15.f90 -Os (test for excess errors) PASS: gfortran.dg/finalize_15.f90 -Os execution test PASS: gfortran.dg/inline_matmul_10.f90 -O (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_matmul_10.f90 -O execution test @@ -30726,17 +30729,17 @@ PASS: gfortran.dg/inline_matmul_24.f90 -O2 (test for excess errors) PASS: gfortran.dg/inline_matmul_24.f90 -O2 execution test PASS: gfortran.dg/inline_matmul_24.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions scan-tree-dump-times original "gamma5\\[__var_1_do \\* 4 \\+ __var_2_do\\]|gamma5\\[NON_LVALUE_EXPR <__var_1_do> \\* 4 \\+ NON_LVALUE_EXPR <__var_2_do>\\]" 1 PASS: gfortran.dg/inline_matmul_24.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} 
gfortran.dg/inline_matmul_24.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: gfortran.dg/inline_matmul_24.f90 -O3 -g scan-tree-dump-times original "gamma5\\[__var_1_do \\* 4 \\+ __var_2_do\\]|gamma5\\[NON_LVALUE_EXPR <__var_1_do> \\* 4 \\+ NON_LVALUE_EXPR <__var_2_do>\\]" 1 PASS: gfortran.dg/inline_matmul_24.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_matmul_24.f90 -O3 -g execution test PASS: gfortran.dg/inline_matmul_24.f90 -Os scan-tree-dump-times original "gamma5\\[__var_1_do \\* 4 \\+ __var_2_do\\]|gamma5\\[NON_LVALUE_EXPR <__var_1_do> \\* 4 \\+ NON_LVALUE_EXPR <__var_2_do>\\]" 1 PASS: gfortran.dg/inline_matmul_24.f90 -Os (test for excess errors) PASS: gfortran.dg/inline_matmul_24.f90 -Os execution test PASS: gfortran.dg/inline_matmul_3.f90 -O scan-tree-dump-times optimized "_gfortran_matmul" 8 PASS: gfortran.dg/inline_matmul_3.f90 -O (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_matmul_3.f90 -O execution test @@ -30877,7 +30880,7 @@ PASS: gfortran.dg/inline_transpose_1.f90 -O0 scan-tree-dump-times original " PASS: gfortran.dg/inline_transpose_1.f90 -O0 scan-tree-dump-times original "_gfortran_transpose" 0 PASS: gfortran.dg/inline_transpose_1.f90 -O0 scan-tree-dump-times original "struct[^\\n]*atmp" 24 PASS: gfortran.dg/inline_transpose_1.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_transpose_1.f90 -O0 execution test PASS: gfortran.dg/inline_transpose_1.f90 -O1 (test for warnings, line 112) PASS: gfortran.dg/inline_transpose_1.f90 -O1 (test for warnings, line 120) PASS: gfortran.dg/inline_transpose_1.f90 -O1 (test for warnings, line 144) @@ -30903,7 +30906,7 @@ PASS: gfortran.dg/inline_transpose_1.f90 -O1 scan-tree-dump-times original " PASS: gfortran.dg/inline_transpose_1.f90 -O1 scan-tree-dump-times original "_gfortran_transpose" 0 PASS: gfortran.dg/inline_transpose_1.f90 -O1 scan-tree-dump-times original "struct[^\\n]*atmp" 24 
PASS: gfortran.dg/inline_transpose_1.f90 -O1 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_transpose_1.f90 -O1 execution test PASS: gfortran.dg/inline_transpose_1.f90 -O2 (test for warnings, line 112) PASS: gfortran.dg/inline_transpose_1.f90 -O2 (test for warnings, line 120) PASS: gfortran.dg/inline_transpose_1.f90 -O2 (test for warnings, line 144) @@ -30929,7 +30932,7 @@ PASS: gfortran.dg/inline_transpose_1.f90 -O2 scan-tree-dump-times original " PASS: gfortran.dg/inline_transpose_1.f90 -O2 scan-tree-dump-times original "_gfortran_transpose" 0 PASS: gfortran.dg/inline_transpose_1.f90 -O2 scan-tree-dump-times original "struct[^\\n]*atmp" 24 PASS: gfortran.dg/inline_transpose_1.f90 -O2 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_transpose_1.f90 -O2 execution test PASS: gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for warnings, line 112) PASS: gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for warnings, line 120) PASS: gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for warnings, line 144) @@ -30955,7 +30958,7 @@ PASS: gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loo PASS: gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions scan-tree-dump-times original "_gfortran_transpose" 0 PASS: gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions scan-tree-dump-times original "struct[^\\n]*atmp" 24 PASS: gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_transpose_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer 
-finline-functions execution test PASS: gfortran.dg/inline_transpose_1.f90 -O3 -g (test for warnings, line 112) PASS: gfortran.dg/inline_transpose_1.f90 -O3 -g (test for warnings, line 120) PASS: gfortran.dg/inline_transpose_1.f90 -O3 -g (test for warnings, line 144) @@ -30981,7 +30984,7 @@ PASS: gfortran.dg/inline_transpose_1.f90 -O3 -g scan-tree-dump-times origina PASS: gfortran.dg/inline_transpose_1.f90 -O3 -g scan-tree-dump-times original "_gfortran_transpose" 0 PASS: gfortran.dg/inline_transpose_1.f90 -O3 -g scan-tree-dump-times original "struct[^\\n]*atmp" 24 PASS: gfortran.dg/inline_transpose_1.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_transpose_1.f90 -O3 -g execution test PASS: gfortran.dg/inline_transpose_1.f90 -Os (test for warnings, line 112) PASS: gfortran.dg/inline_transpose_1.f90 -Os (test for warnings, line 120) PASS: gfortran.dg/inline_transpose_1.f90 -Os (test for warnings, line 144) @@ -31007,7 +31010,7 @@ PASS: gfortran.dg/inline_transpose_1.f90 -Os scan-tree-dump-times original " PASS: gfortran.dg/inline_transpose_1.f90 -Os scan-tree-dump-times original "_gfortran_transpose" 0 PASS: gfortran.dg/inline_transpose_1.f90 -Os scan-tree-dump-times original "struct[^\\n]*atmp" 24 PASS: gfortran.dg/inline_transpose_1.f90 -Os (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/inline_transpose_1.f90 -Os execution test PASS: gfortran.dg/intrinsic_intkinds_1.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/intrinsic_intkinds_1.f90 -O0 execution test PASS: gfortran.dg/intrinsic_intkinds_1.f90 -O1 (test for excess errors) PASS: gfortran.dg/intrinsic_intkinds_1.f90 -O1 execution test PASS: gfortran.dg/intrinsic_intkinds_1.f90 -O2 (test for excess errors) PASS: gfortran.dg/matmul_1.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/matmul_1.f90 -O0 execution test PASS: gfortran.dg/matmul_1.f90 -O1 (test for excess errors) PASS: gfortran.dg/matmul_1.f90 -O1 execution test PASS: 
gfortran.dg/matmul_1.f90 -O2 (test for excess errors) PASS: gfortran.dg/matmul_1.f90 -O2 execution test PASS: gfortran.dg/matmul_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/matmul_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: gfortran.dg/matmul_1.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/matmul_1.f90 -O3 -g execution test PASS: gfortran.dg/matmul_1.f90 -Os (test for excess errors) PASS: gfortran.dg/matmul_1.f90 -Os execution test PASS: gfortran.dg/matmul_10.f90 -O0 (test for warnings, line 12) PASS: gfortran.dg/matmul_10.f90 -O0 (test for warnings, line 17) PASS: gfortran.dg/matmul_10.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/matmul_10.f90 -O0 execution test PASS: gfortran.dg/matmul_10.f90 -O1 (test for warnings, line 12) PASS: gfortran.dg/matmul_10.f90 -O1 (test for warnings, line 17) PASS: gfortran.dg/matmul_10.f90 -O1 (test for excess errors) @@ -34670,7 +34673,7 @@ PASS: gfortran.dg/matmul_10.f90 -Os execution test PASS: gfortran.dg/matmul_12.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/matmul_12.f90 -O0 execution test PASS: gfortran.dg/matmul_12.f90 -O1 (test for excess errors) PASS: gfortran.dg/matmul_12.f90 -O1 execution test PASS: gfortran.dg/matmul_12.f90 -O2 (test for excess errors) PASS: gfortran.dg/matmul_2.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/matmul_2.f90 -O0 execution test PASS: gfortran.dg/matmul_2.f90 -O1 (test for excess errors) PASS: gfortran.dg/matmul_2.f90 -O1 execution test PASS: gfortran.dg/matmul_2.f90 -O2 (test for excess errors) PASS: gfortran.dg/matmul_3.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/matmul_3.f90 -O0 execution test PASS: gfortran.dg/matmul_3.f90 -O1 (test for excess errors) PASS: gfortran.dg/matmul_3.f90 -O1 execution test PASS: gfortran.dg/matmul_3.f90 -O2 
(test for excess errors)
PASS: gfortran.dg/matmul_6.f90 -O0 (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/matmul_6.f90 -O0 execution test
PASS: gfortran.dg/matmul_6.f90 -O1 (test for excess errors)
PASS: gfortran.dg/matmul_6.f90 -O1 execution test
PASS: gfortran.dg/matmul_6.f90 -O2 (test for excess errors)
@@ -38912,11 +38915,11 @@ PASS: gfortran.dg/overload_5.f90 -O0 execution test
PASS: gfortran.dg/overload_5.f90 -O1 (test for excess errors)
PASS: gfortran.dg/overload_5.f90 -O1 execution test
PASS: gfortran.dg/overload_5.f90 -O2 (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/overload_5.f90 -O2 execution test
PASS: gfortran.dg/overload_5.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/overload_5.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: gfortran.dg/overload_5.f90 -O3 -g (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/overload_5.f90 -O3 -g execution test
PASS: gfortran.dg/overload_5.f90 -Os (test for excess errors)
PASS: gfortran.dg/overload_5.f90 -Os execution test

Not '-march=gfx908'; '-march=gfx1100' only.

@@ -39828,9 +39831,9 @@ PASS: gfortran.dg/pointer_assign_4.f90 -O1 execution test
PASS: gfortran.dg/pointer_assign_4.f90 -O2 (test for excess errors)
PASS: gfortran.dg/pointer_assign_4.f90 -O2 execution test
PASS: gfortran.dg/pointer_assign_4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/pointer_assign_4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: gfortran.dg/pointer_assign_4.f90 -O3 -g (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/pointer_assign_4.f90 -O3 -g execution test
PASS: gfortran.dg/pointer_assign_4.f90 -Os (test for excess errors)
PASS: gfortran.dg/pointer_assign_4.f90 -Os execution test
@@ -40290,11 +40293,11 @@ PASS: gfortran.dg/pointer_remapping_10.f90 -O0 execution test
PASS: gfortran.dg/pointer_remapping_10.f90 -O1 (test for excess errors)
PASS: gfortran.dg/pointer_remapping_10.f90 -O1 execution test
PASS: gfortran.dg/pointer_remapping_10.f90 -O2 (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/pointer_remapping_10.f90 -O2 execution test
PASS: gfortran.dg/pointer_remapping_10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/pointer_remapping_10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: gfortran.dg/pointer_remapping_10.f90 -O3 -g (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/pointer_remapping_10.f90 -O3 -g execution test
PASS: gfortran.dg/pointer_remapping_10.f90 -Os (test for excess errors)
PASS: gfortran.dg/pointer_remapping_10.f90 -Os execution test

Not '-march=gfx908'; '-march=gfx1100' only.

PASS: gfortran.dg/transpose_4.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/transpose_4.f90 -O0 execution test PASS: gfortran.dg/transpose_4.f90 -O1 (test for excess errors) PASS: gfortran.dg/transpose_4.f90 -O1 execution test PASS: gfortran.dg/transpose_4.f90 -O2 (test for excess errors) PASS: gfortran.dg/vector_subscript_5.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} gfortran.dg/vector_subscript_5.f90 -O0 execution test PASS: gfortran.dg/vector_subscript_5.f90 -O1 (test for excess errors) PASS: gfortran.dg/vector_subscript_5.f90 -O1 execution test PASS: gfortran.dg/vector_subscript_5.f90 -O2 (test for excess errors) @@ -56898,7 +56901,7 @@ PASS: gfortran.fortran-torture/execute/arrayarg.f90 execution, -O2 PASS: gfortran.fortran-torture/execute/arrayarg.f90 execution, -O2 -fbounds-check PASS: gfortran.fortran-torture/execute/arrayarg.f90 execution, -O2 -fomit-frame-pointer -finline-functions PASS: gfortran.fortran-torture/execute/arrayarg.f90 execution, -O2 -fomit-frame-pointer -finline-functions -funroll-loops [-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/arrayarg.f90 execution, -O3 -g PASS: gfortran.fortran-torture/execute/arrayarg.f90 execution, -Os @@ -57982,11 +57985,11 @@ PASS: gfortran.fortran-torture/execute/in-pack.f90 compilation, -O3 -g PASS: gfortran.fortran-torture/execute/in-pack.f90 compilation, -Os PASS: gfortran.fortran-torture/execute/in-pack.f90 execution, -O0 PASS: gfortran.fortran-torture/execute/in-pack.f90 execution, -O1 [-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/in-pack.f90 execution, -O2 [-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/in-pack.f90 execution, -O2 -fbounds-check [-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/in-pack.f90 execution, -O2 -fomit-frame-pointer -finline-functions [-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/in-pack.f90 execution, -O2 -fomit-frame-pointer -finline-functions -funroll-loops [-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/in-pack.f90 
execution, -O3 -g
PASS: gfortran.fortran-torture/execute/in-pack.f90 execution, -Os

Not '-march=gfx908'; '-march=gfx1100' only.

@@ -58460,7 +58463,7 @@ PASS: gfortran.fortran-torture/execute/intrinsic_matmul.f90 compilation, -O2 -f
PASS: gfortran.fortran-torture/execute/intrinsic_matmul.f90 compilation, -O2 -fomit-frame-pointer -finline-functions -funroll-loops
PASS: gfortran.fortran-torture/execute/intrinsic_matmul.f90 compilation, -O3 -g
PASS: gfortran.fortran-torture/execute/intrinsic_matmul.f90 compilation, -Os
[-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/intrinsic_matmul.f90 execution, -O0
PASS: gfortran.fortran-torture/execute/intrinsic_matmul.f90 execution, -O1
PASS: gfortran.fortran-torture/execute/intrinsic_matmul.f90 execution, -O2
PASS: gfortran.fortran-torture/execute/intrinsic_matmul.f90 execution, -O2 -fbounds-check
@@ -60034,7 +60037,7 @@ PASS: gfortran.fortran-torture/execute/where_1.f90 execution, -O2
PASS: gfortran.fortran-torture/execute/where_1.f90 execution, -O2 -fbounds-check
PASS: gfortran.fortran-torture/execute/where_1.f90 execution, -O2 -fomit-frame-pointer -finline-functions
PASS: gfortran.fortran-torture/execute/where_1.f90 execution, -O2 -fomit-frame-pointer -finline-functions -funroll-loops
[-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/where_1.f90 execution, -O3 -g
PASS: gfortran.fortran-torture/execute/where_1.f90 execution, -Os
@@ -60226,7 +60229,7 @@ PASS: gfortran.fortran-torture/execute/where_6.f90 execution, -O2
PASS: gfortran.fortran-torture/execute/where_6.f90 execution, -O2 -fbounds-check
PASS: gfortran.fortran-torture/execute/where_6.f90 execution, -O2 -fomit-frame-pointer -finline-functions
PASS: gfortran.fortran-torture/execute/where_6.f90 execution, -O2 -fomit-frame-pointer -finline-functions -funroll-loops
[-PASS:-]{+FAIL:+} gfortran.fortran-torture/execute/where_6.f90 execution, -O3 -g
PASS: gfortran.fortran-torture/execute/where_6.f90 execution, -Os


Grüße
 Thomas


> 2021-01-13 Julian Brown
>
> gcc/
> * doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add
> documentation hook.
> * doc/tm.texi: Regenerate.
> * target.def (prefer_gather_scatter): Add target hook under vectorizer.
> * tree-vect-stmts.c (get_group_load_store_type): Optionally prefer
> gather/scatter instructions to scalar/elementwise fallback.
> * config/gcn/gcn.c (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define
> hook.
> ---
>  gcc/config/gcn/gcn.c | 2 ++
>  gcc/doc/tm.texi | 5 +++++
>  gcc/doc/tm.texi.in | 2 ++
>  gcc/target.def | 8 ++++++++
>  gcc/tree-vect-stmts.c | 9 +++++++--
>  5 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
> index ee9f00558305..ea88b5e91244 100644
> --- a/gcc/config/gcn/gcn.c
> +++ b/gcc/config/gcn/gcn.c
> @@ -6501,6 +6501,8 @@ gcn_dwarf_register_span (rtx rtl)
>  gcn_vector_alignment_reachable
>  #undef TARGET_VECTOR_MODE_SUPPORTED_P
>  #define TARGET_VECTOR_MODE_SUPPORTED_P gcn_vector_mode_supported_p
> +#undef TARGET_VECTORIZE_PREFER_GATHER_SCATTER
> +#define TARGET_VECTORIZE_PREFER_GATHER_SCATTER true
>
>  struct gcc_target targetm = TARGET_INITIALIZER;
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 581b7b51eeb0..bd0b2eea477a 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6122,6 +6122,11 @@ The default is @code{NULL_TREE} which means to not vectorize scatter
>  stores.
>  @end deftypefn
>
> +@deftypevr {Target Hook} bool TARGET_VECTORIZE_PREFER_GATHER_SCATTER
> +This hook is set to TRUE if gather loads or scatter stores are cheaper on
> +this target than a sequence of elementwise loads or stores.
> +@end deftypevr
> +
>  @deftypefn {Target Hook} int TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN (struct cgraph_node *@var{}, struct cgraph_simd_clone *@var{}, @var{tree}, @var{int})
>  This hook should set @var{vecsize_mangle}, @var{vecsize_int}, @var{vecsize_float}
>  fields in @var{simd_clone} structure pointed by @var{clone_info} argument and also
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index afa19d4ac63c..c0883e5da82c 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4195,6 +4195,8 @@ address; but often a machine-dependent strategy can generate better code.
>
>  @hook TARGET_VECTORIZE_BUILTIN_SCATTER
>
> +@hook TARGET_VECTORIZE_PREFER_GATHER_SCATTER
> +
>  @hook TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN
>
>  @hook TARGET_SIMD_CLONE_ADJUST
> diff --git a/gcc/target.def b/gcc/target.def
> index 00421f3a6acd..0b34ab5c3d52 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -2027,6 +2027,14 @@ all zeros. GCC can then try to branch around the instruction instead.",
>  (unsigned ifn),
>  default_empty_mask_is_expensive)
>
> +/* Prefer gather/scatter loads/stores to e.g. elementwise accesses if\n\
> +we cannot use a contiguous access. */
> +DEFHOOKPOD
> +(prefer_gather_scatter,
> + "This hook is set to TRUE if gather loads or scatter stores are cheaper on\n\
> +this target than a sequence of elementwise loads or stores.",
> + bool, false)
> +
>  /* Target builtin that implements vector gather operation. */
>  DEFHOOK
>  (builtin_gather,
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 9ace345fc5e2..e117d3d16afc 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -2444,9 +2444,14 @@ get_group_load_store_type (stmt_vec_info stmt_info, tree vectype, bool slp,
>  it probably isn't a win to use separate strided accesses based
>  on nearby locations. Or, even if it's a win over scalar code,
>  it might not be a win over vectorizing at a lower VF, if that
> - allows us to use contiguous accesses. */
> + allows us to use contiguous accesses.
> +
> + On some targets (e.g. AMD GCN), always use gather/scatter accesses
> + here since those are the only types of vector loads/stores available,
> + and the fallback case of using elementwise accesses is very
> + inefficient. */
>  if (*memory_access_type == VMAT_ELEMENTWISE
> - && single_element_p
> + && (targetm.vectorize.prefer_gather_scatter || single_element_p)
>  && loop_vinfo
>  && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo,
>  masked_p, gs_info))

From 8678fc697046fba1014f1db6321ee670538b0881 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Wed, 3 Jul 2024 12:20:17 +0200
Subject: [PATCH] Revert "[og10] vect: Add target hook to prefer gather/scatter instructions"

Testing current OG14 commit 735bbbfc6eaf58522c3ebb0946b66f33958ea134 for
'--target=amdgcn-amdhsa' (I've tested '-march=gfx908', '-march=gfx1100'), this
change has been identified to be causing ~100 instances of execution test
PASS -> FAIL, thus wrong-code generation.

It's possible that we've had the same misbehavior also on OG13 and earlier, but
just nobody ever tested that. And/or, that at some point in time, the original
patch fell out of sync, wasn't updated for relevant upstream vectorizer changes.
Until someone gets to analyze that (and upstream these changes here), we shall
revert this commit on OG14.

	gcc/
	* doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Remove
	documentation hook.
	* doc/tm.texi: Regenerate.
	* target.def (prefer_gather_scatter): Remove target hook under
	vectorizer.
	* tree-vect-stmts.cc (get_group_load_store_type): Remove code to
	optionally prefer gather/scatter instructions to
	scalar/elementwise fallback.
	* config/gcn/gcn.cc (TARGET_VECTORIZE_PREFER_GATHER_SCATTER):
	Remove hook definition.

This reverts OG14 commit 4abc54b6d6c3129cf4233e49231b1255b236c2be.
---
 gcc/ChangeLog.omp | 13 +++++++++++++
 gcc/config/gcn/gcn.cc | 2 --
 gcc/doc/tm.texi | 5 -----
 gcc/doc/tm.texi.in | 2 --
 gcc/target.def | 8 --------
 gcc/tree-vect-stmts.cc | 9 ++-------
 6 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index ac4a30e81c8..3dd5bd03dc9 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,16 @@
+2024-07-03 Thomas Schwinge
+
+ * doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Remove
+ documentation hook.
+ * doc/tm.texi: Regenerate.
+ * target.def (prefer_gather_scatter): Remove target hook under
+ vectorizer.
+ * tree-vect-stmts.cc (get_group_load_store_type): Remove code to
+ optionally prefer gather/scatter instructions to
+ scalar/elementwise fallback.
+ * config/gcn/gcn.cc (TARGET_VECTORIZE_PREFER_GATHER_SCATTER):
+ Remove hook definition.
+
 2024-05-19 Roger Sayle

 * config/nvptx/nvptx.md (popcount2): Split into...
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index a247eecd8e8..d6531f55190 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -8059,8 +8059,6 @@ gcn_dwarf_register_span (rtx rtl)
 gcn_vector_alignment_reachable
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P gcn_vector_mode_supported_p
-#undef TARGET_VECTORIZE_PREFER_GATHER_SCATTER
-#define TARGET_VECTORIZE_PREFER_GATHER_SCATTER true

 struct gcc_target targetm = TARGET_INITIALIZER;

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index e64c7541f60..c8b8b126b24 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6482,11 +6482,6 @@ The default is @code{NULL_TREE} which means to not vectorize scatter
 stores.
 @end deftypefn

-@deftypevr {Target Hook} bool TARGET_VECTORIZE_PREFER_GATHER_SCATTER
-This hook is set to TRUE if gather loads or scatter stores are cheaper on
-this target than a sequence of elementwise loads or stores.
-@end deftypevr
-
 @deftypefn {Target Hook} int TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN (struct cgraph_node *@var{}, struct cgraph_simd_clone *@var{}, @var{tree}, @var{int}, @var{bool})
 This hook should set @var{vecsize_mangle}, @var{vecsize_int}, @var{vecsize_float}
 fields in @var{simd_clone} structure pointed by @var{clone_info} argument and also
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 645950b12d7..658e1e63371 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4309,8 +4309,6 @@ address; but often a machine-dependent strategy can generate better code.

 @hook TARGET_VECTORIZE_BUILTIN_SCATTER

-@hook TARGET_VECTORIZE_PREFER_GATHER_SCATTER
-
 @hook TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN

 @hook TARGET_SIMD_CLONE_ADJUST
diff --git a/gcc/target.def b/gcc/target.def
index e4b26a7df3e..fdad7bbc93e 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2044,14 +2044,6 @@ all zeros. GCC can then try to branch around the instruction instead.",
 (unsigned ifn),
 default_empty_mask_is_expensive)

-/* Prefer gather/scatter loads/stores to e.g. elementwise accesses if\n\
-we cannot use a contiguous access. */
-DEFHOOKPOD
-(prefer_gather_scatter,
- "This hook is set to TRUE if gather loads or scatter stores are cheaper on\n\
-this target than a sequence of elementwise loads or stores.",
- bool, false)
-
 /* Target builtin that implements vector gather operation. */
 DEFHOOK
 (builtin_gather,
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a7e33120eda..f8d8636b139 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2217,14 +2217,9 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
 it probably isn't a win to use separate strided accesses based
 on nearby locations. Or, even if it's a win over scalar code,
 it might not be a win over vectorizing at a lower VF, if that
- allows us to use contiguous accesses.
-
- On some targets (e.g. AMD GCN), always use gather/scatter accesses
- here since those are the only types of vector loads/stores available,
- and the fallback case of using elementwise accesses is very
- inefficient. */
+ allows us to use contiguous accesses. */
 if (*memory_access_type == VMAT_ELEMENTWISE
- && (targetm.vectorize.prefer_gather_scatter || single_element_p)
+ && single_element_p
 && loop_vinfo
 && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo,
 masked_p, gs_info))
-- 
2.34.1