From patchwork Wed Aug 28 15:44:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1977957
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Jennifer Schmitz, Kyrylo Tkachov, Tamar Christina
Subject: [pushed] aarch64: Fix gather x32/x64 selection
Date: Wed, 28 Aug 2024 16:44:30 +0100

The SVE gather and scatter costs are classified based on whether they
do 4 loads per 128 bits (x32) or 2 loads per 128 bits (x64). The number
after the "x" refers to the number of bits in each "container".
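
To make the classification above concrete, here is a minimal standalone sketch. It is not GCC code: SveModeModel, GatherCost, loads_per_128_bits and classify are hypothetical names invented for this illustration. It assumes only what the paragraph above states, namely that a gather does one load per container, so the container width fixes the number of loads per 128 bits.

```cpp
// Hypothetical model for illustration only; none of these names exist in GCC.
enum class GatherCost { X32, X64 };

struct SveModeModel
{
  unsigned element_bits;    // width of each value being gathered
  unsigned container_bits;  // width of the vector slot that holds each value
};

// One load per container, so the container width sets the load count.
constexpr unsigned
loads_per_128_bits (SveModeModel m)
{
  return 128 / m.container_bits;
}

// Classify on the container width, not on the element width.
constexpr GatherCost
classify (SveModeModel m)
{
  return m.container_bits == 64 ? GatherCost::X64 : GatherCost::X32;
}

// 32-bit elements in 32-bit containers.
constexpr SveModeModel packed_s { 32, 32 };
// 64-bit elements in 64-bit containers.
constexpr SveModeModel packed_d { 64, 64 };
// 32-bit elements held in 64-bit containers.
constexpr SveModeModel unpacked_s { 32, 64 };

static_assert (loads_per_128_bits (packed_s) == 4, "x32 does 4 loads");
static_assert (loads_per_128_bits (packed_d) == 2, "x64 does 2 loads");
static_assert (classify (unpacked_s) == GatherCost::X64,
               "64-bit containers are x64 whatever the element width");
```

In this model, a test on element_bits would put unpacked_s in the x32 bucket even though it only does 2 loads per 128 bits. The model keys on the container width to avoid that.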

However, the test for which to use was based on the element size
rather than the container size. This meant that we'd use the
overly conservative x32 costs for VNx2SI gathers. VNx2SI gathers
are really .D gathers in which the upper half of each extension
result is ignored.

This patch is necessary to switch -mtune=generic over to the
"new" vector costs.

Tested on aarch64-linux-gnu. Pushed as previously agreed with
Tamar and Kyrill.

Richard

gcc/
	* config/aarch64/aarch64.cc (aarch64_detect_vector_stmt_subtype)
	(aarch64_vector_costs::add_stmt_cost): Use the x64 cost rather
	than x32 cost for all VNx2 modes.
---
 gcc/config/aarch64/aarch64.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 40dacfcf2e7..033ea61d3a8 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -16819,7 +16819,8 @@ aarch64_detect_vector_stmt_subtype (vec_info *vinfo, vect_cost_for_stmt kind,
       && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
     {
       unsigned int nunits = vect_nunits_for_cost (vectype);
-      if (GET_MODE_UNIT_BITSIZE (TYPE_MODE (vectype)) == 64)
+      /* Test for VNx2 modes, which have 64-bit containers.  */
+      if (known_eq (GET_MODE_NUNITS (TYPE_MODE (vectype)), aarch64_sve_vg))
         return { sve_costs->gather_load_x64_cost, nunits };
       return { sve_costs->gather_load_x32_cost, nunits };
     }
@@ -17309,7 +17310,9 @@ aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
           const sve_vec_cost *sve_costs = aarch64_tune_params.vec_costs->sve;
           if (sve_costs)
             {
-              if (GET_MODE_UNIT_BITSIZE (TYPE_MODE (vectype)) == 64)
+              /* Test for VNx2 modes, which have 64-bit containers.  */
+              if (known_eq (GET_MODE_NUNITS (TYPE_MODE (vectype)),
+                            aarch64_sve_vg))
                 m_sve_gather_scatter_init_cost
                   += sve_costs->gather_load_x64_init_cost;
               else