From patchwork Wed Feb 21 11:14:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1902075
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Subject: [pushed] aarch64: Remove the aarch64_commit_lazy_save pattern
Date: Wed, 21 Feb 2024 11:14:19 +0000

The main purpose of the aarch64_commit_lazy_save pattern was to defer
insertion of a half-diamond until splitting, since splitting knew how
to create the associated basic blocks. However, the fix for PR113220
means that mode-switching also knows how to do that.

This patch therefore removes the pattern and emits the subinstructions
directly. On its own, this is actually a slight regression, since it
means we keep an unnecessary zero { za }. But the cases where that
happens are wrong for a different reason, and this patch is a
prerequisite to fixing it.

Tested on aarch64-linux-gnu & pushed.

Richard

gcc/
	* config/aarch64/aarch64-sme.md (aarch64_commit_lazy_save): Remove,
	directly inserting the associated sequence
	* config/aarch64/aarch64.cc (aarch64_mode_emit_local_sme_state):
	...here instead.

gcc/testsuite/
	* gcc.target/aarch64/sme/zt0_state_5.c (test3, test5): Expect
	zero { za }s.
---
 gcc/config/aarch64/aarch64-sme.md             | 45 -------------------
 gcc/config/aarch64/aarch64.cc                 | 13 ++++--
 .../gcc.target/aarch64/sme/zt0_state_5.c      |  2 +
 3 files changed, 11 insertions(+), 49 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
index 81d941871ac..c95d4aa696c 100644
--- a/gcc/config/aarch64/aarch64-sme.md
+++ b/gcc/config/aarch64/aarch64-sme.md
@@ -455,51 +455,6 @@ (define_insn "aarch64_end_private_za_call"
   [(set_attr "type" "no_insn")]
 )
 
-;; This pseudo-instruction is emitted before a private-ZA function uses
-;; PSTATE.ZA state for the first time. The instruction checks whether
-;; ZA currently contains data belonging to a caller and commits the
-;; lazy save if so.
-;;
-;; Operand 0 is the incoming value of TPIDR2_EL0. Operand 1 is nonzero
-;; if ZA is live, and should therefore be zeroed after committing a save.
-;;
-;; The instruction is generated by the mode-switching pass. It is a
-;; define_insn_and_split rather than a define_expand because of the
-;; internal control flow.
-(define_insn_and_split "aarch64_commit_lazy_save"
-  [(set (reg:DI ZA_FREE_REGNUM)
-        (unspec:DI [(match_operand 0 "pmode_register_operand" "r")
-                    (match_operand 1 "const_int_operand")
-                    (reg:DI SME_STATE_REGNUM)
-                    (reg:DI TPIDR2_SETUP_REGNUM)
-                    (reg:VNx16QI ZA_REGNUM)] UNSPEC_COMMIT_LAZY_SAVE))
-   (set (reg:DI ZA_REGNUM)
-        (unspec:DI [(reg:DI SME_STATE_REGNUM)
-                    (reg:DI ZA_FREE_REGNUM)] UNSPEC_INITIAL_ZERO_ZA))
-   (clobber (reg:DI R14_REGNUM))
-   (clobber (reg:DI R15_REGNUM))
-   (clobber (reg:DI R16_REGNUM))
-   (clobber (reg:DI R17_REGNUM))
-   (clobber (reg:DI R18_REGNUM))
-   (clobber (reg:DI R30_REGNUM))
-   (clobber (reg:CC CC_REGNUM))]
-  ""
-  "#"
-  "true"
-  [(const_int 0)]
-  {
-    auto label = gen_label_rtx ();
-    auto jump = emit_jump_insn (gen_aarch64_cbeqdi1 (operands[0], label));
-    JUMP_LABEL (jump) = label;
-    emit_insn (gen_aarch64_tpidr2_save ());
-    emit_insn (gen_aarch64_clear_tpidr2 ());
-    if (INTVAL (operands[1]) != 0)
-      emit_insn (gen_aarch64_initial_zero_za ());
-    emit_label (label);
-    DONE;
-  }
-)
-
 ;; =========================================================================
 ;; == Loads, stores and moves
 ;; =========================================================================
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 6a39ed8eddf..ed7fbca512b 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -29339,12 +29339,17 @@ aarch64_mode_emit_local_sme_state (aarch64_local_sme_state mode,
              msr tpidr2_el0, xzr
              zero { za }       // Only if ZA is live
          no_save:  */
-      bool is_active = (mode == aarch64_local_sme_state::ACTIVE_LIVE
-                        || mode == aarch64_local_sme_state::ACTIVE_DEAD);
       auto tmp_reg = gen_reg_rtx (DImode);
-      auto active_flag = gen_int_mode (is_active, DImode);
       emit_insn (gen_aarch64_read_tpidr2 (tmp_reg));
-      emit_insn (gen_aarch64_commit_lazy_save (tmp_reg, active_flag));
+      auto label = gen_label_rtx ();
+      auto jump = emit_jump_insn (gen_aarch64_cbeqdi1 (tmp_reg, label));
+      JUMP_LABEL (jump) = label;
+      emit_insn (gen_aarch64_tpidr2_save ());
+      emit_insn (gen_aarch64_clear_tpidr2 ());
+      if (mode == aarch64_local_sme_state::ACTIVE_LIVE
+          || mode == aarch64_local_sme_state::ACTIVE_DEAD)
+        emit_insn (gen_aarch64_initial_zero_za ());
+      emit_label (label);
     }
 
   if (mode == aarch64_local_sme_state::ACTIVE_LIVE
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/zt0_state_5.c b/gcc/testsuite/gcc.target/aarch64/sme/zt0_state_5.c
index e18b395476c..0fba21868ed 100644
--- a/gcc/testsuite/gcc.target/aarch64/sme/zt0_state_5.c
+++ b/gcc/testsuite/gcc.target/aarch64/sme/zt0_state_5.c
@@ -54,6 +54,7 @@ __arm_new("zt0") int test3()
 ** cbz x0, [^\n]+
 ** bl __arm_tpidr2_save
 ** msr tpidr2_el0, xzr
+** zero { za }
 ** smstart za
 ** bl in_zt0
 ** smstop za
@@ -101,6 +102,7 @@ __arm_new("zt0") void test5()
 ** cbz x0, [^\n]+
 ** bl __arm_tpidr2_save
 ** msr tpidr2_el0, xzr
+** zero { za }
 ** smstart za
 ** bl out_zt0
 ** ...
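
For readers unfamiliar with the "half-diamond", the control flow that aarch64_mode_emit_local_sme_state now emits directly can be modelled outside the compiler. The following is only an illustrative sketch, not GCC code: `LocalSmeState`, its `OTHER` enumerator, `commit_lazy_save` and the returned trace strings are all hypothetical names invented for this example, and the function simulates the branch rather than emitting RTL.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for aarch64_local_sme_state. ACTIVE_LIVE and
// ACTIVE_DEAD are the two values named in the patch; OTHER is an
// assumption that lumps together every remaining state.
enum class LocalSmeState { ACTIVE_LIVE, ACTIVE_DEAD, OTHER };

// Model of the half-diamond: a single conditional branch that skips the
// save sequence when the incoming TPIDR2_EL0 value is zero.  Returns the
// instructions that would execute, as plain strings.
std::vector<std::string>
commit_lazy_save (std::uint64_t &tpidr2_el0, LocalSmeState mode)
{
  std::vector<std::string> trace;

  // The branch itself always executes (gen_aarch64_cbeqdi1 in the patch).
  trace.push_back ("cbz x0, no_save");
  if (tpidr2_el0 != 0)
    {
      // A caller's lazy save is pending: commit it, then clear the pointer.
      trace.push_back ("bl __arm_tpidr2_save");
      tpidr2_el0 = 0;
      trace.push_back ("msr tpidr2_el0, xzr");

      // Same condition as the patched code: zero ZA for both active
      // states, which is why test3 and test5 now expect zero { za }.
      if (mode == LocalSmeState::ACTIVE_LIVE
          || mode == LocalSmeState::ACTIVE_DEAD)
        trace.push_back ("zero { za }");
    }
  // no_save: both paths rejoin here.
  return trace;
}
```

Calling `commit_lazy_save` with a nonzero `tpidr2_el0` walks the save path and leaves the variable at zero; calling it with zero takes only the branch.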