From patchwork Sun Dec 10 19:55:16 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1874237
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: [pushed] aarch64: Fix invalid subregs for BE svread/write_za
Date: Sun, 10 Dec 2023 19:55:16 +0000

Multi-register svread_za and svwrite_za are implemented using one
pattern per register count, with the register contents being bitcast
on entry (for writes) or return (for reads).  Previously we relied
on subregs for this, with the subreg for reads being handled by
target-independent code.

But using subregs isn't correct for many big-endian cases, where
following subreg rules often requires actual instructions.
The semantics are instead supposed to be those of svreinterpret.

Tested on aarch64-linux-gnu and aarch64_be-elf, pushed to trunk.

Richard


gcc/
	PR target/112931
	PR target/112933
	* config/aarch64/aarch64-protos.h (aarch64_sve_reinterpret): Declare.
	* config/aarch64/aarch64.cc (aarch64_sve_reinterpret): New function.
	* config/aarch64/aarch64-sve-builtins-sme.cc (svread_za_impl::expand)
	(svwrite_za_impl::expand): Use it to cast the SVE register to the
	right mode.
---
 gcc/config/aarch64/aarch64-protos.h           |  1 +
 .../aarch64/aarch64-sve-builtins-sme.cc       |  5 +++--
 gcc/config/aarch64/aarch64.cc                 | 22 +++++++++++++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index d1af7f40891..eaf74a725e7 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -789,6 +789,7 @@ bool aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode, rtx, rtx);
 bool aarch64_masks_and_shift_for_bfi_p (scalar_int_mode, unsigned HOST_WIDE_INT,
 					unsigned HOST_WIDE_INT,
 					unsigned HOST_WIDE_INT);
+rtx aarch64_sve_reinterpret (machine_mode, rtx);
 bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
 bool aarch64_move_imm (unsigned HOST_WIDE_INT, machine_mode);
 machine_mode aarch64_sve_int_mode (machine_mode);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sme.cc b/gcc/config/aarch64/aarch64-sve-builtins-sme.cc
index 8d06a72f384..047a333ef47 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sme.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sme.cc
@@ -365,7 +365,8 @@ public:
   expand (function_expander &e) const override
   {
     machine_mode mode = e.vectors_per_tuple () == 4 ? VNx8DImode : VNx4DImode;
-    return e.use_exact_insn (code_for_aarch64_sme_read (mode));
+    rtx res = e.use_exact_insn (code_for_aarch64_sme_read (mode));
+    return aarch64_sve_reinterpret (e.result_mode (), res);
   }
 };
 
@@ -457,7 +458,7 @@ public:
   expand (function_expander &e) const override
   {
     machine_mode mode = e.vectors_per_tuple () == 4 ? VNx8DImode : VNx4DImode;
-    e.args[1] = lowpart_subreg (mode, e.args[1], e.tuple_mode (1));
+    e.args[1] = aarch64_sve_reinterpret (mode, e.args[1]);
     return e.use_exact_insn (code_for_aarch64_sme_write (mode));
   }
 };
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 2a64053f675..0889ceb7db1 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -3226,6 +3226,28 @@ aarch64_split_simd_move (rtx dst, rtx src)
     }
 }
 
+/* Return a register that contains SVE value X reinterpreted as SVE mode MODE.
+   The semantics are those of svreinterpret rather than those of subregs;
+   see the comment at the head of aarch64-sve.md for details about the
+   difference.  */
+
+rtx
+aarch64_sve_reinterpret (machine_mode mode, rtx x)
+{
+  if (GET_MODE (x) == mode)
+    return x;
+
+  /* can_change_mode_class must only return true if subregs and svreinterprets
+     have the same semantics.  */
+  if (targetm.can_change_mode_class (GET_MODE (x), mode, FP_REGS))
+    return lowpart_subreg (mode, x, GET_MODE (x));
+
+  rtx res = gen_reg_rtx (mode);
+  x = force_reg (GET_MODE (x), x);
+  emit_insn (gen_aarch64_sve_reinterpret (mode, res, x));
+  return res;
+}
+
 bool
 aarch64_zero_extend_const_eq (machine_mode xmode, rtx x,
 			      machine_mode ymode, rtx y)
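
For readers unfamiliar with the distinction the commit message draws, the following is a toy standalone model, not GCC code, of why a big-endian subreg can need real instructions while an svreinterpret-style cast does not. It assumes a fixed 128-bit register and assumes that a big-endian subreg behaves as if the value were stored with the source element size and reloaded with the destination element size. The names Reg, swap_within_elements, model_reinterpret and model_be_subreg are hypothetical and exist only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy 128-bit vector register; index 0 holds the least significant byte.
// Real SVE registers are length-agnostic; 16 bytes is an assumption here.
constexpr std::size_t kRegBytes = 16;
using Reg = std::array<std::uint8_t, kRegBytes>;

// Assumed model of a big-endian element-wise store or load: elements keep
// their array order, but the bytes inside each element are reversed.
// The permutation is its own inverse, so it serves for both directions.
Reg swap_within_elements (const Reg &in, std::size_t elt_bytes)
{
  assert (elt_bytes != 0 && kRegBytes % elt_bytes == 0);
  Reg out{};
  for (std::size_t i = 0; i < kRegBytes; i += elt_bytes)
    for (std::size_t k = 0; k < elt_bytes; ++k)
      out[i + k] = in[i + (elt_bytes - 1 - k)];
  return out;
}

// svreinterpret-style cast in this model: the register bits are unchanged.
Reg model_reinterpret (const Reg &x)
{
  return x;
}

// Subreg-style cast in this model: as if the value were stored with
// FROM_BYTES-sized elements and reloaded with TO_BYTES-sized elements.
Reg model_be_subreg (const Reg &x, std::size_t from_bytes,
		     std::size_t to_bytes)
{
  return swap_within_elements (swap_within_elements (x, from_bytes),
			       to_bytes);
}
```

In this model the two casts agree when the element sizes match and disagree when they differ (for example 8-byte to 2-byte elements), which is the kind of case where the patch falls back to an explicit reinterpret instruction instead of lowpart_subreg.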