From patchwork Tue May 22 13:13:22 2018
X-Patchwork-Submitter: Kyrill Tkachov
X-Patchwork-Id: 918310
Message-ID: <5B041772.80209@foss.arm.com>
Date: Tue, 22 May 2018 14:13:22 +0100
From: Kyrill Tkachov
To: "gcc-patches@gcc.gnu.org"
CC: Marcus Shawcroft, "Richard Earnshaw (lists)", James Greenhalgh
Subject: [PATCH][AArch64] Merge stores of D-register values with different modes

[sending on behalf of Jackson Woodruff]

Hi all,

This patch merges loads and stores from D-registers that are of different modes.

Code like this:

    typedef int __attribute__((vector_size(8))) vec;
    struct pair
    {
      vec v;
      double d;
    };

    void
    assign (struct pair *p, vec v)
    {
      p->v = v;
      p->d = 1.0;
    }

This now generates a store pair (`stp`) instruction, whereas previously it generated two `str` instructions.
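For readers who want to see why the two stores in the `struct pair` example are candidates for one store pair, here is a minimal standalone sketch. It is not part of the patch. The helper names `adjacent_same_size`, `pair_layout_ok` and `check_pair_layout` are hypothetical, and the code assumes a compiler that supports the `vector_size` attribute (GCC or Clang). It only checks the layout precondition: both fields have the same size and sit at consecutive offsets.

```c
#include <assert.h>
#include <stddef.h>

typedef int __attribute__((vector_size(8))) vec;

struct pair
{
  vec v;
  double d;
};

/* Hypothetical helper, not from the patch: returns 1 when both accesses
   have the same size and the second starts where the first ends.  */
static int
adjacent_same_size (size_t off1, size_t size1, size_t off2, size_t size2)
{
  return size1 == size2 && off1 + size1 == off2;
}

/* Returns 1 when the two fields of struct pair have a layout that a
   single paired store could cover.  */
int
pair_layout_ok (void)
{
  return adjacent_same_size (offsetof (struct pair, v), sizeof (vec),
                             offsetof (struct pair, d), sizeof (double));
}

/* Aborts if the layout assumption does not hold for the target ABI.  */
void
check_pair_layout (void)
{
  assert (pair_layout_ok ());
}
```

Calling `check_pair_layout ()` from a small driver would confirm the assumption on a given target. Whether the compiler actually emits `stp` still depends on the patterns and peepholes in the patch.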
This patch also merges stores of double zero values with stores of long integer values:

    struct pair
    {
      long long l;
      double d;
    };

    void
    foo (struct pair *p)
    {
      p->l = 10;
      p->d = 0.0;
    }

This now generates a single store pair instruction rather than two `str` instructions.

The patch generalises the mode iterators on the patterns in aarch64.md and the peepholes in aarch64-ldpstp.md to take all combinations of pairs of modes, so while it is a large-ish patch, the changes are fairly mechanical.

Bootstrap and testsuite run OK.

OK for trunk?

Jackson

2018-05-22  Jackson Woodruff
            Kyrylo Tkachov

    * config/aarch64/aarch64.md: New patterns to generate stp
    and ldp.
    (store_pair_sw, store_pair_dw): New patterns to generate stp for
    single words and double words.
    (load_pair_sw, load_pair_dw): Likewise.
    (store_pair_sf, store_pair_df, store_pair_si, store_pair_di):
    Delete.
    (load_pair_sf, load_pair_df, load_pair_si, load_pair_di):
    Delete.
    * config/aarch64/aarch64-ldpstp.md: Modify peephole
    for different mode ldpstp and add peephole for merged zero stores.
    Likewise for loads.
    * config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp):
    Add size check.
    (aarch64_gen_store_pair): Rename calls to match new patterns.
    (aarch64_gen_load_pair): Rename calls to match new patterns.
    * config/aarch64/aarch64-simd.md (load_pair): Rename to...
    (load_pair): ... This.
    (store_pair): Rename to...
    (vec_store_pair): ... This.
    * config/aarch64/iterators.md (DREG, DREG2, DX2, SX, SX2, DSX):
    New mode iterators.
    (V_INT_EQUIV): Handle SImode.
    * config/aarch64/predicates.md (aarch64_reg_zero_or_fp_zero):
    New predicate.

2018-05-22  Jackson Woodruff

    * gcc.target/aarch64/ldp_stp_6.c: New.
    * gcc.target/aarch64/ldp_stp_7.c: New.
    * gcc.target/aarch64/ldp_stp_8.c: New.

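As a rough model of the conditions the ChangeLog mentions (the size check in aarch64_operands_ok_for_ldpstp and the ordering of the two accesses by offset in the peepholes), here is a simplified C sketch. It is an illustration under stated assumptions, not the GCC implementation. `struct store`, `order_by_offset` and `can_form_stp` are hypothetical names, the real function checks considerably more than this, and the patch uses an assertion for the size check where this sketch simply returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one store: byte offset from a shared base
   register and access size in bytes.  */
struct store
{
  long offset;
  size_t size;
};

/* Put the lower-addressed access first, mirroring the operand swap
   done in the peephole bodies.  */
static void
order_by_offset (struct store *a, struct store *b)
{
  if (a->offset > b->offset)
    {
      struct store tmp = *a;
      *a = *b;
      *b = tmp;
    }
}

/* Simplified pairing test: same access size and adjacent offsets.
   The modes themselves may differ (for example DI and DF).  */
static bool
can_form_stp (struct store a, struct store b)
{
  if (a.size != b.size)
    return false;
  order_by_offset (&a, &b);
  return a.offset + (long) a.size == b.offset;
}

/* Small self-check covering the cases described in the message.  */
void
check_examples (void)
{
  /* long long at offset 0, double zero at offset 8.  */
  assert (can_form_stp ((struct store) { 0, 8 }, (struct store) { 8, 8 }));
  /* Same accesses seen in the opposite order.  */
  assert (can_form_stp ((struct store) { 8, 8 }, (struct store) { 0, 8 }));
  /* Mixed sizes are never paired.  */
  assert (!can_form_stp ((struct store) { 0, 4 }, (struct store) { 4, 8 }));
}
```

The model deliberately ignores register classes, offset ranges and volatility, which the real code has to consider before forming a pair.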
diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md
index c008477c741d20f530b66ad9420113fa84a54ebb..f6fe8a6a93b5466723e3ed6b892e0ec1e67ee89d 100644
--- a/gcc/config/aarch64/aarch64-ldpstp.md
+++ b/gcc/config/aarch64/aarch64-ldpstp.md
@@ -99,11 +99,11 @@ (define_peephole2
 })
 
 (define_peephole2
-  [(set (match_operand:VD 0 "register_operand" "")
-        (match_operand:VD 1 "aarch64_mem_pair_operand" ""))
-   (set (match_operand:VD 2 "register_operand" "")
-        (match_operand:VD 3 "memory_operand" ""))]
-  "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
+  [(set (match_operand:DREG 0 "register_operand" "")
+        (match_operand:DREG 1 "aarch64_mem_pair_operand" ""))
+   (set (match_operand:DREG2 2 "register_operand" "")
+        (match_operand:DREG2 3 "memory_operand" ""))]
+  "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
   [(parallel [(set (match_dup 0) (match_dup 1))
               (set (match_dup 2) (match_dup 3))])]
 {
@@ -119,11 +119,12 @@ (define_peephole2
 })
 
 (define_peephole2
-  [(set (match_operand:VD 0 "aarch64_mem_pair_operand" "")
-        (match_operand:VD 1 "register_operand" ""))
-   (set (match_operand:VD 2 "memory_operand" "")
-        (match_operand:VD 3 "register_operand" ""))]
-  "TARGET_SIMD && aarch64_operands_ok_for_ldpstp (operands, false, mode)"
+  [(set (match_operand:DREG 0 "aarch64_mem_pair_operand" "")
+        (match_operand:DREG 1 "register_operand" ""))
+   (set (match_operand:DREG2 2 "memory_operand" "")
+        (match_operand:DREG2 3 "register_operand" ""))]
+  "TARGET_SIMD
+   && aarch64_operands_ok_for_ldpstp (operands, false, mode)"
   [(parallel [(set (match_dup 0) (match_dup 1))
               (set (match_dup 2) (match_dup 3))])]
 {
@@ -138,7 +139,6 @@ (define_peephole2
     }
 })
 
-
 ;; Handle sign/zero extended consecutive load/store.
 
 (define_peephole2
@@ -181,6 +181,36 @@ (define_peephole2
     }
 })
 
+;; Handle storing of a floating point zero with integer data.
+;; This handles cases like:
+;;   struct pair { int a; float b; }
+;;
+;;   p->a = 1;
+;;   p->b = 0.0;
+;;
+;; We can match modes that won't work for a stp instruction
+;; as aarch64_operands_ok_for_ldpstp checks that the modes are
+;; compatible.
+(define_peephole2
+  [(set (match_operand:DSX 0 "aarch64_mem_pair_operand" "")
+        (match_operand:DSX 1 "aarch64_reg_zero_or_fp_zero" ""))
+   (set (match_operand: 2 "memory_operand" "")
+        (match_operand: 3 "aarch64_reg_zero_or_fp_zero" ""))]
+  "aarch64_operands_ok_for_ldpstp (operands, false, mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+              (set (match_dup 2) (match_dup 3))])]
+{
+  rtx base, offset_1, offset_2;
+
+  extract_base_offset_in_addr (operands[0], &base, &offset_1);
+  extract_base_offset_in_addr (operands[2], &base, &offset_2);
+  if (INTVAL (offset_1) > INTVAL (offset_2))
+    {
+      std::swap (operands[0], operands[2]);
+      std::swap (operands[1], operands[3]);
+    }
+})
+
 ;; Handle consecutive load/store whose offset is out of the range
 ;; supported by ldp/ldpsw/stp.  We firstly adjust offset in a scratch
 ;; register, then merge them into ldp/ldpsw/stp by using the adjusted
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index e96afeaf2a5cbca631c0f7f7731b59f2d986df4f..82588a706016da8207eb5aafdf076aea8e9db0fd 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -177,30 +177,30 @@ (define_insn "aarch64_store_lane0"
   [(set_attr "type" "neon_store1_1reg")]
 )
 
-(define_insn "load_pair"
-  [(set (match_operand:VD 0 "register_operand" "=w")
-        (match_operand:VD 1 "aarch64_mem_pair_operand" "Ump"))
-   (set (match_operand:VD 2 "register_operand" "=w")
-        (match_operand:VD 3 "memory_operand" "m"))]
+(define_insn "load_pair"
+  [(set (match_operand:DREG 0 "register_operand" "=w")
+        (match_operand:DREG 1 "aarch64_mem_pair_operand" "Ump"))
+   (set (match_operand:DREG2 2 "register_operand" "=w")
+        (match_operand:DREG2 3 "memory_operand" "m"))]
   "TARGET_SIMD
    && rtx_equal_p (XEXP (operands[3], 0),
                    plus_constant (Pmode,
                                   XEXP (operands[1], 0),
-                                  GET_MODE_SIZE (mode)))"
+                                  GET_MODE_SIZE (mode)))"
   "ldp\\t%d0, %d2, %1"
   [(set_attr "type" "neon_ldp")]
 )
 
-(define_insn "store_pair"
-  [(set (match_operand:VD 0 "aarch64_mem_pair_operand" "=Ump")
-        (match_operand:VD 1 "register_operand" "w"))
-   (set (match_operand:VD 2 "memory_operand" "=m")
-        (match_operand:VD 3 "register_operand" "w"))]
+(define_insn "vec_store_pair"
+  [(set (match_operand:DREG 0 "aarch64_mem_pair_operand" "=Ump")
+        (match_operand:DREG 1 "register_operand" "w"))
+   (set (match_operand:DREG2 2 "memory_operand" "=m")
+        (match_operand:DREG2 3 "register_operand" "w"))]
   "TARGET_SIMD
    && rtx_equal_p (XEXP (operands[2], 0),
                    plus_constant (Pmode,
                                   XEXP (operands[0], 0),
-                                  GET_MODE_SIZE (mode)))"
+                                  GET_MODE_SIZE (mode)))"
   "stp\\t%d1, %d3, %0"
   [(set_attr "type" "neon_stp")]
 )
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 6bf6c05535b61eef1021d46bcd8448fb3a0b25f4..b75a588eb9aa49b1796161d34494d452a6742e4d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4279,10 +4279,10 @@ aarch64_gen_store_pair (machine_mode mode, rtx mem1, rtx reg1, rtx mem2,
   switch (mode)
     {
     case E_DImode:
-      return gen_store_pairdi (mem1, reg1, mem2, reg2);
+      return gen_store_pair_dw_didi (mem1, reg1, mem2, reg2);
 
     case E_DFmode:
-      return gen_store_pairdf (mem1, reg1, mem2, reg2);
+      return gen_store_pair_dw_dfdf (mem1, reg1, mem2, reg2);
 
     default:
       gcc_unreachable ();
@@ -4299,10 +4299,10 @@ aarch64_gen_load_pair (machine_mode mode, rtx reg1, rtx mem1, rtx reg2,
   switch (mode)
     {
     case E_DImode:
-      return gen_load_pairdi (reg1, mem1, reg2, mem2);
+      return gen_load_pair_dw_didi (reg1, mem1, reg2, mem2);
 
     case E_DFmode:
-      return gen_load_pairdf (reg1, mem1, reg2, mem2);
+      return gen_load_pair_dw_dfdf (reg1, mem1, reg2, mem2);
 
     default:
       gcc_unreachable ();
@@ -16853,6 +16853,10 @@ aarch64_operands_ok_for_ldpstp (rtx *operands, bool load,
   if (!rtx_equal_p (base_1, base_2))
     return false;
 
+  /* The operands must be of the same size.  */
+  gcc_assert (known_eq (GET_MODE_SIZE (GET_MODE (mem_1)),
+                        GET_MODE_SIZE (GET_MODE (mem_2))));
+
   offval_1 = INTVAL (offset_1);
   offval_2 = INTVAL (offset_2);
   /* We should only be trying this for fixed-sized modes.  There is no
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1277e489185b4f1c8bb8f31da4c4b53d19a2249c..d1290563255d64d05c94771af13e45c002111949 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1309,15 +1309,15 @@ (define_expand "movmemdi"
 
 ;; Operands 1 and 3 are tied together by the final condition; so we allow
 ;; fairly lax checking on the second memory operation.
-(define_insn "load_pairsi"
-  [(set (match_operand:SI 0 "register_operand" "=r,w")
-        (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump,Ump"))
-   (set (match_operand:SI 2 "register_operand" "=r,w")
-        (match_operand:SI 3 "memory_operand" "m,m"))]
-  "rtx_equal_p (XEXP (operands[3], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[1], 0),
-                               GET_MODE_SIZE (SImode)))"
+(define_insn "load_pair_sw_"
+  [(set (match_operand:SX 0 "register_operand" "=r,w")
+        (match_operand:SX 1 "aarch64_mem_pair_operand" "Ump,Ump"))
+   (set (match_operand:SX2 2 "register_operand" "=r,w")
+        (match_operand:SX2 3 "memory_operand" "m,m"))]
+  "rtx_equal_p (XEXP (operands[3], 0),
+                plus_constant (Pmode,
+                               XEXP (operands[1], 0),
+                               GET_MODE_SIZE (mode)))"
   "@
    ldp\\t%w0, %w2, %1
    ldp\\t%s0, %s2, %1"
@@ -1325,15 +1325,16 @@ (define_insn "load_pairsi"
    (set_attr "fp" "*,yes")]
 )
 
-(define_insn "load_pairdi"
-  [(set (match_operand:DI 0 "register_operand" "=r,w")
-        (match_operand:DI 1 "aarch64_mem_pair_operand" "Ump,Ump"))
-   (set (match_operand:DI 2 "register_operand" "=r,w")
-        (match_operand:DI 3 "memory_operand" "m,m"))]
-  "rtx_equal_p (XEXP (operands[3], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[1], 0),
-                               GET_MODE_SIZE (DImode)))"
+;; Storing different modes that can still be merged
+(define_insn "load_pair_dw_"
+  [(set (match_operand:DX 0 "register_operand" "=r,w")
+        (match_operand:DX 1 "aarch64_mem_pair_operand" "Ump,Ump"))
+   (set (match_operand:DX2 2 "register_operand" "=r,w")
+        (match_operand:DX2 3 "memory_operand" "m,m"))]
+  "rtx_equal_p (XEXP (operands[3], 0),
+                plus_constant (Pmode,
+                               XEXP (operands[1], 0),
+                               GET_MODE_SIZE (mode)))"
   "@
    ldp\\t%x0, %x2, %1
    ldp\\t%d0, %d2, %1"
@@ -1341,18 +1342,17 @@ (define_insn "load_pairdi"
    (set_attr "fp" "*,yes")]
 )
 
-
 ;; Operands 0 and 2 are tied together by the final condition; so we allow
 ;; fairly lax checking on the second memory operation.
-(define_insn "store_pairsi"
-  [(set (match_operand:SI 0 "aarch64_mem_pair_operand" "=Ump,Ump")
-        (match_operand:SI 1 "aarch64_reg_or_zero" "rZ,w"))
-   (set (match_operand:SI 2 "memory_operand" "=m,m")
-        (match_operand:SI 3 "aarch64_reg_or_zero" "rZ,w"))]
-  "rtx_equal_p (XEXP (operands[2], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[0], 0),
-                               GET_MODE_SIZE (SImode)))"
+(define_insn "store_pair_sw_"
+  [(set (match_operand:SX 0 "aarch64_mem_pair_operand" "=Ump,Ump")
+        (match_operand:SX 1 "aarch64_reg_zero_or_fp_zero" "rYZ,w"))
+   (set (match_operand:SX2 2 "memory_operand" "=m,m")
+        (match_operand:SX2 3 "aarch64_reg_zero_or_fp_zero" "rYZ,w"))]
+  "rtx_equal_p (XEXP (operands[2], 0),
+                plus_constant (Pmode,
+                               XEXP (operands[0], 0),
+                               GET_MODE_SIZE (mode)))"
   "@
    stp\\t%w1, %w3, %0
    stp\\t%s1, %s3, %0"
@@ -1360,15 +1360,16 @@ (define_insn "store_pairsi"
    (set_attr "fp" "*,yes")]
 )
 
-(define_insn "store_pairdi"
-  [(set (match_operand:DI 0 "aarch64_mem_pair_operand" "=Ump,Ump")
-        (match_operand:DI 1 "aarch64_reg_or_zero" "rZ,w"))
-   (set (match_operand:DI 2 "memory_operand" "=m,m")
-        (match_operand:DI 3 "aarch64_reg_or_zero" "rZ,w"))]
-  "rtx_equal_p (XEXP (operands[2], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[0], 0),
-                               GET_MODE_SIZE (DImode)))"
+;; Storing different modes that can still be merged
+(define_insn "store_pair_dw_"
+  [(set (match_operand:DX 0 "aarch64_mem_pair_operand" "=Ump,Ump")
+        (match_operand:DX 1 "aarch64_reg_zero_or_fp_zero" "rYZ,w"))
+   (set (match_operand:DX2 2 "memory_operand" "=m,m")
+        (match_operand:DX2 3 "aarch64_reg_zero_or_fp_zero" "rYZ,w"))]
+  "rtx_equal_p (XEXP (operands[2], 0),
+                plus_constant (Pmode,
+                               XEXP (operands[0], 0),
+                               GET_MODE_SIZE (mode)))"
   "@
    stp\\t%x1, %x3, %0
    stp\\t%d1, %d3, %0"
@@ -1376,74 +1377,6 @@ (define_insn "store_pairdi"
    (set_attr "fp" "*,yes")]
 )
 
-;; Operands 1 and 3 are tied together by the final condition; so we allow
-;; fairly lax checking on the second memory operation.
-(define_insn "load_pairsf"
-  [(set (match_operand:SF 0 "register_operand" "=w,r")
-        (match_operand:SF 1 "aarch64_mem_pair_operand" "Ump,Ump"))
-   (set (match_operand:SF 2 "register_operand" "=w,r")
-        (match_operand:SF 3 "memory_operand" "m,m"))]
-  "rtx_equal_p (XEXP (operands[3], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[1], 0),
-                               GET_MODE_SIZE (SFmode)))"
-  "@
-   ldp\\t%s0, %s2, %1
-   ldp\\t%w0, %w2, %1"
-  [(set_attr "type" "neon_load1_2reg,load_8")
-   (set_attr "fp" "yes,*")]
-)
-
-(define_insn "load_pairdf"
-  [(set (match_operand:DF 0 "register_operand" "=w,r")
-        (match_operand:DF 1 "aarch64_mem_pair_operand" "Ump,Ump"))
-   (set (match_operand:DF 2 "register_operand" "=w,r")
-        (match_operand:DF 3 "memory_operand" "m,m"))]
-  "rtx_equal_p (XEXP (operands[3], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[1], 0),
-                               GET_MODE_SIZE (DFmode)))"
-  "@
-   ldp\\t%d0, %d2, %1
-   ldp\\t%x0, %x2, %1"
-  [(set_attr "type" "neon_load1_2reg,load_16")
-   (set_attr "fp" "yes,*")]
-)
-
-;; Operands 0 and 2 are tied together by the final condition; so we allow
-;; fairly lax checking on the second memory operation.
-(define_insn "store_pairsf"
-  [(set (match_operand:SF 0 "aarch64_mem_pair_operand" "=Ump,Ump")
-        (match_operand:SF 1 "aarch64_reg_or_fp_zero" "w,rY"))
-   (set (match_operand:SF 2 "memory_operand" "=m,m")
-        (match_operand:SF 3 "aarch64_reg_or_fp_zero" "w,rY"))]
-  "rtx_equal_p (XEXP (operands[2], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[0], 0),
-                               GET_MODE_SIZE (SFmode)))"
-  "@
-   stp\\t%s1, %s3, %0
-   stp\\t%w1, %w3, %0"
-  [(set_attr "type" "neon_store1_2reg,store_8")
-   (set_attr "fp" "yes,*")]
-)
-
-(define_insn "store_pairdf"
-  [(set (match_operand:DF 0 "aarch64_mem_pair_operand" "=Ump,Ump")
-        (match_operand:DF 1 "aarch64_reg_or_fp_zero" "w,rY"))
-   (set (match_operand:DF 2 "memory_operand" "=m,m")
-        (match_operand:DF 3 "aarch64_reg_or_fp_zero" "w,rY"))]
-  "rtx_equal_p (XEXP (operands[2], 0),
-                plus_constant (Pmode,
-                               XEXP (operands[0], 0),
-                               GET_MODE_SIZE (DFmode)))"
-  "@
-   stp\\t%d1, %d3, %0
-   stp\\t%x1, %x3, %0"
-  [(set_attr "type" "neon_store1_2reg,store_16")
-   (set_attr "fp" "yes,*")]
-)
-
 ;; Load pair with post-index writeback.  This is primarily used in function
 ;; epilogues.
 (define_insn "loadwb_pair_"
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 36c8aa81dd700f5df48df0184cbfd6c79d2dafbf..29c198e22f79510018e3bc79b499863f857cc7a2 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -69,6 +69,12 @@ (define_mode_iterator VSDQ_I_DI [V8QI V16QI V4HI V8HI V2SI V4SI V2DI DI])
 ;; Double vector modes.
 (define_mode_iterator VD [V8QI V4HI V4HF V2SI V2SF])
 
+;; All modes stored in registers d0-d31.
+(define_mode_iterator DREG [V8QI V4HI V4HF V2SI V2SF DF])
+
+;; Copy of the above.
+(define_mode_iterator DREG2 [V8QI V4HI V4HF V2SI V2SF DF])
+
 ;; Advanced SIMD, 64-bit container, all integer modes.
 (define_mode_iterator VD_BHSI [V8QI V4HI V2SI])
 
@@ -236,6 +242,19 @@ (define_mode_iterator VSTRUCT [OI CI XI])
 ;; Double scalar modes
 (define_mode_iterator DX [DI DF])
 
+;; Duplicate of the above
+(define_mode_iterator DX2 [DI DF])
+
+;; Single scalar modes
+(define_mode_iterator SX [SI SF])
+
+;; Duplicate of the above
+(define_mode_iterator SX2 [SI SF])
+
+;; Single and double integer and float modes
+(define_mode_iterator DSX [DF DI SF SI])
+
+
 ;; Modes available for Advanced SIMD mul lane operations.
 (define_mode_iterator VMUL [V4HI V8HI V2SI V4SI
                             (V4HF "TARGET_SIMD_F16INST")
@@ -855,7 +874,8 @@ (define_mode_attr V_INT_EQUIV [(V8QI "V8QI") (V16QI "V16QI")
                                (V4HF "V4HI") (V8HF "V8HI")
                                (V2SF "V2SI") (V4SF "V4SI")
                                (DF "DI") (V2DF "V2DI")
-                               (SF "SI") (HF "HI")
+                               (SF "SI") (SI "SI")
+                               (HF "HI")
                                (VNx16QI "VNx16QI")
                                (VNx8HI "VNx8HI") (VNx8HF "VNx8HI")
                                (VNx4SI "VNx4SI") (VNx4SF "VNx4SI")
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 5d41d4350402b2a9e5941f160c6ab6f933bfff90..7aec76d681f5eca87b7b5e1d63d12dc0205ad113 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -62,6 +62,10 @@ (define_predicate "aarch64_reg_or_fp_zero"
          (and (match_code "const_double")
               (match_test "aarch64_float_const_zero_rtx_p (op)"))))
 
+(define_predicate "aarch64_reg_zero_or_fp_zero"
+  (ior (match_operand 0 "aarch64_reg_or_fp_zero")
+       (match_operand 0 "aarch64_reg_or_zero")))
+
 (define_predicate "aarch64_reg_zero_or_m1_or_1"
   (and (match_code "reg,subreg,const_int")
        (ior (match_operand 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c
new file mode 100644
index 0000000000000000000000000000000000000000..2d982f3389b668f2042d48ba3db04e619fd999f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c
@@ -0,0 +1,20 @@
+/* { dg-options "-O2" } */
+
+typedef float __attribute__ ((vector_size (8))) vec;
+
+struct pair
+{
+  vec e1;
+  double e2;
+};
+
+vec tmp;
+
+void
+stp (struct pair *p)
+{
+  p->e1 = tmp;
+  p->e2 = 1.0;
+
+  /* { dg-final { scan-assembler "stp\td\[0-9\]+, d\[0-9\]+, \\\[x\[0-9\]+\\\]" } } */
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c
new file mode 100644
index 0000000000000000000000000000000000000000..be24c8a97009a77b510e2b1caee754a75bb9c3d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c
@@ -0,0 +1,47 @@
+/* { dg-options "-O2" } */
+
+struct pair
+{
+  double a;
+  long long int b;
+};
+
+void
+stp (struct pair *p)
+{
+  p->a = 0.0;
+  p->b = 1;
+}
+
+/* { dg-final { scan-assembler "stp\txzr, x\[0-9\]+, \\\[x\[0-9\]+\\\]" } } */
+
+void
+stp2 (struct pair *p)
+{
+  p->a = 0.0;
+  p->b = 0;
+}
+
+struct reverse_pair
+{
+  long long int a;
+  double b;
+};
+
+void
+stp_reverse (struct reverse_pair *p)
+{
+  p->a = 1;
+  p->b = 0.0;
+}
+
+/* { dg-final { scan-assembler "stp\tx\[0-9\]+, xzr, \\\[x\[0-9\]+\\\]" } } */
+
+void
+stp_reverse2 (struct reverse_pair *p)
+{
+  p->a = 0;
+  p->b = 0.0;
+}
+
+/* { dg-final { scan-assembler-times "stp\txzr, xzr, \\\[x\[0-9\]+\\\]" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c
new file mode 100644
index 0000000000000000000000000000000000000000..2d9cb6b19d563a7e1bb11c174f6d642291260f29
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c
@@ -0,0 +1,30 @@
+/* { dg-options "-O2" } */
+
+typedef float __attribute__ ((vector_size (8))) fvec;
+typedef int __attribute__ ((vector_size (8))) ivec;
+
+struct pair
+{
+  double a;
+  fvec b;
+};
+
+void ldp (double *a, fvec *b, struct pair *p)
+{
+  *a = p->a + 1;
+  *b = p->b;
+}
+
+struct vec_pair
+{
+  fvec a;
+  ivec b;
+};
+
+void ldp2 (fvec *a, ivec *b, struct vec_pair *p)
+{
+  *a = p->a;
+  *b = p->b;
+}
+
+/* { dg-final { scan-assembler-times "ldp\td\[0-9\], d\[0-9\]+, \\\[x\[0-9\]+\\\]" 2 } } */