From patchwork Fri Oct 18 14:22:17 2024
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com,
 jeffreyalaw@gmail.com, ams@baylibre.com
Subject: [PATCH v2 5/8] aarch64: Add masked-load else operands.
Date: Fri, 18 Oct 2024 16:22:17 +0200
Message-ID: <20241018142220.173482-6-rdapp@ventanamicro.com>
In-Reply-To: <20241018142220.173482-1-rdapp@ventanamicro.com>
References: <20241018142220.173482-1-rdapp@ventanamicro.com>

This adds zero else operands to masked loads and their intrinsics.

I needed to adjust more than initially expected because we rely on
combine for several instructions, so a change in a "base" pattern
needs to propagate to all of them.

For lack of a better idea I used a function call property to specify
whether a builtin needs an else operand or not.  Somebody with better
knowledge of the aarch64 target can surely improve that.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve-builtins-base.cc: Add else handling.
	* config/aarch64/aarch64-sve-builtins.cc
	(function_expander::use_contiguous_load_insn): Ditto.
	* config/aarch64/aarch64-sve-builtins.h: Add "has else".
	* config/aarch64/aarch64-sve.md
	(*aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>_mov):
	Add else operands.
	* config/aarch64/aarch64-sve2.md: Ditto.
	* config/aarch64/predicates.md (aarch64_maskload_else_operand):
	Add zero else operand.
---
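As a gimple-level illustration of the change (a rough sketch only,
SSA names made up): the masked-load internal function gains a fourth
argument that defines the inactive lanes, and we always pass zero:

    vect_1 = .MASK_LOAD (addr_2, 32B, mask_3);                before
    vect_1 = .MASK_LOAD (addr_2, 32B, mask_3, { 0, ... });    after

The "{ 0, ... }" else value makes it explicit that inactive lanes of
the result are zero rather than unspecified.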
 .../aarch64/aarch64-sve-builtins-base.cc      | 58 ++++++++++++++-----
 gcc/config/aarch64/aarch64-sve-builtins.cc    |  5 ++
 gcc/config/aarch64/aarch64-sve-builtins.h     |  1 +
 gcc/config/aarch64/aarch64-sve.md             | 47 +++++++++++++--
 gcc/config/aarch64/aarch64-sve2.md            |  3 +-
 gcc/config/aarch64/predicates.md              |  4 ++
 6 files changed, 98 insertions(+), 20 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 1c17149e1f0..08d2fb796dd 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1476,7 +1476,7 @@ public:
   unsigned int
   call_properties (const function_instance &) const override
   {
-    return CP_READ_MEMORY;
+    return CP_READ_MEMORY | CP_HAS_ELSE;
   }
 
   gimple *
@@ -1491,11 +1491,12 @@ public:
     gimple_seq stmts = NULL;
     tree pred = f.convert_pred (stmts, vectype, 0);
     tree base = f.fold_contiguous_base (stmts, vectype);
+    tree els = build_zero_cst (vectype);
     gsi_insert_seq_before (f.gsi, stmts, GSI_SAME_STMT);
 
     tree cookie = f.load_store_cookie (TREE_TYPE (vectype));
-    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD, 3,
-						  base, cookie, pred);
+    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD, 4,
+						  base, cookie, pred, els);
     gimple_call_set_lhs (new_call, f.lhs);
     return new_call;
   }
@@ -1505,10 +1506,16 @@ public:
   {
     insn_code icode;
     if (e.vectors_per_tuple () == 1)
-      icode = convert_optab_handler (maskload_optab,
-				     e.vector_mode (0), e.gp_mode (0));
+      {
+	icode = convert_optab_handler (maskload_optab,
+				       e.vector_mode (0), e.gp_mode (0));
+	e.args.quick_push (CONST0_RTX (e.vector_mode (0)));
+      }
     else
-      icode = code_for_aarch64 (UNSPEC_LD1_COUNT, e.tuple_mode (0));
+      {
+	icode = code_for_aarch64 (UNSPEC_LD1_COUNT, e.tuple_mode (0));
+	e.args.quick_push (CONST0_RTX (e.tuple_mode (0)));
+      }
     return e.use_contiguous_load_insn (icode);
   }
 };
@@ -1519,12 +1526,20 @@ class svld1_extend_impl : public extending_load
 public:
   using extending_load::extending_load;
 
+  unsigned int
+  call_properties (const function_instance &) const override
+  {
+    return CP_READ_MEMORY | CP_HAS_ELSE;
+  }
+
   rtx
   expand (function_expander &e) const override
   {
     insn_code icode = code_for_aarch64_load (UNSPEC_LD1_SVE,
					      extend_rtx_code (),
					      e.vector_mode (0),
					      e.memory_vector_mode ());
+    /* Add the else operand.  */
+    e.args.quick_push (CONST0_RTX (e.vector_mode (1)));
     return e.use_contiguous_load_insn (icode);
   }
 };
@@ -1535,7 +1550,7 @@ public:
   unsigned int
   call_properties (const function_instance &) const override
   {
-    return CP_READ_MEMORY;
+    return CP_READ_MEMORY | CP_HAS_ELSE;
   }
 
   rtx
@@ -1544,6 +1559,8 @@ public:
     e.prepare_gather_address_operands (1);
     /* Put the predicate last, as required by mask_gather_load_optab.  */
     e.rotate_inputs_left (0, 5);
+    /* Add the else operand.  */
+    e.args.quick_push (CONST0_RTX (e.vector_mode (0)));
     machine_mode mem_mode = e.memory_vector_mode ();
     machine_mode int_mode = aarch64_sve_int_mode (mem_mode);
     insn_code icode = convert_optab_handler (mask_gather_load_optab,
@@ -1567,6 +1584,8 @@ public:
     e.rotate_inputs_left (0, 5);
     /* Add a constant predicate for the extension rtx.  */
     e.args.quick_push (CONSTM1_RTX (VNx16BImode));
+    /* Add the else operand.  */
+    e.args.quick_push (CONST0_RTX (e.vector_mode (1)));
     insn_code icode = code_for_aarch64_gather_load (extend_rtx_code (),
						    e.vector_mode (0),
						    e.memory_vector_mode ());
@@ -1697,7 +1716,7 @@ public:
   unsigned int
   call_properties (const function_instance &) const override
   {
-    return CP_READ_MEMORY;
+    return CP_READ_MEMORY | CP_HAS_ELSE;
   }
 
   gimple *
@@ -1709,6 +1728,7 @@ public:
     /* Get the predicate and base pointer.  */
     gimple_seq stmts = NULL;
     tree pred = f.convert_pred (stmts, vectype, 0);
+    tree els = build_zero_cst (vectype);
     tree base = f.fold_contiguous_base (stmts, vectype);
     gsi_insert_seq_before (f.gsi, stmts, GSI_SAME_STMT);
 
@@ -1727,8 +1747,8 @@ public:
 
     /* Emit the load itself.  */
     tree cookie = f.load_store_cookie (TREE_TYPE (vectype));
-    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 3,
-						  base, cookie, pred);
+    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 4,
+						  base, cookie, pred, els);
     gimple_call_set_lhs (new_call, lhs_array);
     gsi_insert_after (f.gsi, new_call, GSI_SAME_STMT);
 
@@ -1741,6 +1761,7 @@ public:
     machine_mode tuple_mode = e.result_mode ();
     insn_code icode = convert_optab_handler (vec_mask_load_lanes_optab,
					     tuple_mode, e.vector_mode (0));
+    e.args.quick_push (CONST0_RTX (e.vector_mode (0)));
     return e.use_contiguous_load_insn (icode);
   }
 };
@@ -1802,16 +1823,23 @@ public:
   unsigned int
   call_properties (const function_instance &) const override
   {
-    return CP_READ_MEMORY;
+    return CP_READ_MEMORY | CP_HAS_ELSE;
   }
 
   rtx
   expand (function_expander &e) const override
   {
-    insn_code icode = (e.vectors_per_tuple () == 1
-		       ? code_for_aarch64_ldnt1 (e.vector_mode (0))
-		       : code_for_aarch64 (UNSPEC_LDNT1_COUNT,
-					   e.tuple_mode (0)));
+    insn_code icode;
+    if (e.vectors_per_tuple () == 1)
+      {
+	icode = code_for_aarch64_ldnt1 (e.vector_mode (0));
+	e.args.quick_push (CONST0_RTX (e.vector_mode (0)));
+      }
+    else
+      {
+	icode = code_for_aarch64 (UNSPEC_LDNT1_COUNT, e.tuple_mode (0));
+	e.args.quick_push (CONST0_RTX (e.tuple_mode (0)));
+      }
     return e.use_contiguous_load_insn (icode);
   }
 };
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index e7c703c987e..7214f1f5a3e 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -4207,6 +4207,11 @@ function_expander::use_contiguous_load_insn (insn_code icode)
   add_input_operand (icode, args[0]);
   if (GET_MODE_UNIT_BITSIZE (mem_mode) < type_suffix (0).element_bits)
     add_input_operand (icode, CONSTM1_RTX (VNx16BImode));
+
+  /* If we have an else operand, add it.  */
+  if (call_properties () & CP_HAS_ELSE)
+    add_input_operand (icode, args.last ());
+
   return generate_insn (icode);
 }
 
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index 645e56badbe..6cda8bd8a8c 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -103,6 +103,7 @@ const unsigned int CP_READ_ZA = 1U << 7;
 const unsigned int CP_WRITE_ZA = 1U << 8;
 const unsigned int CP_READ_ZT0 = 1U << 9;
 const unsigned int CP_WRITE_ZT0 = 1U << 10;
+const unsigned int CP_HAS_ELSE = 1U << 11;
 
 /* Enumerates the SVE predicate and (data) vector types, together called
    "vector types" for brevity.  */
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index 06bd3e4bb2c..1e12fa3c982 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -1291,7 +1291,8 @@ (define_insn "maskload<mode><vpred>"
   [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
	(unspec:SVE_ALL
	  [(match_operand:<VPRED> 2 "register_operand" "Upl")
-	   (match_operand:SVE_ALL 1 "memory_operand" "m")]
+	   (match_operand:SVE_ALL 1 "memory_operand" "m")
+	   (match_operand:SVE_ALL 3 "aarch64_maskload_else_operand")]
	  UNSPEC_LD1_SVE))]
   "TARGET_SVE"
   "ld1<Vesize>\t%0.<Vctype>, %2/z, %1"
@@ -1302,11 +1303,14 @@ (define_expand "vec_load_lanes<mode><vsingle>"
   [(set (match_operand:SVE_STRUCT 0 "register_operand")
	(unspec:SVE_STRUCT
	  [(match_dup 2)
-	   (match_operand:SVE_STRUCT 1 "memory_operand")]
+	   (match_operand:SVE_STRUCT 1 "memory_operand")
+	   (match_dup 3)
+	  ]
	  UNSPEC_LDN))]
   "TARGET_SVE"
   {
     operands[2] = aarch64_ptrue_reg (<VPRED>mode);
+    operands[3] = CONST0_RTX (<MODE>mode);
   }
 )
 
@@ -1315,7 +1319,8 @@ (define_insn "vec_mask_load_lanes<mode><vsingle>"
   [(set (match_operand:SVE_STRUCT 0 "register_operand" "=w")
	(unspec:SVE_STRUCT
	  [(match_operand:<VPRED> 2 "register_operand" "Upl")
-	   (match_operand:SVE_STRUCT 1 "memory_operand" "m")]
+	   (match_operand:SVE_STRUCT 1 "memory_operand" "m")
+	   (match_operand 3 "aarch64_maskload_else_operand")]
	  UNSPEC_LDN))]
   "TARGET_SVE"
   "ld<vesize>\t%0, %2/z, %1"
@@ -1335,6 +1340,27 @@ (define_insn "vec_mask_load_lanes<mode><vsingle>"
 
 ;; Predicated load and extend, with 8 elements per 128-bit block.
 (define_insn_and_rewrite "@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>"
+  [(set (match_operand:SVE_HSDI 0 "register_operand" "=w")
+	(unspec:SVE_HSDI
+	  [(match_operand:<SVE_HSDI:VPRED> 3 "general_operand" "UplDnm")
+	   (ANY_EXTEND:SVE_HSDI
+	     (unspec:SVE_PARTIAL_I
+	       [(match_operand:<SVE_PARTIAL_I:VPRED> 2 "register_operand" "Upl")
+		(match_operand:SVE_PARTIAL_I 1 "memory_operand" "m")
+		(match_operand:SVE_PARTIAL_I 4 "aarch64_maskload_else_operand")]
+	       SVE_PRED_LOAD))]
+	  UNSPEC_PRED_X))]
+  "TARGET_SVE && (~<SVE_HSDI:narrower_mask> & <SVE_PARTIAL_I:self_mask>) == 0"
+  "ld1<ANY_EXTEND:s><SVE_PARTIAL_I:Vesize>\t%0.<SVE_HSDI:Vctype>, %2/z, %1"
+  "&& !CONSTANT_P (operands[3])"
+  {
+    operands[3] = CONSTM1_RTX (<SVE_HSDI:VPRED>mode);
+  }
+)
+
+;; Same as above without the maskload_else_operand to still allow combine to
+;; match a sign-extended pred_mov pattern.
+(define_insn_and_rewrite "*aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>_mov"
   [(set (match_operand:SVE_HSDI 0 "register_operand" "=w")
	(unspec:SVE_HSDI
	  [(match_operand:<SVE_HSDI:VPRED> 3 "general_operand" "UplDnm")
@@ -1433,7 +1459,8 @@ (define_insn "@aarch64_ldnt1<mode>"
   [(set (match_operand:SVE_FULL 0 "register_operand" "=w")
	(unspec:SVE_FULL
	  [(match_operand:<VPRED> 2 "register_operand" "Upl")
-	   (match_operand:SVE_FULL 1 "memory_operand" "m")]
+	   (match_operand:SVE_FULL 1 "memory_operand" "m")
+	   (match_operand:SVE_FULL 3 "aarch64_maskload_else_operand")]
	  UNSPEC_LDNT1_SVE))]
   "TARGET_SVE"
   "ldnt1<Vesize>\t%0.<Vetype>, %2/z, %1"
@@ -1456,11 +1483,13 @@ (define_expand "gather_load<mode><v_int_container>"
	   (match_operand:<V_INT_CONTAINER> 2 "register_operand")
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_<Vesize>")
+	   (match_dup 6)
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
   {
     operands[5] = aarch64_ptrue_reg (<VPRED>mode);
+    operands[6] = CONST0_RTX (<MODE>mode);
   }
 )
 
@@ -1474,6 +1503,7 @@ (define_insn "mask_gather_load<mode><v_int_container>"
	   (match_operand:VNx4SI 2 "register_operand")
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_<Vesize>")
+	   (match_operand:SVE_4 6 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1503,6 +1533,7 @@ (define_insn "mask_gather_load<mode><v_int_container>"
	   (match_operand:VNx2DI 2 "register_operand")
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_<Vesize>")
+	   (match_operand:SVE_2 6 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1531,6 +1562,7 @@ (define_insn_and_rewrite "*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked"
	     UNSPEC_PRED_X)
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_<Vesize>")
+	   (match_operand:SVE_2 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1561,6 +1593,7 @@ (define_insn_and_rewrite "*mask_gather_load<mode><v_int_container>_sxtw"
	     UNSPEC_PRED_X)
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_<Vesize>")
+	   (match_operand:SVE_2 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1588,6 +1621,7 @@ (define_insn "*mask_gather_load<mode><v_int_container>_uxtw"
	     (match_operand:VNx2DI 6 "aarch64_sve_uxtw_immediate"))
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_<Vesize>")
+	   (match_operand:SVE_2 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1624,6 +1658,7 @@ (define_insn_and_rewrite "@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>"
		(match_operand:VNx4SI 2 "register_operand")
		(match_operand:DI 3 "const_int_operand")
		(match_operand:DI 4 "aarch64_gather_scale_operand_<SVE_4BHI:Vesize>")
+		(match_operand:SVE_4BHI 7 "aarch64_maskload_else_operand")
		(mem:BLK (scratch))]
	       UNSPEC_LD1_GATHER))]
	  UNSPEC_PRED_X))]
@@ -1663,6 +1698,7 @@ (define_insn_and_rewrite "@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>"
		(match_operand:VNx2DI 2 "register_operand")
		(match_operand:DI 3 "const_int_operand")
		(match_operand:DI 4 "aarch64_gather_scale_operand_<SVE_2BHSI:Vesize>")
+		(match_operand:SVE_2BHSI 7 "aarch64_maskload_else_operand")
		(mem:BLK (scratch))]
	       UNSPEC_LD1_GATHER))]
	  UNSPEC_PRED_X))]
@@ -1701,6 +1737,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked"
		  UNSPEC_PRED_X)
		(match_operand:DI 3 "const_int_operand")
		(match_operand:DI 4 "aarch64_gather_scale_operand_<SVE_2BHSI:Vesize>")
+		(match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand")
		(mem:BLK (scratch))]
	       UNSPEC_LD1_GATHER))]
	  UNSPEC_PRED_X))]
@@ -1738,6 +1775,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_sxtw"
		  UNSPEC_PRED_X)
		(match_operand:DI 3 "const_int_operand")
		(match_operand:DI 4 "aarch64_gather_scale_operand_<SVE_2BHSI:Vesize>")
+		(match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand")
		(mem:BLK (scratch))]
	       UNSPEC_LD1_GATHER))]
	  UNSPEC_PRED_X))]
@@ -1772,6 +1810,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_uxtw"
		  (match_operand:VNx2DI 6 "aarch64_sve_uxtw_immediate"))
		(match_operand:DI 3 "const_int_operand")
		(match_operand:DI 4 "aarch64_gather_scale_operand_<SVE_2BHSI:Vesize>")
+		(match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand")
		(mem:BLK (scratch))]
	       UNSPEC_LD1_GATHER))]
	  UNSPEC_PRED_X))]
diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 5f2697c3179..22e8632af80 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -138,7 +138,8 @@ (define_insn "@aarch64_<optab><mode>"
   [(set (match_operand:SVE_FULLx24 0 "aligned_register_operand" "=Uw<vector_count>")
	(unspec:SVE_FULLx24
	  [(match_operand:VNx16BI 2 "register_operand" "Uph")
-	   (match_operand:SVE_FULLx24 1 "memory_operand" "m")]
+	   (match_operand:SVE_FULLx24 1 "memory_operand" "m")
+	   (match_operand:SVE_FULLx24 3 "aarch64_maskload_else_operand")]
	  LD1_COUNT))]
   "TARGET_STREAMING_SME2"
   "<optab><Vesize>\t%0, %K2/z, %1"
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 8f3aab2272c..744f36ff67d 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -1069,3 +1069,7 @@ (define_predicate "aarch64_granule16_simm9"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), -4096, 4080)
		    && !(INTVAL (op) & 0xf)")))
+
+(define_predicate "aarch64_maskload_else_operand"
+  (and (match_code "const_int,const_vector")
+       (match_test "op == CONST0_RTX (GET_MODE (op))")))
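
FWIW, zero is also what the instructions deliver anyway: all the
patterns touched here use zeroing predication, e.g. (illustrative asm,
not part of the patch):

    ld1w    z0.s, p0/z, [x0]    // /z: inactive elements of z0.s read as zero

so aarch64_maskload_else_operand only needs to accept constant zero.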