From patchwork Sat Nov 2 12:58:25 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 2005448
From: Robin Dapp <rdapp.gcc@gmail.com>
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com,
 jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 5/8] aarch64: Add masked-load else operands.
Date: Sat, 2 Nov 2024 13:58:25 +0100
Message-ID: <20241102125828.29183-6-rdapp.gcc@gmail.com>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

This adds zero else operands to masked loads and their intrinsics.
I needed to adjust more than initially thought because we rely on
combine for several instructions, and a change in a "base" pattern
needs to propagate to all of them.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve-builtins-base.cc: Add else handling.
	* config/aarch64/aarch64-sve-builtins.cc
	(function_expander::use_contiguous_load_insn): Ditto.
	* config/aarch64/aarch64-sve-builtins.h: Add else operand to
	contiguous load.
	* config/aarch64/aarch64-sve.md (@aarch64_load_): Split and
	add else operand.
	(@aarch64_load_): Ditto.
	(*aarch64_load__mov): Ditto.
	* config/aarch64/aarch64-sve2.md: Ditto.
	* config/aarch64/iterators.md: Remove unused iterators.
	* config/aarch64/predicates.md (aarch64_maskload_else_operand):
	Add zero else operand.
---
 .../aarch64/aarch64-sve-builtins-base.cc      | 24 +++++----
 gcc/config/aarch64/aarch64-sve-builtins.cc    | 12 ++++-
 gcc/config/aarch64/aarch64-sve-builtins.h     |  2 +-
 gcc/config/aarch64/aarch64-sve.md             | 52 ++++++++++++++++---
 gcc/config/aarch64/aarch64-sve2.md            |  3 +-
 gcc/config/aarch64/iterators.md               |  4 --
 gcc/config/aarch64/predicates.md              |  4 ++
 7 files changed, 77 insertions(+), 24 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index fe16d93adcd..d840f590202 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1523,11 +1523,12 @@ public:
     gimple_seq stmts = NULL;
     tree pred = f.convert_pred (stmts, vectype, 0);
     tree base = f.fold_contiguous_base (stmts, vectype);
+    tree els = build_zero_cst (vectype);
     gsi_insert_seq_before (f.gsi, stmts, GSI_SAME_STMT);
     tree cookie = f.load_store_cookie (TREE_TYPE (vectype));
-    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD, 3,
-						  base, cookie, pred);
+    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD, 4,
+						  base, cookie, pred, els);
     gimple_call_set_lhs (new_call, f.lhs);
     return new_call;
   }
@@ -1541,7 +1542,7 @@ public:
 	 e.vector_mode (0), e.gp_mode (0));
     else
       icode = code_for_aarch64 (UNSPEC_LD1_COUNT, e.tuple_mode (0));
-    return e.use_contiguous_load_insn (icode);
+    return e.use_contiguous_load_insn (icode, true);
   }
 };
@@ -1554,10 +1555,10 @@ public:
   rtx expand (function_expander &e) const override
   {
-    insn_code icode = code_for_aarch64_load (UNSPEC_LD1_SVE, extend_rtx_code (),
+    insn_code icode = code_for_aarch64_load (extend_rtx_code (),
 					     e.vector_mode (0),
 					     e.memory_vector_mode ());
-    return e.use_contiguous_load_insn (icode);
+    return e.use_contiguous_load_insn (icode, true);
   }
 };
@@ -1576,6 +1577,8 @@ public:
     e.prepare_gather_address_operands (1);
     /* Put the predicate last, as required by mask_gather_load_optab.  */
     e.rotate_inputs_left (0, 5);
+    /* Add the else operand.  */
+    e.args.quick_push (CONST0_RTX (e.vector_mode (0)));
     machine_mode mem_mode = e.memory_vector_mode ();
     machine_mode int_mode = aarch64_sve_int_mode (mem_mode);
     insn_code icode = convert_optab_handler (mask_gather_load_optab,
@@ -1599,6 +1602,8 @@ public:
     e.rotate_inputs_left (0, 5);
     /* Add a constant predicate for the extension rtx.  */
     e.args.quick_push (CONSTM1_RTX (VNx16BImode));
+    /* Add the else operand.  */
+    e.args.quick_push (CONST0_RTX (e.vector_mode (1)));
     insn_code icode = code_for_aarch64_gather_load (extend_rtx_code (),
 						    e.vector_mode (0),
 						    e.memory_vector_mode ());
@@ -1741,6 +1746,7 @@ public:
     /* Get the predicate and base pointer.  */
     gimple_seq stmts = NULL;
     tree pred = f.convert_pred (stmts, vectype, 0);
+    tree els = build_zero_cst (vectype);
     tree base = f.fold_contiguous_base (stmts, vectype);
     gsi_insert_seq_before (f.gsi, stmts, GSI_SAME_STMT);
@@ -1759,8 +1765,8 @@ public:
     /* Emit the load itself.  */
     tree cookie = f.load_store_cookie (TREE_TYPE (vectype));
-    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 3,
-						  base, cookie, pred);
+    gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 4,
+						  base, cookie, pred, els);
     gimple_call_set_lhs (new_call, lhs_array);
     gsi_insert_after (f.gsi, new_call, GSI_SAME_STMT);
@@ -1773,7 +1779,7 @@ public:
     machine_mode tuple_mode = e.result_mode ();
     insn_code icode = convert_optab_handler (vec_mask_load_lanes_optab,
 					     tuple_mode, e.vector_mode (0));
-    return e.use_contiguous_load_insn (icode);
+    return e.use_contiguous_load_insn (icode, true);
   }
 };
@@ -1844,7 +1850,7 @@ public:
 		       ? code_for_aarch64_ldnt1 (e.vector_mode (0))
 		       : code_for_aarch64 (UNSPEC_LDNT1_COUNT,
 					   e.tuple_mode (0)));
-    return e.use_contiguous_load_insn (icode);
+    return e.use_contiguous_load_insn (icode, true);
   }
 };
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index ef14f8cd39d..0db9a7e9dbe 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -4227,9 +4227,12 @@ function_expander::use_vcond_mask_insn (insn_code icode,
 /* Implement the call using instruction ICODE, which loads memory operand 1
    into register operand 0 under the control of predicate operand 2.
    Extending loads have a further predicate (operand 3) that nominally
-   controls the extension.  */
+   controls the extension.
+   HAS_ELSE is true if the pattern has an additional operand that specifies
+   the values of inactive lanes.  This exists to match the general maskload
+   interface and is always zero for AArch64.  */
 rtx
-function_expander::use_contiguous_load_insn (insn_code icode)
+function_expander::use_contiguous_load_insn (insn_code icode, bool has_else)
 {
   machine_mode mem_mode = memory_vector_mode ();
@@ -4238,6 +4241,11 @@ function_expander::use_contiguous_load_insn (insn_code icode)
   add_input_operand (icode, args[0]);
   if (GET_MODE_UNIT_BITSIZE (mem_mode) < type_suffix (0).element_bits)
     add_input_operand (icode, CONSTM1_RTX (VNx16BImode));
+
+  /* If we have an else operand, add it.  */
+  if (has_else)
+    add_input_operand (icode, CONST0_RTX (mem_mode));
+
   return generate_insn (icode);
 }
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index 4cdc0541bdc..1aa9caf84ba 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -695,7 +695,7 @@ public:
   rtx use_pred_x_insn (insn_code);
   rtx use_cond_insn (insn_code, unsigned int = DEFAULT_MERGE_ARGNO);
   rtx use_vcond_mask_insn (insn_code, unsigned int = DEFAULT_MERGE_ARGNO);
-  rtx use_contiguous_load_insn (insn_code);
+  rtx use_contiguous_load_insn (insn_code, bool = false);
   rtx use_contiguous_prefetch_insn (insn_code);
   rtx use_contiguous_store_insn (insn_code);
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index 06bd3e4bb2c..17cca97555c 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -1291,7 +1291,8 @@ (define_insn "maskload"
   [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
	(unspec:SVE_ALL
	  [(match_operand: 2 "register_operand" "Upl")
-	   (match_operand:SVE_ALL 1 "memory_operand" "m")]
+	   (match_operand:SVE_ALL 1 "memory_operand" "m")
+	   (match_operand:SVE_ALL 3 "aarch64_maskload_else_operand")]
	  UNSPEC_LD1_SVE))]
   "TARGET_SVE"
   "ld1\t%0., %2/z, %1"
@@ -1302,11 +1303,13 @@ (define_expand "vec_load_lanes"
   [(set (match_operand:SVE_STRUCT 0 "register_operand")
	(unspec:SVE_STRUCT
	  [(match_dup 2)
-	   (match_operand:SVE_STRUCT 1 "memory_operand")]
+	   (match_operand:SVE_STRUCT 1 "memory_operand")
+	   (match_dup 3)]
	  UNSPEC_LDN))]
   "TARGET_SVE"
   {
     operands[2] = aarch64_ptrue_reg (mode);
+    operands[3] = CONST0_RTX (mode);
   }
 )
@@ -1315,7 +1318,8 @@ (define_insn "vec_mask_load_lanes"
   [(set (match_operand:SVE_STRUCT 0 "register_operand" "=w")
	(unspec:SVE_STRUCT
	  [(match_operand: 2 "register_operand" "Upl")
-	   (match_operand:SVE_STRUCT 1 "memory_operand" "m")]
+	   (match_operand:SVE_STRUCT 1 "memory_operand" "m")
+	   (match_operand 3 "aarch64_maskload_else_operand")]
	  UNSPEC_LDN))]
   "TARGET_SVE"
   "ld\t%0, %2/z, %1"
@@ -1334,15 +1338,16 @@ (define_insn "vec_mask_load_lanes"
 ;; -------------------------------------------------------------------------
 ;; Predicated load and extend, with 8 elements per 128-bit block.
-(define_insn_and_rewrite "@aarch64_load_"
+(define_insn_and_rewrite "@aarch64_load_"
   [(set (match_operand:SVE_HSDI 0 "register_operand" "=w")
	(unspec:SVE_HSDI
	  [(match_operand: 3 "general_operand" "UplDnm")
	   (ANY_EXTEND:SVE_HSDI
	     (unspec:SVE_PARTIAL_I
	       [(match_operand: 2 "register_operand" "Upl")
-		(match_operand:SVE_PARTIAL_I 1 "memory_operand" "m")]
-	       SVE_PRED_LOAD))]
+		(match_operand:SVE_PARTIAL_I 1 "memory_operand" "m")
+		(match_operand:SVE_PARTIAL_I 4 "aarch64_maskload_else_operand")]
+	       UNSPEC_LD1_SVE))]
	  UNSPEC_PRED_X))]
   "TARGET_SVE && (~ & ) == 0"
   "ld1\t%0., %2/z, %1"
@@ -1352,6 +1357,26 @@ (define_insn_and_rewrite "@aarch64_load_
+(define_insn_and_rewrite "*aarch64_load__mov"
+  [(set (match_operand:SVE_HSDI 0 "register_operand" "=w")
+	(unspec:SVE_HSDI
+	  [(match_operand: 3 "general_operand" "UplDnm")
+	   (ANY_EXTEND:SVE_HSDI
+	     (unspec:SVE_PARTIAL_I
+	       [(match_operand: 2 "register_operand" "Upl")
+		(match_operand:SVE_PARTIAL_I 1 "memory_operand" "m")]
+	       UNSPEC_PRED_X))]
+	  UNSPEC_PRED_X))]
+  "TARGET_SVE && (~ & ) == 0"
+  "ld1\t%0., %2/z, %1"
+  "&& !CONSTANT_P (operands[3])"
+  {
+    operands[3] = CONSTM1_RTX (mode);
+  }
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- First-faulting contiguous loads
 ;; -------------------------------------------------------------------------
@@ -1433,7 +1458,8 @@ (define_insn "@aarch64_ldnt1"
   [(set (match_operand:SVE_FULL 0 "register_operand" "=w")
	(unspec:SVE_FULL
	  [(match_operand: 2 "register_operand" "Upl")
-	   (match_operand:SVE_FULL 1 "memory_operand" "m")]
+	   (match_operand:SVE_FULL 1 "memory_operand" "m")
+	   (match_operand:SVE_FULL 3 "aarch64_maskload_else_operand")]
	  UNSPEC_LDNT1_SVE))]
   "TARGET_SVE"
   "ldnt1\t%0., %2/z, %1"
@@ -1456,11 +1482,13 @@ (define_expand "gather_load"
	   (match_operand: 2 "register_operand")
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_dup 6)
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
   {
     operands[5] = aarch64_ptrue_reg (mode);
+    operands[6] = CONST0_RTX (mode);
   }
 )
@@ -1474,6 +1502,7 @@ (define_insn "mask_gather_load"
	   (match_operand:VNx4SI 2 "register_operand")
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_4 6 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1503,6 +1532,7 @@ (define_insn "mask_gather_load"
	   (match_operand:VNx2DI 2 "register_operand")
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2 6 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1531,6 +1561,7 @@ (define_insn_and_rewrite "*mask_gather_load_xtw_unpac
	     UNSPEC_PRED_X)
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1561,6 +1592,7 @@ (define_insn_and_rewrite "*mask_gather_load_sxtw"
	     UNSPEC_PRED_X)
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1588,6 +1620,7 @@ (define_insn "*mask_gather_load_uxtw"
	       (match_operand:VNx2DI 6 "aarch64_sve_uxtw_immediate"))
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
   "TARGET_SVE && TARGET_NON_STREAMING"
@@ -1624,6 +1657,7 @@ (define_insn_and_rewrite "@aarch64_gather_load_
	   (match_operand:VNx4SI 2 "register_operand")
	   (match_operand:DI 3 "const_int_operand")
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_4BHI 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
	 UNSPEC_PRED_X))]
@@ -1663,6 +1697,7 @@ (define_insn_and_rewrite "@aarch64_gather_load_
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2BHSI 7 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
	 UNSPEC_PRED_X))]
@@ -1701,6 +1736,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
	 UNSPEC_PRED_X))]
@@ -1738,6 +1774,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
	 UNSPEC_PRED_X))]
@@ -1772,6 +1809,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_
	   (match_operand:DI 4 "aarch64_gather_scale_operand_")
+	   (match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand")
	   (mem:BLK (scratch))]
	  UNSPEC_LD1_GATHER))]
	 UNSPEC_PRED_X))]
diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 5f2697c3179..22e8632af80 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -138,7 +138,8 @@ (define_insn "@aarch64_"
   [(set (match_operand:SVE_FULLx24 0 "aligned_register_operand" "=Uw")
	(unspec:SVE_FULLx24
	  [(match_operand:VNx16BI 2 "register_operand" "Uph")
-	   (match_operand:SVE_FULLx24 1 "memory_operand" "m")]
+	   (match_operand:SVE_FULLx24 1 "memory_operand" "m")
+	   (match_operand:SVE_FULLx24 3 "aarch64_maskload_else_operand")]
	  LD1_COUNT))]
   "TARGET_STREAMING_SME2"
   "\t%0, %K2/z, %1"
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 0bc98315bb6..6592b3df3b2 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -3224,10 +3224,6 @@ (define_int_iterator SVE_SHIFT_WIDE [UNSPEC_ASHIFT_WIDE
 (define_int_iterator SVE_LDFF1_LDNF1 [UNSPEC_LDFF1 UNSPEC_LDNF1])
-(define_int_iterator SVE_PRED_LOAD [UNSPEC_PRED_X UNSPEC_LD1_SVE])
-
-(define_int_attr pred_load [(UNSPEC_PRED_X "_x") (UNSPEC_LD1_SVE "")])
-
 (define_int_iterator LD1_COUNT [UNSPEC_LD1_COUNT UNSPEC_LDNT1_COUNT])
 (define_int_iterator ST1_COUNT [UNSPEC_ST1_COUNT UNSPEC_STNT1_COUNT])
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 6ad9a4bd8b9..26cfaed2402 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -1067,3 +1067,7 @@ (define_predicate "aarch64_granule16_simm9"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), -4096, 4080) && !(INTVAL (op) & 0xf)")))
+
+(define_predicate "aarch64_maskload_else_operand"
+  (and (match_code "const_int,const_vector")
+       (match_test "op == CONST0_RTX (GET_MODE (op))")))