From patchwork Fri Oct 18 14:22:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 1999196
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com,
 jeffreyalaw@gmail.com, ams@baylibre.com
Subject: [PATCH v2 1/8] docs: Document maskload else operand and behavior.
Date: Fri, 18 Oct 2024 16:22:13 +0200
Message-ID: <20241018142220.173482-2-rdapp@ventanamicro.com>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241018142220.173482-1-rdapp@ventanamicro.com>
References: <20241018142220.173482-1-rdapp@ventanamicro.com>

This patch amends the documentation for masked loads (maskload,
vec_mask_load_lanes, and mask_gather_load as well as their len
counterparts) with an else operand.

gcc/ChangeLog:

	* doc/md.texi: Document masked load else operand.
---
 gcc/doc/md.texi | 63 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 22 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 603f74a78c0..632b036b36c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5017,8 +5017,10 @@ This pattern is not allowed to @code{FAIL}.
 @item @samp{vec_mask_load_lanes@var{m}@var{n}}
 Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional
 mask operand (operand 2) that specifies which elements of the destination
-vectors should be loaded. Other elements of the destination
-vectors are set to zero. The operation is equivalent to:
+vectors should be loaded. Other elements of the destination vectors are
+taken from operand 3, which is an else operand similar to the one in
+@code{maskload}.
+The operation is equivalent to:
 
 @smallexample
 int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
@@ -5028,7 +5030,7 @@ for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)
       operand0[i][j] = operand1[j * c + i];
   else
     for (i = 0; i < c; i++)
-      operand0[i][j] = 0;
+      operand0[i][j] = operand3[j];
 @end smallexample
 
 This pattern is not allowed to @code{FAIL}.
@@ -5036,16 +5038,20 @@ This pattern is not allowed to @code{FAIL}.
 @cindex @code{vec_mask_len_load_lanes@var{m}@var{n}} instruction pattern
 @item @samp{vec_mask_len_load_lanes@var{m}@var{n}}
 Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional
-mask operand (operand 2), length operand (operand 3) as well as bias operand (operand 4)
-that specifies which elements of the destination vectors should be loaded.
-Other elements of the destination vectors are undefined. The operation is equivalent to:
+mask operand (operand 2), length operand (operand 4) as well as bias operand
+(operand 5) that specifies which elements of the destination vectors should be
+loaded. Other elements of the destination vectors are taken from operand 3,
+which is an else operand similar to the one in @code{maskload}.
+The operation is equivalent to:
 
 @smallexample
 int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
-for (j = 0; j < operand3 + operand4; j++)
-  if (operand2[j])
-    for (i = 0; i < c; i++)
+for (j = 0; j < operand4 + operand5; j++)
+  for (i = 0; i < c; i++)
+    if (operand2[j])
       operand0[i][j] = operand1[j * c + i];
+    else
+      operand0[i][j] = operand3[j];
 @end smallexample
 
 This pattern is not allowed to @code{FAIL}.
@@ -5125,18 +5131,25 @@ address width.
 @cindex @code{mask_gather_load@var{m}@var{n}} instruction pattern
 @item @samp{mask_gather_load@var{m}@var{n}}
 Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand as
-operand 5. Bit @var{i} of the mask is set if element @var{i}
+operand 5.
+Other elements of the destination vectors are taken from operand 6,
+which is an else operand similar to the one in @code{maskload}.
+Bit @var{i} of the mask is set if element @var{i}
 of the result should be loaded from memory and clear if element @var{i}
-of the result should be set to zero.
+of the result should be set to operand 6.
 
 @cindex @code{mask_len_gather_load@var{m}@var{n}} instruction pattern
 @item @samp{mask_len_gather_load@var{m}@var{n}}
-Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand (operand 5),
-a len operand (operand 6) as well as a bias operand (operand 7). Similar to mask_len_load,
-the instruction loads at most (operand 6 + operand 7) elements from memory.
+Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand
+(operand 5) and an else operand (operand 6) as well as a len operand
+(operand 7) and a bias operand (operand 8).
+
+Similar to mask_len_load the instruction loads at
+most (operand 7 + operand 8) elements from memory.
 Bit @var{i} of the mask is set if element @var{i} of the result should
-be loaded from memory and clear if element @var{i} of the result should be undefined.
-Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
+be loaded from memory and clear if element @var{i} of the result should
+be set to element @var{i} of operand 6.
+Mask elements @var{i} with @var{i} > (operand 7 + operand 8) are ignored.
 
 @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
 @item @samp{scatter_store@var{m}@var{n}}
@@ -5368,8 +5381,13 @@ Operands 4 and 5 have a target-dependent scalar integer mode.
 @cindex @code{maskload@var{m}@var{n}} instruction pattern
 @item @samp{maskload@var{m}@var{n}}
 Perform a masked load of vector from memory operand 1 of mode @var{m}
-into register operand 0. Mask is provided in register operand 2 of
-mode @var{n}.
+into register operand 0. The mask is provided in register operand 2 of
+mode @var{n}. Operand 3 (the "else value") is of mode @var{m} and
+specifies which value is loaded when the mask is unset.
+The predicate of operand 3 must only accept the else values that the target
+actually supports. Currently three values are attempted, zero, -1, and
+undefined. GCC handles an else value of zero more efficiently than -1 or
+undefined.
 
 This pattern is not allowed to @code{FAIL}.
 
@@ -5435,15 +5453,16 @@ Operands 0 and 1 have mode @var{m}, which must be a vector mode. Operand 3
 has whichever integer mode the target prefers. A mask is specified in
 operand 2 which must be of type @var{n}. The mask has lower precedence than
 the length and is itself subject to length masking,
-i.e. only mask indices < (operand 3 + operand 4) are used.
+i.e. only mask indices < (operand 4 + operand 5) are used.
+Operand 3 is an else operand similar to the one in @code{maskload}.
 Operand 4 conceptually has mode @code{QI}.
 
-Operand 2 can be a variable or a constant amount. Operand 4 specifies a
+Operand 4 can be a variable or a constant amount. Operand 5 specifies a
 constant bias: it is either a constant 0 or a constant -1. The predicate on
-operand 4 must only accept the bias values that the target actually supports.
+operand 5 must only accept the bias values that the target actually supports.
 GCC handles a bias of 0 more efficiently than a bias of -1.
 
-If (operand 2 + operand 4) exceeds the number of elements in mode
+If (operand 4 + operand 5) exceeds the number of elements in mode
 @var{m}, the behavior is undefined.
 
 If the target prefers the length to be measured in bytes