From patchwork Mon Jun 10 13:31:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 1945870
Date: Mon, 10 Jun 2024 15:31:37 +0200
From: Robin Dapp
To: gcc-patches, "richard.sandiford"
Cc: rdapp.gcc@gmail.com
Subject: [PATCH v2] vect: Merge loop mask and cond_op mask in fold-left
 reduction [PR115382].
List-Id: Gcc-patches mailing list

> Actually, as Richard mentioned in the PR, it would probably be better
> to use prepare_vec_mask instead.  It should work in this context too
> and would avoid redundant double masking.

Attached is v2 that uses prepare_vec_mask.

Regtested on riscv64 and armv8.8-a+sve via qemu.
Bootstrap and regtest running on x86 and aarch64.

Regards
 Robin

Currently we discard the cond-op mask when the loop is fully masked
which causes wrong code in
gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
when compiled with
-O3 -march=cascadelake --param vect-partial-vector-usage=2.

This patch ANDs both masks.

gcc/ChangeLog:

        PR tree-optimization/115382

        * tree-vect-loop.cc (vectorize_fold_left_reduction): Merge loop
        mask and cond-op mask.
---
 gcc/tree-vect-loop.cc  | 10 +++++++++-
 gcc/tree-vect-stmts.cc |  2 +-
 gcc/tree-vectorizer.h  |  2 ++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 028692614bb..c9b037b8daf 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7215,7 +7215,15 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
       tree len = NULL_TREE;
       tree bias = NULL_TREE;
       if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
-        mask = vect_get_loop_mask (loop_vinfo, gsi, masks, vec_num, vectype_in, i);
+        {
+          tree loop_mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
+                                               vec_num, vectype_in, i);
+          if (is_cond_op)
+            mask = prepare_vec_mask (loop_vinfo, TREE_TYPE (loop_mask),
+                                     loop_mask, vec_opmask[i], gsi);
+          else
+            mask = loop_mask;
+        }
       else if (is_cond_op)
         mask = vec_opmask[i];
       if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5098b7fab6a..124a3462753 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1643,7 +1643,7 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype,
    MASK_TYPE is the type of both masks.  If new statements are needed,
    insert them before GSI.  */
 
-static tree
+tree
 prepare_vec_mask (loop_vec_info loop_vinfo, tree mask_type, tree loop_mask,
                   tree vec_mask, gimple_stmt_iterator *gsi)
 {
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 97ec9c341e7..1f87c6c8ca2 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2508,6 +2508,8 @@ extern void vect_free_slp_tree (slp_tree);
 extern bool compatible_calls_p (gcall *, gcall *);
 extern int vect_slp_child_index_for_operand (const gimple *, int op, bool);
 
+extern tree prepare_vec_mask (loop_vec_info, tree, tree, tree, gimple_stmt_iterator *);
+
 /* In tree-vect-patterns.cc.  */
 extern void vect_mark_pattern_stmts (vec_info *, stmt_vec_info, gimple *,
                                      tree);