From: Robin Dapp
To: gcc-patches
Cc: rdapp.gcc@gmail.com, "richard.sandiford", Richard Biener
Date: Fri, 5 Jul 2024 13:45:49 +0200
Subject: [RFC] tree-if-conv: Handle nonzero masked elements [PR115336].
Message-ID: <7ea24dfc-34dd-4931-8614-6fd0ef86972a@gmail.com>
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 1957301

Hi,

in PR115336 we have the following

  vect_patt_391 = .MASK_LEN_GATHER_LOAD (_470, vect__59, 1, { 0, ... },
					  { 0, ... }, _482, 0);
  vect_iftmp.44 = vect_patt_391 | { 1, ... };
  .MASK_LEN_STORE (vectp_f.45, 8B, { -1, ...
		   }, _482, 0, vect_iftmp.44);

which assumes that a masked load sets the masked-out elements to zero.
RVV does not guarantee this and, unfortunately, leaves it to the
hardware implementation to do basically anything it wants (even keep
the previous value).  In the PR this leads to us reusing a previous
vector register whose stale, nonzero values cause an invalid result.

Based on Richard Sandiford's feedback I started with the attached
patch - it's more an RFC in its current shape and obviously not tested
exhaustively.

The idea is:
 - In ifcvt emit a VCOND_MASK (mask, load_result, preferred_else_value)
   after a MASK_LOAD if the target requires it.
 - Elide the VCOND_MASK when there is a COND_OP with a replacing else
   value later.

Is this, generally, reasonable or is there a better way?

My initial idea was to add an else value to MASK_LOAD.  Richard's
concern was, though, that this might make nonzero else values
appear inexpensive when they are actually not.

Even though I, mechanically, added match.pd patterns to catch
the most common cases (already at a point where a separate function,
maybe in gimple-match-exports, would make more sense), there is
still significant code-quality fallout.
The regressing cases are often of a form where the VCOND_MASK is
not just a conversion away but rather hidden behind some unmasked
operation.  I'm not sure if we could ever recognize everything that
way without descending very deep.

Regards
 Robin

gcc/ChangeLog:

	PR middle-end/115336

	* config/riscv/riscv.cc (riscv_preferred_else_value): Add
	MASK_LOAD.
	* config/riscv/riscv.h (TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO):
	Set to true.
	* defaults.h (TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO): New.
	* doc/tm.texi: Document.
	* doc/tm.texi.in: Document.
	* match.pd: Add patterns to allow replacing the else value of a
	VEC_COND.
	* tree-if-conv.cc (predicate_load_or_store): Emit a VEC_COND
	after a MASK_LOAD if the target does not guarantee zeroing.
	(predicate_statements): Add temporary lhs argument.
	* tree-ssa-math-opts.cc (convert_mult_to_fma_1): Re-fold fold result.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr115336.c: New test.
---
 gcc/config/riscv/riscv.cc                     |   2 +
 gcc/config/riscv/riscv.h                      |   2 +
 gcc/defaults.h                                |   4 +
 gcc/doc/tm.texi                               |   5 +
 gcc/doc/tm.texi.in                            |   5 +
 gcc/match.pd                                  | 125 +++++++++++++++++-
 .../gcc.target/riscv/rvv/autovec/pr115336.c   |  20 +++
 gcc/tree-if-conv.cc                           |  57 +++++++-
 gcc/tree-ssa-math-opts.cc                     |   8 ++
 9 files changed, 217 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index c17141d909a..e10bc6824b9 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -11481,6 +11481,8 @@ riscv_preferred_else_value (unsigned ifn, tree vectype, unsigned int nops,
 {
   if (riscv_v_ext_mode_p (TYPE_MODE (vectype)))
     return get_or_create_ssa_default_def (cfun, create_tmp_var (vectype));
+  else if (ifn == IFN_MASK_LOAD)
+    return get_or_create_ssa_default_def (cfun, create_tmp_var (vectype));

   return default_preferred_else_value (ifn, vectype, nops, ops);
 }
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 57910eecd3e..dc15eb5e60f 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1264,4 +1264,6 @@ extern void riscv_remove_unneeded_save_restore_calls (void);
 /* Check TLS Descriptors mechanism is selected.  */
 #define TARGET_TLSDESC (riscv_tls_dialect == TLS_DESCRIPTORS)

+#define TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO 1
+
 #endif /* ! GCC_RISCV_H */
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 92f3e07f742..6ffbdaea229 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1457,6 +1457,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define DWARF_VERSION_DEFAULT 5
 #endif

+#ifndef TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO
+#define TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO 0
+#endif
+
 #ifndef USED_FOR_TARGET
 /* Done this way to keep gengtype happy.  */
 #if BITS_PER_UNIT == 8
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index be5543b72f8..2fdebe7fd21 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12683,6 +12683,11 @@ maintainer is familiar with.

 @end defmac

+@defmac TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO
+Bla
+
+@end defmac
+
 @deftypefn {Target Hook} bool TARGET_HAVE_SPECULATION_SAFE_VALUE (bool @var{active})
 This hook is used to determine the level of target support for
 @code{__builtin_speculation_safe_value}.  If called with an argument
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 87a7f895174..276c38325dc 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8097,6 +8097,11 @@ maintainer is familiar with.

 @end defmac

+@defmac TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO
+Bla
+
+@end defmac
+
 @hook TARGET_HAVE_SPECULATION_SAFE_VALUE

 @hook TARGET_SPECULATION_SAFE_VALUE
diff --git a/gcc/match.pd b/gcc/match.pd
index 3d0689c9312..2fdc14ef56f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -9744,7 +9744,22 @@ and,
 #endif

 /* Detect cases in which a VEC_COND_EXPR effectively replaces the
-   "else" value of an IFN_COND_*.  */
+   "else" value of an IFN_COND_* as well as when the IFN_COND_*
+   replaces the else of the VEC_COND_EXPR.  */
+(for cond_op (COND_UNARY)
+ (simplify
+  (cond_op @0 (view_convert? (vec_cond @0 @1 @2)) @3)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 (view_convert:op_type @1) @3)))))
+
+(for cond_len_op (COND_LEN_UNARY)
+ (simplify
+  (cond_len_op @0 (view_convert? (vec_cond @0 @1 @2)) @3 @4 @5)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 (view_convert:op_type @1) @3 @4 @5)))))
+
 (for cond_op (COND_BINARY)
  (simplify
   (vec_cond @0 (view_convert? (cond_op @0 @1 @2 @3)) @4)
@@ -9756,7 +9771,27 @@ and,
    (with { tree op_type = TREE_TYPE (@5); }
     (if (inverse_conditions_p (@0, @2)
          && element_precision (type) == element_precision (op_type))
-     (view_convert (cond_op @2 @3 @4 (view_convert:op_type @1)))))))
+     (view_convert (cond_op @2 @3 @4 (view_convert:op_type @1))))))
+ (simplify
+  (cond_op @0 (view_convert? (vec_cond @0 @1 @2)) @3 @4)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 (view_convert:op_type @1) @3 @4))))
+ (simplify
+  (cond_op @0 @1 (view_convert? (vec_cond @0 @2 @3)) @4)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 @1 (view_convert:op_type @2) @4))))
+ (simplify
+  (cond_op @0 (convert? (vec_cond @0 @1 @2)) @3 @4)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 (convert:op_type @1) @3 @4))))
+ (simplify
+  (cond_op @0 @1 (convert? (vec_cond @0 @2 @3)) @4)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 @1 (convert:op_type @2) @4)))))

 /* Same for ternary operations.  */
 (for cond_op (COND_TERNARY)
@@ -9770,7 +9805,37 @@ and,
    (with { tree op_type = TREE_TYPE (@6); }
     (if (inverse_conditions_p (@0, @2)
          && element_precision (type) == element_precision (op_type))
-     (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1)))))))
+     (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1))))))
+ (simplify
+  (cond_op @0 (view_convert? (vec_cond @0 @1 @2)) @3 @4 @5)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 (view_convert:op_type @1) @3 @4 @5))))
+ (simplify
+  (cond_op @0 @1 (view_convert? (vec_cond @0 @2 @3)) @4 @5)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 @1 (view_convert:op_type @2) @4 @5))))
+ (simplify
+  (cond_op @0 @1 @2 (view_convert? (vec_cond @0 @3 @4)) @5)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 @1 @2 (view_convert:op_type @3) @5))))
+ (simplify
+  (cond_op @0 (convert? (vec_cond @0 @1 @2)) @3 @4 @5)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 (convert:op_type @1) @3 @4 @5))))
+ (simplify
+  (cond_op @0 @1 (convert? (vec_cond @0 @2 @3)) @4 @5)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 @1 (convert:op_type @2) @4 @5))))
+ (simplify
+  (cond_op @0 @1 @2 (convert? (vec_cond @0 @3 @4)) @5)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_op @0 @1 @2 (convert:op_type @3) @5)))))

 /* Detect cases in which a VEC_COND_EXPR effectively replaces the
    "else" value of an IFN_COND_LEN_*.  */
@@ -9785,7 +9850,27 @@ and,
    (with { tree op_type = TREE_TYPE (@5); }
     (if (inverse_conditions_p (@0, @2)
          && element_precision (type) == element_precision (op_type))
-     (view_convert (cond_len_op @2 @3 @4 (view_convert:op_type @1) @6 @7))))))
+     (view_convert (cond_len_op @2 @3 @4 (view_convert:op_type @1) @6 @7)))))
+ (simplify
+  (cond_len_op @0 (view_convert? (vec_cond @0 @1 @2)) @3 @4 @5 @6)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 (view_convert:op_type @1) @3 @4 @5 @6))))
+ (simplify
+  (cond_len_op @0 @1 (view_convert? (vec_cond @0 @2 @3)) @4 @5 @6)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 @1 (view_convert:op_type @2) @4 @5 @6))))
+ (simplify
+  (cond_len_op @0 (convert? (vec_cond @0 @1 @2)) @3 @4 @5 @6)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 (convert:op_type @1) @3 @4 @5 @6))))
+ (simplify
+  (cond_len_op @0 @1 (convert? (vec_cond @0 @2 @3)) @4 @5 @6)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 @1 (convert:op_type @2) @4 @5 @6)))))

 /* Same for ternary operations.  */
 (for cond_len_op (COND_LEN_TERNARY)
@@ -9799,7 +9884,37 @@ and,
    (with { tree op_type = TREE_TYPE (@6); }
     (if (inverse_conditions_p (@0, @2)
          && element_precision (type) == element_precision (op_type))
-     (view_convert (cond_len_op @2 @3 @4 @5 (view_convert:op_type @1) @7 @8))))))
+     (view_convert (cond_len_op @2 @3 @4 @5 (view_convert:op_type @1) @7 @8)))))
+ (simplify
+  (cond_len_op @0 (view_convert? (vec_cond @0 @1 @2)) @3 @4 @5 @6 @7)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 (view_convert:op_type @1) @3 @4 @5 @6 @7))))
+ (simplify
+  (cond_len_op @0 @1 (view_convert? (vec_cond @0 @2 @3)) @4 @5 @6 @7)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 @1 (view_convert:op_type @2) @4 @5 @6 @7))))
+ (simplify
+  (cond_len_op @0 @1 @2 (view_convert? (vec_cond @0 @3 @4)) @5 @6 @7)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 @1 @2 (view_convert:op_type @3) @5 @6 @7))))
+ (simplify
+  (cond_len_op @0 (convert? (vec_cond @0 @1 @2)) @3 @4 @5 @6 @7)
+   (with { tree op_type = TREE_TYPE (@3); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 (convert:op_type @1) @3 @4 @5 @6 @7))))
+ (simplify
+  (cond_len_op @0 @1 (convert? (vec_cond @0 @2 @3)) @4 @5 @6 @7)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 @1 (convert:op_type @2) @4 @5 @6 @7))))
+ (simplify
+  (cond_len_op @0 @1 @2 (convert? (vec_cond @0 @3 @4)) @5 @6 @7)
+   (with { tree op_type = TREE_TYPE (@1); }
+    (if (element_precision (type) == element_precision (op_type))
+     (cond_len_op @0 @1 @2 (convert:op_type @3) @5 @6 @7)))))

 /* Detect simplication for a conditional reduction where

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c
new file mode 100644
index 00000000000..29e55705a7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options { -O3 -march=rv64gcv_zvl256b -mabi=lp64d } } */
+
+short d[19];
+_Bool e[100][19][19];
+_Bool f[10000];
+
+int main()
+{
+  for (long g = 0; g < 19; ++g)
+    d[g] = 3;
+  _Bool(*h)[19][19] = e;
+  for (short g = 0; g < 9; g++)
+    for (int i = 4; i < 16; i += 3)
+      f[i * 9 + g] = d[i] ? d[i] : h[g][i][2];
+  for (long i = 120; i < 122; ++i)
+    __builtin_printf("%d\n", f[i]);
+}
+
+/* { dg-final { scan-assembler-times {vmv.v.i\s*v[0-9]+,0} 3 } } */
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 57992b6deca..c0c7013c817 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -124,6 +124,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vectorizer.h"
 #include "tree-eh.h"
 #include "cgraph.h"
+#include "tree-dfa.h"

 /* For lang_hooks.types.type_for_mode.  */
 #include "langhooks.h"
@@ -2453,11 +2454,14 @@ mask_exists (int size, const vec &vec)
    that does so.  */

 static gimple *
-predicate_load_or_store (gimple_stmt_iterator *gsi, gassign *stmt, tree mask)
+predicate_load_or_store (gimple_stmt_iterator *gsi, gassign *stmt, tree mask,
+                         tree override_lhs = NULL_TREE)
 {
   gcall *new_stmt;

-  tree lhs = gimple_assign_lhs (stmt);
+  tree lhs = override_lhs;
+  if (!lhs)
+    lhs = gimple_assign_lhs (stmt);
   tree rhs = gimple_assign_rhs1 (stmt);
   tree ref = TREE_CODE (lhs) == SSA_NAME ? rhs : lhs;
   mark_addressable (ref);
@@ -2789,11 +2793,52 @@ predicate_statements (loop_p loop)
                   vect_masks.safe_push (mask);
                 }
               if (gimple_assign_single_p (stmt))
-                new_stmt = predicate_load_or_store (&gsi, stmt, mask);
-              else
-                new_stmt = predicate_rhs_code (stmt, mask, cond, &ssa_names);
+                {
+                  bool target_has_else
+                    = TARGET_MASK_LOAD_MASKED_MAYBE_NONZERO;
+                  tree lhs = gimple_get_lhs (stmt);
+
+                  bool is_load = TREE_CODE (lhs) == SSA_NAME;
+
+                  gimple_seq stmts2 = NULL;
+                  tree tmplhs = is_load && target_has_else
+                    ? make_temp_ssa_name (TREE_TYPE (lhs), NULL,
+                                          "_ifc_") : lhs;
+                  gimple *call_stmt
+                    = predicate_load_or_store (&gsi, stmt, mask, tmplhs);
+
+                  gimple_seq_add_stmt (&stmts2, call_stmt);
+
+                  if (lhs != tmplhs)
+                    ssa_names.add (tmplhs);

-              gsi_replace (&gsi, new_stmt, true);
+                  if (is_load && target_has_else)
+                    {
+                      unsigned nops = gimple_call_num_args (call_stmt);
+                      tree *ops = XALLOCAVEC (tree, nops);
+
+                      for (unsigned i = 0; i < nops; i++)
+                        ops[i] = gimple_call_arg (call_stmt, i);
+
+                      tree els_operand
+                        = targetm.preferred_else_value (IFN_MASK_LOAD,
+                                                        TREE_TYPE (lhs),
+                                                        nops, ops);
+
+                      tree rhs
+                        = fold_build_cond_expr (TREE_TYPE (tmplhs),
+                                                mask, tmplhs, els_operand);
+                      gassign *cond_stmt
+                        = gimple_build_assign (gimple_get_lhs (stmt), rhs);
+                      gimple_seq_add_stmt (&stmts2, cond_stmt);
+                    }
+                  gsi_replace_with_seq (&gsi, stmts2, true);
+                }
+              else
+                {
+                  new_stmt = predicate_rhs_code (stmt, mask, cond, &ssa_names);
+                  gsi_replace (&gsi, new_stmt, true);
+                }
             }
           else if (((lhs = gimple_assign_lhs (stmt)), true)
                    && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 57085488722..65e6f91fe8b 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -3193,6 +3193,14 @@ convert_mult_to_fma_1 (tree mul_result, tree op1, tree op2)
       /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
          regardless of where the negation occurs.  */
       gimple *orig_stmt = gsi_stmt (gsi);
+      if (fold_stmt (&gsi, follow_all_ssa_edges))
+        {
+          if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
+            gcc_unreachable ();
+          update_stmt (gsi_stmt (gsi));
+        }
+      /* Fold the result again.  */
+      orig_stmt = gsi_stmt (gsi);
       if (fold_stmt (&gsi, follow_all_ssa_edges))
         {
           if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))