X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 1999205
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com
Subject: [PATCH v2 3/8] tree-ifcvt: Enforce zero else value after maskload.
Date: Fri, 18 Oct 2024 16:22:15 +0200
Message-ID: <20241018142220.173482-4-rdapp@ventanamicro.com>
In-Reply-To: <20241018142220.173482-1-rdapp@ventanamicro.com>
References: <20241018142220.173482-1-rdapp@ventanamicro.com>

When predicating a load we implicitly assume that the else value is
zero.  This matters when the loaded value is padded (e.g. a Bool):
we must ensure that the padding bytes are zero on targets that don't
implicitly zero inactive elements.

To formalize this, the patch queries the target for its supported else
operand and uses that for the maskload call.  Subsequently, if the else
operand is nonzero, a cond_expr enforcing a zero else value is emitted.

gcc/ChangeLog:

        * tree-if-conv.cc (predicate_load_or_store): Enforce zero else value
        for padded types.
        (predicate_statements): Use sequence instead of statement.
---
 gcc/tree-if-conv.cc | 112 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 94 insertions(+), 18 deletions(-)

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 90c754a4814..9623426e1e1 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -2531,12 +2531,15 @@ mask_exists (int size, const vec &vec)

 /* Helper function for predicate_statements.  STMT is a memory read or
    write and it needs to be predicated by MASK.  Return a statement
-   that does so.  */
+   that does so.  SSA_NAMES is the set of SSA names defined earlier in
+   STMT's block.  */

-static gimple *
-predicate_load_or_store (gimple_stmt_iterator *gsi, gassign *stmt, tree mask)
+static gimple_seq
+predicate_load_or_store (gimple_stmt_iterator *gsi, gassign *stmt, tree mask,
+                         hash_set<tree_ssa_name_hash> *ssa_names)
 {
-  gcall *new_stmt;
+  gimple_seq stmts = NULL;
+  gcall *call_stmt;

   tree lhs = gimple_assign_lhs (stmt);
   tree rhs = gimple_assign_rhs1 (stmt);
@@ -2552,21 +2555,88 @@ predicate_load_or_store (gimple_stmt_iterator *gsi, gassign *stmt, tree mask)
                                    ref);
   if (TREE_CODE (lhs) == SSA_NAME)
     {
-      new_stmt
-        = gimple_build_call_internal (IFN_MASK_LOAD, 3, addr,
-                                      ptr, mask);
-      gimple_call_set_lhs (new_stmt, lhs);
-      gimple_set_vuse (new_stmt, gimple_vuse (stmt));
+      /* Get the preferred vector mode and its corresponding mask for the
+         masked load.  We need this to query the target's supported else
+         operands.  */
+      machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
+      scalar_mode smode = as_a <scalar_mode> (mode);
+
+      machine_mode vmode = targetm.vectorize.preferred_simd_mode (smode);
+      machine_mode mask_mode
+        = targetm.vectorize.get_mask_mode (vmode).require ();
+
+      auto_vec<int> elsvals;
+      internal_fn ifn;
+      bool have_masked_load
+        = target_supports_mask_load_store_p (vmode, mask_mode, true, &ifn,
+                                             &elsvals);
+
+      /* We might need to explicitly zero inactive elements if there are
+         padding bits in the type that might leak otherwise.
+         Refer to PR115336.  */
+      bool need_zero
+        = TYPE_PRECISION (TREE_TYPE (lhs)) < GET_MODE_PRECISION (smode);
+
+      int elsval;
+      bool implicit_zero = false;
+      if (have_masked_load)
+        {
+          gcc_assert (elsvals.length ());
+
+          /* But not if the target already provides implicit zeroing of
+             inactive elements.  */
+          implicit_zero = elsvals.contains (MASK_LOAD_ELSE_ZERO);
+
+          /* For now, just use the first else value if zero is unsupported.  */
+          elsval = implicit_zero ? MASK_LOAD_ELSE_ZERO : *elsvals.begin ();
+        }
+      else
+        {
+          /* We cannot vectorize this either way so just use a zero even
+             if it is unsupported.  */
+          elsval = MASK_LOAD_ELSE_ZERO;
+        }
+
+      tree els = vect_get_mask_load_else (elsval, TREE_TYPE (lhs));
+
+      call_stmt
+        = gimple_build_call_internal (IFN_MASK_LOAD, 4, addr,
+                                      ptr, mask, els);
+
+      /* Build the load call and, if the else value is nonzero,
+         a COND_EXPR that enforces it.  */
+      tree loadlhs;
+      if (!need_zero || implicit_zero)
+        gimple_call_set_lhs (call_stmt, gimple_get_lhs (stmt));
+      else
+        {
+          loadlhs = make_temp_ssa_name (TREE_TYPE (lhs), NULL, "_ifc_");
+          ssa_names->add (loadlhs);
+          gimple_call_set_lhs (call_stmt, loadlhs);
+        }
+      gimple_set_vuse (call_stmt, gimple_vuse (stmt));
+      gimple_seq_add_stmt (&stmts, call_stmt);
+
+      if (need_zero && !implicit_zero)
+        {
+          tree cond_rhs
+            = fold_build_cond_expr (TREE_TYPE (loadlhs), mask, loadlhs,
+                                    build_zero_cst (TREE_TYPE (loadlhs)));
+          gassign *cond_stmt
+            = gimple_build_assign (gimple_get_lhs (stmt), cond_rhs);
+          gimple_seq_add_stmt (&stmts, cond_stmt);
+        }
     }
   else
     {
-      new_stmt
+      call_stmt
         = gimple_build_call_internal (IFN_MASK_STORE, 4, addr, ptr,
                                       mask, rhs);
-      gimple_move_vops (new_stmt, stmt);
+      gimple_move_vops (call_stmt, stmt);
+      gimple_seq_add_stmt (&stmts, call_stmt);
     }
-  gimple_call_set_nothrow (new_stmt, true);
-  return new_stmt;
+  gimple_call_set_nothrow (call_stmt, true);
+  return stmts;
 }

 /* STMT uses OP_LHS.  Check whether it is equivalent to:
@@ -2836,7 +2906,6 @@ predicate_statements (loop_p loop)
             {
               tree lhs = gimple_assign_lhs (stmt);
               tree mask;
-              gimple *new_stmt;
               gimple_seq stmts = NULL;
               machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
               /* We checked before setting GF_PLF_2 that an equivalent
@@ -2870,11 +2939,18 @@ predicate_statements (loop_p loop)
                   vect_masks.safe_push (mask);
                 }
               if (gimple_assign_single_p (stmt))
-                new_stmt = predicate_load_or_store (&gsi, stmt, mask);
-              else
-                new_stmt = predicate_rhs_code (stmt, mask, cond, &ssa_names);
+                {
+                  gimple_seq call_seq
+                    = predicate_load_or_store (&gsi, stmt, mask, &ssa_names);

-              gsi_replace (&gsi, new_stmt, true);
+                  gsi_replace_with_seq (&gsi, call_seq, true);
+                }
+              else
+                {
+                  gimple *new_stmt;
+                  new_stmt = predicate_rhs_code (stmt, mask, cond, &ssa_names);
+                  gsi_replace (&gsi, new_stmt, true);
+                }
             }
           else if (((lhs = gimple_assign_lhs (stmt)), true)
                    && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))