From patchwork Fri Jul 19 17:43:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1962582
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: [PATCH] Treat boolean vector elements as 0/-1 [PR115406]
Date: Fri, 19 Jul 2024 18:43:23 +0100
List-Id: Gcc-patches mailing list

Previously we built vector boolean constants using 1 for true
elements and 0 for false elements.  This matches the predicates
produced by SVE's PTRUE instruction, but leads to a miscompilation
on AVX512, where all bits of a boolean element should be set.

One option for RTL would be to make this target-configurable.
But that isn't really possible at the tree level, where vectors
should work in a more target-independent way.  (There is currently
no way to create a "generic" packed boolean vector, but never say
never :))  And, if we were going to pick a generic behaviour,
it would make sense to use 0/-1 rather than 0/1, for consistency
with integer vectors.

Both behaviours should work with SVE on read, since SVE ignores
the upper bits in each predicate element.  And the choice shouldn't
make much difference for RTL, since all SVE predicate modes are
expressed as vectors of BI, rather than of multi-bit booleans.

I suspect there might be some fallout from this change on SVE.
But I think we should at least give it a go, and see whether any
fallout provides a strong counterargument against the approach.

Tested on aarch64-linux-gnu (configured with --with-cpu=neoverse-v1
for extra coverage) and x86_64-linux-gnu.  OK to install?

Richard


gcc/
	PR middle-end/115406
	* fold-const.cc (native_encode_vector_part): For vector booleans,
	check whether an element is nonzero and, if so, set all of the
	corresponding bits in the target image.
	* simplify-rtx.cc (native_encode_rtx): Likewise.

gcc/testsuite/
	PR middle-end/115406
	* gcc.dg/torture/pr115406.c: New test.
---
 gcc/fold-const.cc                       |  5 +++--
 gcc/simplify-rtx.cc                     |  3 ++-
 gcc/testsuite/gcc.dg/torture/pr115406.c | 18 ++++++++++++++++++
 3 files changed, 23 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr115406.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 710d697c021..a509b46b904 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8097,16 +8097,17 @@ native_encode_vector_part (const_tree expr, unsigned char *ptr, int len,
       unsigned int elts_per_byte = BITS_PER_UNIT / elt_bits;
       unsigned int first_elt = off * elts_per_byte;
       unsigned int extract_elts = extract_bytes * elts_per_byte;
+      unsigned int elt_mask = (1 << elt_bits) - 1;
       for (unsigned int i = 0; i < extract_elts; ++i)
         {
           tree elt = VECTOR_CST_ELT (expr, first_elt + i);
           if (TREE_CODE (elt) != INTEGER_CST)
             return 0;
 
-          if (ptr && wi::extract_uhwi (wi::to_wide (elt), 0, 1))
+          if (ptr && integer_nonzerop (elt))
             {
               unsigned int bit = i * elt_bits;
-              ptr[bit / BITS_PER_UNIT] |= 1 << (bit % BITS_PER_UNIT);
+              ptr[bit / BITS_PER_UNIT] |= elt_mask << (bit % BITS_PER_UNIT);
             }
         }
       return extract_bytes;
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 35ba54c6292..a49eefb34d4 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -7232,7 +7232,8 @@ native_encode_rtx (machine_mode mode, rtx x, vec &bytes,
           target_unit value = 0;
           for (unsigned int j = 0; j < BITS_PER_UNIT; j += elt_bits)
             {
-              value |= (INTVAL (CONST_VECTOR_ELT (x, elt)) & mask) << j;
+              if (INTVAL (CONST_VECTOR_ELT (x, elt)))
+                value |= mask << j;
               elt += 1;
             }
           bytes.quick_push (value);
diff --git a/gcc/testsuite/gcc.dg/torture/pr115406.c b/gcc/testsuite/gcc.dg/torture/pr115406.c
new file mode 100644
index 00000000000..800ef2f8317
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr115406.c
@@ -0,0 +1,18 @@
+// { dg-do run }
+// { dg-additional-options "-mavx512f" { target avx512f_runtime } }
+
+typedef __attribute__((__vector_size__ (1))) signed char V;
+
+signed char
+foo (V v)
+{
+  return ((V) v == v)[0];
+}
+
+int
+main ()
+{
+  signed char x = foo ((V) { });
+  if (x != -1)
+    __builtin_abort ();
+}
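
For illustration only (this is not part of the patch and not GCC code),
the difference between the two encodings can be sketched in standalone C.
The helper names encode_bools and read_low_bit are hypothetical, and the
sketch assumes each boolean element occupies elt_bits bits with elt_bits
dividing BITS_PER_UNIT, as in the packed case handled by
native_encode_vector_part.  read_low_bit models a reader that looks only
at the lowest bit of each element, which is how the message above
describes SVE.

```c
#include <assert.h>
#include <string.h>

#define BITS_PER_UNIT 8

/* Hypothetical helper: pack N boolean elements, each ELT_BITS wide,
   into BYTES.  If ALL_ONES is nonzero, a true element sets every bit
   of its field (the 0/-1 convention); otherwise it sets only the
   lowest bit of its field (the previous 0/1 convention).  */
void
encode_bools (const int *elts, unsigned int n, unsigned int elt_bits,
              int all_ones, unsigned char *bytes)
{
  /* Assumption of this sketch: fields never straddle a byte.  */
  assert (elt_bits > 0 && BITS_PER_UNIT % elt_bits == 0);

  unsigned int elt_mask = all_ones ? (1u << elt_bits) - 1 : 1u;
  memset (bytes, 0, (n * elt_bits + BITS_PER_UNIT - 1) / BITS_PER_UNIT);
  for (unsigned int i = 0; i < n; ++i)
    if (elts[i] != 0)
      {
        unsigned int bit = i * elt_bits;
        bytes[bit / BITS_PER_UNIT]
          |= (unsigned char) (elt_mask << (bit % BITS_PER_UNIT));
      }
}

/* Hypothetical helper: read element I back by testing only the lowest
   bit of its field and ignoring the upper bits.  */
int
read_low_bit (const unsigned char *bytes, unsigned int i,
              unsigned int elt_bits)
{
  unsigned int bit = i * elt_bits;
  return (bytes[bit / BITS_PER_UNIT] >> (bit % BITS_PER_UNIT)) & 1;
}
```

With the elements { 1, 0, 1, 1 } and 2-bit fields, the 0/1 convention
produces the byte 0x51 and the 0/-1 convention produces 0xf3.  A reader
that tests only the low bit of each field sees the same values in both
images, whereas a reader that expects all bits of a true element to be
set only gets the right answer from the second.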