From patchwork Tue Aug 13 04:30:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Pinski
X-Patchwork-Id: 1971733
From: Andrew Pinski
To:
CC: Andrew Pinski
Subject: [PATCH 2/3] match: extend the `((a CMP b) ? c : 0) | ((a CMP' b) ? d : 0)` patterns to support ^ and + [PR103660]
Date: Mon, 12 Aug 2024 21:30:22 -0700
Message-ID: <20240813043023.3685386-2-quic_apinski@quicinc.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240813043023.3685386-1-quic_apinski@quicinc.com>
References: <20240813043023.3685386-1-quic_apinski@quicinc.com>
List-Id: Gcc-patches mailing list

r13-4620-g4d9db4bdd458 added a few patterns, some of which can be extended
to support XOR and PLUS.
This extends those patterns to support XOR and PLUS in addition to IOR.

Bootstrapped and tested on x86_64-linux-gnu.

	PR tree-optimization/103660

gcc/ChangeLog:

	* match.pd (`((a CMP b) ? c : 0) | ((a CMP' b) ? d : 0)`): Extend to support
	XOR and PLUS.

gcc/testsuite/ChangeLog:

	* g++.dg/tree-ssa/pr103660-2.C: New test.
	* g++.dg/tree-ssa/pr103660-3.C: New test.
	* gcc.dg/tree-ssa/pr103660-2.c: New test.
	* gcc.dg/tree-ssa/pr103660-3.c: New test.

Signed-off-by: Andrew Pinski
---
 gcc/match.pd                               | 42 +++++++++++---------
 gcc/testsuite/g++.dg/tree-ssa/pr103660-2.C | 30 +++++++++++++++
 gcc/testsuite/g++.dg/tree-ssa/pr103660-3.C | 30 +++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr103660-2.c | 45 ++++++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr103660-3.c | 35 +++++++++++++++++
 5 files changed, 163 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr103660-2.C
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr103660-3.C
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103660-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103660-3.c

diff --git a/gcc/match.pd b/gcc/match.pd
index c9c8478d286..b43ceb6def0 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2356,18 +2356,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Fold ((-(a < b) & c) | (-(a >= b) & d)) into a < b ? c : d. This is
    canonicalized further and we recognize the conditional form:
-   (a < b ? c : 0) | (a >= b ? d : 0) into a < b ? c : d. */
- (simplify
-  (bit_ior
-   (cond (cmp@0 @01 @02) @3 zerop)
-   (cond (icmp@4 @01 @02) @5 zerop))
-    (if (INTEGRAL_TYPE_P (type)
-         && invert_tree_comparison (cmp, HONOR_NANS (@01)) == icmp
-         /* The scalar version has to be canonicalized after vectorization
-            because it makes unconditional loads conditional ones, which
-            means we lose vectorization because the loads may trap. */
-         && canonicalize_math_after_vectorization_p ())
-    (cond @0 @3 @5)))
+   (a < b ? c : 0) | (a >= b ? d : 0) into a < b ? c : d.
+   Handle also ^ and + in replacement of `|`. */
+ (for op (bit_ior bit_xor plus)
+  (simplify
+   (op
+    (cond (cmp@0 @01 @02) @3 zerop)
+    (cond (icmp@4 @01 @02) @5 zerop))
+     (if (INTEGRAL_TYPE_P (type)
+          && invert_tree_comparison (cmp, HONOR_NANS (@01)) == icmp
+          /* The scalar version has to be canonicalized after vectorization
+             because it makes unconditional loads conditional ones, which
+             means we lose vectorization because the loads may trap. */
+          && canonicalize_math_after_vectorization_p ())
+     (cond @0 @3 @5))))
 
 /* Vector Fold (((a < b) & c) | ((a >= b) & d)) into a < b ? c : d.
    and ((~(a < b) & c) | (~(a >= b) & d)) into a < b ? c : d. */
@@ -2391,13 +2393,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       (vec_cond @0 @3 @2))))))
 
 /* Scalar Vectorized Fold ((-(a < b) & c) | (-(a >= b) & d))
-   into a < b ? d : c. */
- (simplify
-  (bit_ior
-   (vec_cond:s (cmp@0 @4 @5) @2 integer_zerop)
-   (vec_cond:s (icmp@1 @4 @5) @3 integer_zerop))
-  (if (invert_tree_comparison (cmp, HONOR_NANS (@4)) == icmp)
-   (vec_cond @0 @2 @3))))
+   into a < b ? d : c.
+   Handle also ^ and + in replacement of `|`. */
+ (for op (bit_ior bit_xor plus)
+  (simplify
+   (op
+    (vec_cond:s (cmp@0 @4 @5) @2 integer_zerop)
+    (vec_cond:s (icmp@1 @4 @5) @3 integer_zerop))
+   (if (invert_tree_comparison (cmp, HONOR_NANS (@4)) == icmp)
+    (vec_cond @0 @2 @3)))))
 
 /* Transform X & -Y into X * Y when Y is { 0 or 1 }. */
 (simplify
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr103660-2.C b/gcc/testsuite/g++.dg/tree-ssa/pr103660-2.C
new file mode 100644
index 00000000000..95205c02bc3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr103660-2.C
@@ -0,0 +1,30 @@
+/* PR tree-optimization/103660 */
+/* Vector type version. */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop1-raw -Wno-psabi" } */
+
+typedef int v4si __attribute((__vector_size__(4 * sizeof(int))));
+#define funcs(OP,n) \
+v4si min_##n(v4si a, v4si b) { \
+  v4si X = a < b ? a : 0; \
+  v4si Y = a >= b ? b : 0; \
+  return (X OP Y); \
+} \
+v4si f_##n(v4si a, v4si b, \
+           v4si c, v4si d) { \
+  v4si X = a < b ? c : 0; \
+  v4si Y = a >= b ? d : 0; \
+  return (X OP Y); \
+}
+
+
+funcs(^, xor)
+funcs(+, plus)
+
+/* min_xor/min_plus should produce min or `a < b ? a : b` depending on if the target
+   supports min on the vector type or not. */
+/* f_xor/f_plus should produce (a < b) ? c : d */
+/* { dg-final { scan-tree-dump-not "bit_xor_expr, " "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "plus_expr, " "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "(?:lt_expr|min_expr), " 4 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "(?:vec_cond_expr|min_expr), " 4 "forwprop1" } } */
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr103660-3.C b/gcc/testsuite/g++.dg/tree-ssa/pr103660-3.C
new file mode 100644
index 00000000000..0800ad8e90e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr103660-3.C
@@ -0,0 +1,30 @@
+/* PR tree-optimization/103660 */
+/* Vector type version. */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop1-raw -Wno-psabi" } */
+
+typedef int v4si __attribute((__vector_size__(4 * sizeof(int))));
+#define funcs(OP,n) \
+v4si min_##n(v4si a, v4si b) { \
+  v4si X = -(a < b) * a; \
+  v4si Y = -(a >= b) * b; \
+  return (X OP Y); \
+} \
+v4si f_##n(v4si a, v4si b, \
+           v4si c, v4si d) { \
+  v4si X = -(a < b) * c; \
+  v4si Y = -(a >= b) * d; \
+  return (X OP Y); \
+}
+
+
+funcs(^, xor)
+funcs(+, plus)
+
+/* min_xor/min_plus should produce min or `a < b ? a : b` depending on if the target
+   supports min on the vector type or not. */
+/* f_xor/f_plus should produce (a < b) ? c : d */
+/* { dg-final { scan-tree-dump-not "bit_xor_expr, " "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "plus_expr, " "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "(?:lt_expr|min_expr), " 4 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "(?:vec_cond_expr|min_expr), " 4 "forwprop1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103660-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103660-2.c
new file mode 100644
index 00000000000..ce4da00a888
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103660-2.c
@@ -0,0 +1,45 @@
+/* PR tree-optimization/103660 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fgimple -fdump-tree-forwprop4-raw" } */
+
+#define funcs(OP,n) \
+__GIMPLE() \
+int min_##n(int a, int b) { \
+  _Bool X; \
+  _Bool Y; \
+  int t; \
+  int t1; \
+  int t2; \
+  X = a < b; \
+  Y = a >= b; \
+  t1 = X ? a : 0; \
+  t2 = Y ? b : 0; \
+  t = t1 OP t2; \
+  return t; \
+} \
+__GIMPLE() \
+int f_##n(int a, int b, int c, \
+          int d) { \
+  _Bool X; \
+  _Bool Y; \
+  int t; \
+  int t1; \
+  int t2; \
+  X = a < b; \
+  Y = a >= b; \
+  t1 = X ? c : 0; \
+  t2 = Y ? d : 0; \
+  t = t1 OP t2; \
+  return t; \
+}
+
+funcs(^, xor)
+funcs(+, plus)
+
+/* min_xor/min_plus should produce min */
+/* f_xor/f_plus should produce (a < b) ? c : d */
+/* { dg-final { scan-tree-dump-not "bit_xor_expr, " "forwprop4" } } */
+/* { dg-final { scan-tree-dump-not "plus_expr, " "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "min_expr, " 2 "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "lt_expr, " 2 "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "cond_expr, " 2 "forwprop4" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103660-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103660-3.c
new file mode 100644
index 00000000000..bd770b1b6d7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103660-3.c
@@ -0,0 +1,35 @@
+/* PR tree-optimization/103660 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop4-raw" } */
+
+#define funcs(OP,n) \
+int min_##n(int a, int b) { \
+  int t; \
+  int t1; \
+  int t2; \
+  t1 = (a < b) * a; \
+  t2 = (a >= b) * b; \
+  t = t1 OP t2; \
+  return t; \
+} \
+int f_##n(int a, int b, int c, \
+          int d) { \
+  int t; \
+  int t1; \
+  int t2; \
+  t1 = (a < b) * c; \
+  t2 = (a >= b) * d; \
+  t = t1 OP t2; \
+  return t; \
+}
+
+funcs(^, xor)
+funcs(+, plus)
+
+/* min_xor/min_plus should produce min */
+/* f_xor/f_plus should produce (a < b) ? c : d */
+/* { dg-final { scan-tree-dump-not "bit_xor_expr, " "forwprop4" } } */
+/* { dg-final { scan-tree-dump-not "plus_expr, " "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "min_expr, " 2 "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "lt_expr, " 2 "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "cond_expr, " 2 "forwprop4" } } */
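
A note on why the fold stays valid for ^ and +: the two comparisons are inverses of each other, so for integral operands exactly one of the two conditional arms is selected and the other is 0. Since x | 0, x ^ 0 and x + 0 all equal x, the combined expression equals a < b ? c : d whichever of the three operators joins the arms. The following is a minimal C sketch of that argument and is not part of the patch. The names fold_op, unfolded, folded, count_mismatches and assert_fold_holds are made up for illustration, and the check only brute-forces a small range of scalar ints.

```c
#include <assert.h>

/* Operators covered by the extended patterns (illustrative names,
   not GCC tree codes).  */
enum fold_op { OP_IOR, OP_XOR, OP_PLUS };

/* Unfolded form: (a < b ? c : 0) OP (a >= b ? d : 0).  */
static int unfolded(enum fold_op op, int a, int b, int c, int d)
{
  int t1 = a < b ? c : 0;
  int t2 = a >= b ? d : 0;
  switch (op)
    {
    case OP_IOR:
      return t1 | t2;
    case OP_XOR:
      return t1 ^ t2;
    default:
      /* One addend is always 0, so this cannot overflow.  */
      return t1 + t2;
    }
}

/* Folded form that the patterns produce: a < b ? c : d.  */
static int folded(int a, int b, int c, int d)
{
  return a < b ? c : d;
}

/* Count the inputs with a, b, c, d each in [lo, hi] for which the
   unfolded and folded forms disagree, over all three operators.  */
static int count_mismatches(int lo, int hi)
{
  int bad = 0;
  for (int op = OP_IOR; op <= OP_PLUS; op++)
    for (int a = lo; a <= hi; a++)
      for (int b = lo; b <= hi; b++)
        for (int c = lo; c <= hi; c++)
          for (int d = lo; d <= hi; d++)
            if (unfolded((enum fold_op) op, a, b, c, d) != folded(a, b, c, d))
              bad++;
  return bad;
}

/* Abort if any input in the given range breaks the equivalence.  */
static void assert_fold_holds(int lo, int hi)
{
  assert(count_mismatches(lo, hi) == 0);
}
```

Calling assert_fold_holds(-3, 3) from a main function exercises every operator over that range. This only illustrates the scalar integer case; the conditions in the patterns themselves (INTEGRAL_TYPE_P, invert_tree_comparison) remain the authoritative constraints.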