From patchwork Fri Aug 16 21:35:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Pinski
X-Patchwork-Id: 1973376
From: Andrew Pinski
To:
CC: Andrew Pinski
Subject: [PATCH 1/2] builtins: Don't expand bit query builtins for __int128_t if the target supports an optab for it
Date: Fri, 16 Aug 2024 14:35:58 -0700
Message-ID: <20240816213559.1486438-1-quic_apinski@quicinc.com>
X-Mailer: git-send-email 2.34.1
List-Id: Gcc-patches mailing list

On aarch64 (without
CSSC instructions), popcount is implemented using the SIMD cnt
instruction. For a 128-bit operand it is therefore better to use one
128-bit cnt (V16QI mode) followed by a single reduction addition than
two SIMD cnt instructions (V8QI mode) followed by two reductions.

Currently fold_builtin_bit_query always expands the builtin without
checking whether there is an optab for the type. This patch checks the
optab to decide whether to expand or to let the backend handle it.

Bootstrapped and tested on x86_64-linux-gnu and built and tested for
aarch64-linux-gnu.

gcc/ChangeLog:

	* builtins.cc (fold_builtin_bit_query): Don't expand double
	`unsigned long long` types if there is an optab entry for that
	type.

Signed-off-by: Andrew Pinski
---
 gcc/builtins.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 0b902896ddd..b4d51eaeba5 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -10185,7 +10185,9 @@ fold_builtin_bit_query (location_t loc, enum built_in_function fcode,
   tree call = NULL_TREE, tem;
   if (TYPE_PRECISION (arg0_type) == MAX_FIXED_MODE_SIZE
       && (TYPE_PRECISION (arg0_type)
-          == 2 * TYPE_PRECISION (long_long_unsigned_type_node)))
+          == 2 * TYPE_PRECISION (long_long_unsigned_type_node))
+      /* If the target supports the optab, then don't do the expansion.  */
+      && !direct_internal_fn_supported_p (ifn, arg0_type, OPTIMIZE_FOR_BOTH))
     {
       /* __int128 expansions using up to 2 long long builtins.  */
       arg0 = save_expr (arg0);

From patchwork Fri Aug 16 21:35:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Pinski
X-Patchwork-Id: 1973377
From: Andrew Pinski
To:
CC: Andrew Pinski
Subject: [PATCH 2/2] aarch64: Implement popcountti2 pattern [PR113042]
Date: Fri, 16 Aug 2024 14:35:59 -0700
Message-ID: <20240816213559.1486438-2-quic_apinski@quicinc.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240816213559.1486438-1-quic_apinski@quicinc.com>
References: <20240816213559.1486438-1-quic_apinski@quicinc.com>
List-Id: Gcc-patches mailing list

When CSSC is not enabled, a 128-bit popcount can be implemented with a
single vector (V16QI) cnt instruction followed by a reduction, in the
same way the 64-bit popcount is currently implemented, instead of being
split into two 64-bit popcounts.

Built and tested for aarch64-linux-gnu.

	PR target/113042

gcc/ChangeLog:

	* config/aarch64/aarch64.md (popcountti2): New define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/popcnt10.c: New test.
	* gcc.target/aarch64/popcnt9.c: New test.

Signed-off-by: Andrew Pinski
---
 gcc/config/aarch64/aarch64.md               | 16 +++++++++++++
 gcc/testsuite/gcc.target/aarch64/popcnt10.c | 25 +++++++++++++++++++++
 gcc/testsuite/gcc.target/aarch64/popcnt9.c  | 25 +++++++++++++++++++++
 3 files changed, 66 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/popcnt10.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/popcnt9.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 12dcc16529a..73506e71f43 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5378,6 +5378,22 @@ (define_expand "popcount<mode>2"
   }
 })
 
+(define_expand "popcountti2"
+  [(set (match_operand:TI 0 "register_operand")
+        (popcount:TI (match_operand:TI 1 "register_operand")))]
+  "TARGET_SIMD && !TARGET_CSSC"
+{
+  rtx v = gen_reg_rtx (V16QImode);
+  rtx v1 = gen_reg_rtx (V16QImode);
+  emit_move_insn (v, gen_lowpart (V16QImode, operands[1]));
+  emit_insn (gen_popcountv16qi2 (v1, v));
+  rtx out = gen_reg_rtx (DImode);
+  emit_insn (gen_aarch64_zero_extenddi_reduc_plus_v16qi (out, v1));
+  out = convert_to_mode (TImode, out, true);
+  emit_move_insn (operands[0], out);
+  DONE;
+})
+
 (define_insn "clrsb<mode>2"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (clrsb:GPI (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/popcnt10.c b/gcc/testsuite/gcc.target/aarch64/popcnt10.c
new file mode 100644
index 00000000000..4d01fc67022
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/popcnt10.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+/* PR target/113042 */
+
+#pragma GCC target "+cssc"
+
+/*
+** h128:
+**	ldp	x([0-9]+), x([0-9]+), \[x0\]
+**	cnt	x([0-9]+), x([0-9]+)
+**	cnt	x([0-9]+), x([0-9]+)
+**	add	w0, w([0-9]+), w([0-9]+)
+**	ret
+*/
+
+
+unsigned h128 (const unsigned __int128 *a) {
+  return __builtin_popcountg (a[0]);
+}
+
+/* popcount with CSSC should be split into 2 sections. */
+/* { dg-final { scan-tree-dump-not "POPCOUNT " "optimized" } } */
+/* { dg-final { scan-tree-dump-times " __builtin_popcount" 2 "optimized" } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/popcnt9.c b/gcc/testsuite/gcc.target/aarch64/popcnt9.c
new file mode 100644
index 00000000000..c778fc7f420
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/popcnt9.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+/* PR target/113042 */
+
+#pragma GCC target "+nocssc"
+
+/*
+** h128:
+**	ldr	q([0-9]+), \[x0\]
+**	cnt	v([0-9]+).16b, v\1.16b
+**	addv	b([0-9]+), v\2.16b
+**	fmov	w0, s\3
+**	ret
+*/
+
+
+unsigned h128 (const unsigned __int128 *a) {
+  return __builtin_popcountg (a[0]);
+}
+
+/* There should be only one POPCOUNT. */
+/* { dg-final { scan-tree-dump-times "POPCOUNT " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " __builtin_popcount" "optimized" } } */
+
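
The two lowerings discussed in this series can be modelled in portable C.
What follows is only an illustrative sketch and is not part of either
patch. The helper names popcount128_split, popcount128_bytes and
popcount128_checked are hypothetical. The byte-wise loop merely imitates
what a cnt on a V16QI register followed by an addv reduction computes, and
says nothing about the code GCC actually emits. It assumes a compiler that
provides unsigned __int128 and the __builtin_popcount family (for example
GCC or Clang on a 64-bit target).

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* Split strategy (hypothetical helper): two 64-bit popcounts plus one
   scalar add, which is the shape the generic expansion produces.  */
static unsigned
popcount128_split (u128 x)
{
  uint64_t lo = (uint64_t) x;
  uint64_t hi = (uint64_t) (x >> 64);
  return (unsigned) __builtin_popcountll (lo)
         + (unsigned) __builtin_popcountll (hi);
}

/* Byte-wise strategy (hypothetical helper): count the bits in each of
   the 16 bytes, then sum the 16 per-byte counts.  This models a
   per-byte count followed by an across-lanes add.  */
static unsigned
popcount128_bytes (u128 x)
{
  unsigned char counts[16];
  for (int i = 0; i < 16; i++)
    counts[i] = (unsigned char)
      __builtin_popcount ((unsigned) (x >> (8 * i)) & 0xffu);

  unsigned sum = 0;
  for (int i = 0; i < 16; i++)
    sum += counts[i];
  return sum;
}

/* Check that both strategies agree and return the common result.  */
static unsigned
popcount128_checked (u128 x)
{
  unsigned a = popcount128_split (x);
  unsigned b = popcount128_bytes (x);
  assert (a == b);
  return a;
}
```

Both helpers return the same value for every input. The difference the
series cares about is only in how many instructions each shape needs on
a target without CSSC.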