From patchwork Mon Dec 11 02:54:32 2023
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 1874277
Date: Mon, 11 Dec 2023 10:54:32 +0800
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
From: HAO CHEN GUI
Subject: [Patch, rs6000] Clean up pre-checking of expand_block_compare

Hi,
  This patch cleans up the pre-checking in expand_block_compare. It does
the following:
1. Asserts that only P7 and above can enter this function, as it is
already guarded by the expander.
2. Returns false when optimizing for size.
3. Removes the P7 CPU test, as only P7 and above can enter this function
and P7 LE is excluded by targetm.slow_unaligned_access.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is this OK for trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Clean up pre-checking of expand_block_compare

gcc/
	* gcc/config/rs6000/rs6000-string.cc (expand_block_compare): Assert
	that only P7 and above can enter this function.  Return false when
	optimizing for size.  Remove the P7 CPU test, as only P7 and above
	can enter this function and P7 LE is excluded by the check of
	targetm.slow_unaligned_access on word_mode.

gcc/testsuite/
	* gcc.target/powerpc/memcmp_for_size.c: New.
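
For readers who want the resulting order of the early exits at a glance, here is a minimal standalone sketch of the pre-checks as they stand after this patch. It is only a model, not GCC code: the struct precheck_state, its fields, and the functions precheck_block_compare and precheck_self_check are hypothetical names introduced for illustration, and the inline limit of 32 is an arbitrary nonzero value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state the real function consults.  */
struct precheck_state
{
  bool has_popcntd;             /* models TARGET_POPCNTD (P7 and above)  */
  bool optimize_for_size;       /* models optimize_insn_for_size_p ()  */
  unsigned int inline_limit;    /* models rs6000_block_compare_inline_limit  */
  bool target_32bit_powerpc64;  /* models TARGET_32BIT && TARGET_POWERPC64  */
  bool slow_unaligned_src1;     /* models targetm.slow_unaligned_access  */
  bool slow_unaligned_src2;     /* for each of the two source operands  */
};

/* Return true if inline expansion may proceed, false if the caller
   should fall back to a memcmp call.  */
static bool
precheck_block_compare (const struct precheck_state *s)
{
  /* The caller is expected to have filtered out older CPUs already.  */
  assert (s->has_popcntd);

  if (s->optimize_for_size)
    return false;

  /* A limit of zero shuts off all expansion.  */
  if (s->inline_limit == 0)
    return false;

  if (s->target_32bit_powerpc64)
    return false;

  /* No separate P7 test: slow unaligned access alone decides.  */
  if (s->slow_unaligned_src1 || s->slow_unaligned_src2)
    return false;

  return true;
}

/* Small self-check exercising the early exits.  */
static void
precheck_self_check (void)
{
  struct precheck_state s = { true, false, 32, false, false, false };
  assert (precheck_block_compare (&s));

  s.optimize_for_size = true;
  assert (!precheck_block_compare (&s));

  s.optimize_for_size = false;
  s.inline_limit = 0;
  assert (!precheck_block_compare (&s));

  s.inline_limit = 32;
  s.slow_unaligned_src1 = true;
  assert (!precheck_block_compare (&s));
}
```

In this model the size check comes first, so a compile with -Os never reaches the alignment test, which matches what the new memcmp_for_size.c test expects: a plain call to memcmp.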
diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index d4030854b2a..dff69e90d0c 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -1946,6 +1946,15 @@ expand_block_compare_gpr(unsigned HOST_WIDE_INT bytes, unsigned int base_align,
 bool
 expand_block_compare (rtx operands[])
 {
+  gcc_assert (TARGET_POPCNTD);
+
+  if (optimize_insn_for_size_p ())
+    return false;
+
+  /* Allow this param to shut off all expansion.  */
+  if (rs6000_block_compare_inline_limit == 0)
+    return false;
+
   rtx target = operands[0];
   rtx orig_src1 = operands[1];
   rtx orig_src2 = operands[2];
@@ -1959,23 +1968,9 @@ expand_block_compare (rtx operands[])
   if (TARGET_32BIT && TARGET_POWERPC64)
     return false;
 
-  bool isP7 = (rs6000_tune == PROCESSOR_POWER7);
-
-  /* Allow this param to shut off all expansion.  */
-  if (rs6000_block_compare_inline_limit == 0)
-    return false;
-
-  /* targetm.slow_unaligned_access -- don't do unaligned stuff.
-     However slow_unaligned_access returns true on P7 even though the
-     performance of this code is good there.  */
-  if (!isP7
-      && (targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src1))
-          || targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src2))))
-    return false;
-
-  /* Unaligned l*brx traps on P7 so don't do this.  However this should
-     not affect much because LE isn't really supported on P7 anyway.  */
-  if (isP7 && !BYTES_BIG_ENDIAN)
+  /* targetm.slow_unaligned_access -- don't do unaligned stuff.  */
+  if (targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src1))
+      || targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src2)))
     return false;
 
   /* If this is not a fixed size compare, try generating loop code and
@@ -2023,14 +2018,6 @@ expand_block_compare (rtx operands[])
   if (!IN_RANGE (bytes, 1, max_bytes))
     return expand_compare_loop (operands);
 
-  /* The code generated for p7 and older is not faster than glibc
-     memcmp if alignment is small and length is not short, so bail
-     out to avoid those conditions.  */
-  if (!TARGET_EFFICIENT_UNALIGNED_FIXEDPOINT
-      && ((base_align == 1 && bytes > 16)
-          || (base_align == 2 && bytes > 32)))
-    return false;
-
   rtx final_label = NULL;
 
   if (use_vec)
diff --git a/gcc/testsuite/gcc.target/powerpc/memcmp_for_size.c b/gcc/testsuite/gcc.target/powerpc/memcmp_for_size.c
new file mode 100644
index 00000000000..c7e853ad593
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/memcmp_for_size.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+/* { dg-final { scan-assembler-times {\mb[l]? memcmp\M} 1 } } */
+
+int foo (const char* s1, const char* s2)
+{
+  return __builtin_memcmp (s1, s2, 4);
+}