From patchwork Tue Jun 27 18:52:01 2023
X-Patchwork-Submitter: Pat Haugen
X-Patchwork-Id: 1800783
Message-ID: <68d7fbb3-59b3-6a59-a8ac-773d5d9c6817@linux.ibm.com>
Date: Tue, 27 Jun 2023 13:52:01 -0500
From: Pat Haugen
To: GCC Patches, Segher Boessenkool, "Kewen.Lin", Peter Bergner, David Edelsohn
Subject: [PATCH V3, rs6000] Disable generation of scalar modulo instructions

Updated from prior version to address review comments (update
rs6000_rtx_costs, update scan strings of mod-1.c/mod-2.c).

Disable generation of scalar modulo instructions.

It was recently discovered that the scalar modulo instructions can suffer
noticeable performance issues for certain input values. This patch disables
their generation since the equivalent div/mul/sub sequence does not suffer the
same problem.

Bootstrapped and regression tested on powerpc64/powerpc64le.
Ok for master and backports after burn in?

-Pat


2023-06-27  Pat Haugen

gcc/
        * config/rs6000/rs6000.cc (rs6000_rtx_costs): Check if disabling
        scalar modulo.
        * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New.
        * config/rs6000/rs6000.md (mod<mode>3, *mod<mode>3): Disable.
        (define_expand umod<mode>3): New.
        (define_insn umod<mode>3): Rename to *umod<mode>3 and disable.
        (umodti3, modti3): Disable.

gcc/testsuite/
        * gcc.target/powerpc/clone1.c: Add xfails.
        * gcc.target/powerpc/clone3.c: Likewise.
        * gcc.target/powerpc/mod-1.c: Update scan strings and add xfails.
        * gcc.target/powerpc/mod-2.c: Likewise.
        * gcc.target/powerpc/p10-vdivq-vmodq.c: Add xfails.

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 546c353029b..2dae217bf64 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -22127,7 +22127,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
             *total = rs6000_cost->divsi;
         }
       /* Add in shift and subtract for MOD unless we have a mod instruction. */
-      if (!TARGET_MODULO && (code == MOD || code == UMOD))
+      if ((!TARGET_MODULO
+           || (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode)))
+           && (code == MOD || code == UMOD))
         *total += COSTS_N_INSNS (2);
       return false;
 
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 3503614efbd..22595f6ebd7 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2492,3 +2492,9 @@ while (0)
       rs6000_asm_output_opcode (STREAM); \
     } \
   while (0)
+
+/* Disable generation of scalar modulo instructions due to performance issues
+   with certain input values.  This can be removed in the future when the
+   issues have been resolved.  */
+#define RS6000_DISABLE_SCALAR_MODULO 1
+
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b0db8ae508d..6c2f237a539 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3421,6 +3421,17 @@ (define_expand "mod<mode>3"
             FAIL;
 
           operands[2] = force_reg (<MODE>mode, operands[2]);
+
+          if (RS6000_DISABLE_SCALAR_MODULO)
+            {
+              temp1 = gen_reg_rtx (<MODE>mode);
+              temp2 = gen_reg_rtx (<MODE>mode);
+
+              emit_insn (gen_div<mode>3 (temp1, operands[1], operands[2]));
+              emit_insn (gen_mul<mode>3 (temp2, temp1, operands[2]));
+              emit_insn (gen_sub<mode>3 (operands[0], operands[1], temp2));
+              DONE;
+            }
         }
       else
         {
@@ -3440,17 +3451,42 @@ (define_insn "*mod<mode>3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r")
         (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
                  (match_operand:GPR 2 "gpc_reg_operand" "r,r")))]
-  "TARGET_MODULO"
+  "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO"
   "mods<wd> %0,%1,%2"
   [(set_attr "type" "div")
    (set_attr "size" "<bits>")])
 
+;; This define_expand can be removed when RS6000_DISABLE_SCALAR_MODULO is
+;; removed.
+(define_expand "umod<mode>3"
+  [(set (match_operand:GPR 0 "gpc_reg_operand")
+        (umod:GPR (match_operand:GPR 1 "gpc_reg_operand")
+                  (match_operand:GPR 2 "gpc_reg_operand")))]
+  ""
+{
+  rtx temp1;
+  rtx temp2;
+
+  if (!TARGET_MODULO)
+    FAIL;
 
-(define_insn "umod<mode>3"
+  if (RS6000_DISABLE_SCALAR_MODULO)
+    {
+      temp1 = gen_reg_rtx (<MODE>mode);
+      temp2 = gen_reg_rtx (<MODE>mode);
+
+      emit_insn (gen_udiv<mode>3 (temp1, operands[1], operands[2]));
+      emit_insn (gen_mul<mode>3 (temp2, temp1, operands[2]));
+      emit_insn (gen_sub<mode>3 (operands[0], operands[1], temp2));
+      DONE;
+    }
+})
+
+(define_insn "*umod<mode>3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r")
         (umod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
                   (match_operand:GPR 2 "gpc_reg_operand" "r,r")))]
-  "TARGET_MODULO"
+  "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO"
   "modu<wd> %0,%1,%2"
   [(set_attr "type" "div")
    (set_attr "size" "<bits>")])
@@ -3507,7 +3543,7 @@ (define_insn "umodti3"
   [(set (match_operand:TI 0 "altivec_register_operand" "=v")
         (umod:TI (match_operand:TI 1 "altivec_register_operand" "v")
                  (match_operand:TI 2 "altivec_register_operand" "v")))]
-  "TARGET_POWER10 && TARGET_POWERPC64"
+  "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO"
   "vmoduq %0,%1,%2"
   [(set_attr "type" "vecdiv")
    (set_attr "size" "128")])
@@ -3516,7 +3552,7 @@ (define_insn "modti3"
   [(set (match_operand:TI 0 "altivec_register_operand" "=v")
         (mod:TI (match_operand:TI 1 "altivec_register_operand" "v")
                 (match_operand:TI 2 "altivec_register_operand" "v")))]
-  "TARGET_POWER10 && TARGET_POWERPC64"
+  "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO"
   "vmodsq %0,%1,%2"
   [(set_attr "type" "vecdiv")
    (set_attr "size" "128")])
diff --git a/gcc/testsuite/gcc.target/powerpc/clone1.c b/gcc/testsuite/gcc.target/powerpc/clone1.c
index c69fd2aa1b8..74323ca0e8c 100644
--- a/gcc/testsuite/gcc.target/powerpc/clone1.c
+++ b/gcc/testsuite/gcc.target/powerpc/clone1.c
@@ -21,6 +21,7 @@ long mod_func_or (long a, long b, long c)
   return mod_func (a, b) | c;
 }
 
-/* { dg-final { scan-assembler-times {\mdivd\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mmulld\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mmodsd\M} 1 } } */
+/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */
+/* { dg-final { scan-assembler-times {\mdivd\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmulld\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmodsd\M} 1 { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/clone3.c b/gcc/testsuite/gcc.target/powerpc/clone3.c
index 911b88b781d..d3eb4dd2378 100644
--- a/gcc/testsuite/gcc.target/powerpc/clone3.c
+++ b/gcc/testsuite/gcc.target/powerpc/clone3.c
@@ -27,7 +27,8 @@ long mod_func_or (long a, long b, long c)
   return mod_func (a, b) | c;
 }
 
-/* { dg-final { scan-assembler-times {\mdivd\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mmulld\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mmodsd\M} 2 } } */
+/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */
+/* { dg-final { scan-assembler-times {\mdivd\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmulld\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmodsd\M} 2 { xfail *-*-* } } } */
 /* { dg-final { scan-assembler-times {\mpld\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mod-1.c b/gcc/testsuite/gcc.target/powerpc/mod-1.c
index 861ba670af4..8720ffb3346 100644
--- a/gcc/testsuite/gcc.target/powerpc/mod-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/mod-1.c
@@ -7,13 +7,14 @@ long lsmod (long a, long b) { return a%b; }
 unsigned int iumod (unsigned int a, unsigned int b) { return a%b; }
 unsigned long lumod (unsigned long a, unsigned long b) { return a%b; }
 
-/* { dg-final { scan-assembler-times "modsw " 1 } } */
-/* { dg-final { scan-assembler-times "modsd " 1 } } */
-/* { dg-final { scan-assembler-times "moduw " 1 } } */
-/* { dg-final { scan-assembler-times "modud " 1 } } */
-/* { dg-final { scan-assembler-not "mullw " } } */
-/* { dg-final { scan-assembler-not "mulld " } } */
-/* { dg-final { scan-assembler-not "divw " } } */
-/* { dg-final { scan-assembler-not "divd " } } */
-/* { dg-final { scan-assembler-not "divwu " } } */
-/* { dg-final { scan-assembler-not "divdu " } } */
+/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */
+/* { dg-final { scan-assembler-times {\mmodsw\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmodsd\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmoduw\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmodud\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mmullw\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mmulld\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mdivw\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mdivd\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mdivwu\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mdivdu\M} { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mod-2.c b/gcc/testsuite/gcc.target/powerpc/mod-2.c
index 441ec5878f1..54bdca88607 100644
--- a/gcc/testsuite/gcc.target/powerpc/mod-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/mod-2.c
@@ -5,8 +5,9 @@
 int ismod (int a, int b) { return a%b; }
 unsigned int iumod (unsigned int a, unsigned int b) { return a%b; }
 
-/* { dg-final { scan-assembler-times "modsw " 1 } } */
-/* { dg-final { scan-assembler-times "moduw " 1 } } */
-/* { dg-final { scan-assembler-not "mullw " } } */
-/* { dg-final { scan-assembler-not "divw " } } */
-/* { dg-final { scan-assembler-not "divwu " } } */
+/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */
+/* { dg-final { scan-assembler-times {\mmodsw\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\mmoduw\M} 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mmullw\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mdivw\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\mdivwu\M} { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
index 84685e5ff43..148998c8c9d 100644
--- a/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
+++ b/gcc/testsuite/gcc.target/powerpc/p10-vdivq-vmodq.c
@@ -23,5 +23,6 @@ __int128 s_mod(__int128 a, __int128 b)
 
 /* { dg-final { scan-assembler {\mvdivsq\M} } } */
 /* { dg-final { scan-assembler {\mvdivuq\M} } } */
-/* { dg-final { scan-assembler {\mvmodsq\M} } } */
-/* { dg-final { scan-assembler {\mvmoduq\M} } } */
+/* { Fail due to RS6000_DISABLE_SCALAR_MODULO. */
+/* { dg-final { scan-assembler {\mvmodsq\M} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler {\mvmoduq\M} { xfail *-*-* } } } */
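
Illustration only, not part of the patch: the expanders in the rs6000.md
hunks emit a div, a mul and a sub in place of a single modulo instruction.
The same arithmetic can be sketched in plain C as below. The helper names
smod_via_div, umod_via_div and check_mod_sequences are hypothetical and do
not exist in GCC; the sketch assumes the usual C preconditions for '%'
(non-zero divisor, and no LONG_MIN / -1 for the signed case).

```c
#include <assert.h>

/* Hypothetical helpers mirroring the emitted RTL sequence:
     temp1  = op1 / op2
     temp2  = temp1 * op2
     result = op1 - temp2
   They are an illustration, not code from GCC.  */

/* Signed remainder.  The caller must ensure b != 0 and that a / b does
   not overflow, exactly as for the C '%' operator.  */
long
smod_via_div (long a, long b)
{
  long temp1 = a / b;       /* corresponds to gen_div  */
  long temp2 = temp1 * b;   /* corresponds to gen_mul  */
  return a - temp2;         /* corresponds to gen_sub  */
}

/* Unsigned remainder.  The caller must ensure b != 0.  */
unsigned long
umod_via_div (unsigned long a, unsigned long b)
{
  unsigned long temp1 = a / b;      /* corresponds to gen_udiv  */
  unsigned long temp2 = temp1 * b;  /* corresponds to gen_mul   */
  return a - temp2;                 /* corresponds to gen_sub   */
}

/* Compare both helpers against the '%' operator on a few inputs,
   including negative operands, where C division truncates toward zero.  */
void
check_mod_sequences (void)
{
  assert (smod_via_div (7, 3) == 7 % 3);
  assert (smod_via_div (-7, 3) == -7 % 3);
  assert (smod_via_div (7, -3) == 7 % -3);
  assert (umod_via_div (10UL, 4UL) == 10UL % 4UL);
}
```

Because C integer division truncates toward zero, the remainder computed this
way takes the sign of the dividend, which matches the '%' operator the
testcases above exercise.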