[RFC/PATCH,v3] ira: Support more matching constraint forms with param [PR100328]

From: Kewen Lin <linkw@linux.ibm.com>

Hi!

on 2021/6/9 1:18 PM, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> PR100328 has the details of this issue; I will try to summarize
> it here.  In the hottest function, LBM_performStreamCollideTRT,
> of the SPEC2017 benchmark 519.lbm_r, there are many FMA-style
> expressions (27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of
> FMA-style insn has two flavors: FLOAT_REG and VSX_REG.  The
> VSX_REG register class has 64 registers, the first 32 of which
> make up the whole of FLOAT_REG.  The two flavors differ somewhat;
> take "*fma<mode>4_fpr" as an example:
> 
> (define_insn "*fma<mode>4_fpr"
>   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa,wa")
> 	(fma:SFDF
> 	  (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,wa,wa")
> 	  (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,wa,0")
> 	  (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,wa")))]
> 
> // wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
> // <Ff> (f/d) => A floating point register, aka. FLOAT_REG.
> 
> So for VSX_REG we only have the destructive form: when a VSX_REG
> alternative is used, operand 2 or operand 3 is required
> to be the same as operand 0.  Reload has to take care of this
> constraint and create some non-free register copies if required.
> 
> Assuming one fma insn looks like:
>   op0 = FMA (op1, op2, op3)
> 
> The best regclass for all of them is VSX_REG.  When op1, op2 and op3
> are all dead, IRA simply creates three shuffle copies for them (here
> the operand order matters, since with the same freq, the one with the
> smaller number takes preference), but IMO both op2 and op3 should take
> higher priority in the copy queue due to the matching constraint.
> 
> I noticed that there is one function, ira_get_dup_out_num, which is
> meant to create this kind of constraint copy, but the code below
> appears to refuse to create one if there is an alternative which has
> a valid regclass that is not likely to be spilled.
> 
>       default:
> 	{
> 	  enum constraint_num cn = lookup_constraint (str);
> 	  enum reg_class cl = reg_class_for_constraint (cn);
> 	  if (cl != NO_REGS
> 	      && !targetm.class_likely_spilled_p (cl))
> 	    goto fail;
> 
> 	 ...
> 
> I cooked up the attached patch to make IRA respect this kind of
> matching constraint, guarded by one parameter.  As I stated in the
> PR, I was not sure this is on the right track.  The RFC patch checks
> the matching constraint in all alternatives; if there is one
> alternative with a matching constraint that matches the current
> preferred regclass (or best of allocno?), it records the output
> operand number and further creates one constraint copy for it.
> Normally that copy gets priority over shuffle copies, so the matching
> constraint is more likely to be satisfied and reload doesn't have to
> create extra copies to meet the matching constraint or the desirable
> register class.
> 
> For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can first stay
> as shuffle copies.  Later, if any of A,B,C,D gets assigned a
> hardware register which is a VSX register (VSX_REG) but not an FP
> register (FLOAT_REG), we would have to pay costs if we can NOT
> go with the VSX alternatives, so at that point it's important to
> respect the matching constraint, and we can then increase the freq of
> the remaining copies related to this (A/B, A/C, A/D).  This idea
> requires some side tables to record some information and seems a bit
> complicated in the current framework, so the proposed patch
> aggressively emphasizes the matching constraint at the time of
> creating copies.
> 
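
To make the destructive form described in the quoted text concrete, here
is a minimal standalone sketch that scans the constraint strings of
"*fma<mode>4_fpr" and reports which input operand each alternative ties
to operand 0.  It is an illustration only, not code from the patch: the
helper names are hypothetical, <Ff> is written as "d" (the DFmode case),
and it does not use GCC's own constraint parsing.

```c
#include <ctype.h>
#include <stddef.h>

/* Constraint strings copied from the quoted "*fma<mode>4_fpr" pattern,
   with <Ff> written as "d" (assumption: DFmode).  Index = operand number.  */
static const char *const fma_constraints[4] = {
  "=d,wa,wa",   /* op0: output */
  "%d,wa,wa",   /* op1 */
  "d,wa,0",     /* op2: tied to op0 in alternative 2 */
  "d,0,wa",     /* op3: tied to op0 in alternative 1 */
};

/* Return a pointer to the start of alternative ALT inside STR, or NULL
   if STR has fewer alternatives.  Hypothetical helper.  */
static const char *
alt_start (const char *str, int alt)
{
  while (alt > 0)
    {
      while (*str != '\0' && *str != ',')
        str++;
      if (*str == '\0')
        return NULL;
      str++;
      alt--;
    }
  return str;
}

/* If alternative ALT of STR contains a single-digit matching constraint,
   return the operand number it refers to; otherwise return -1.  */
static int
matched_operand (const char *str, int alt)
{
  const char *p = alt_start (str, alt);
  if (p == NULL)
    return -1;
  for (; *p != '\0' && *p != ','; p++)
    if (isdigit ((unsigned char) *p))
      return *p - '0';
  return -1;
}

/* Return the input operand (1..3) tied to output operand 0 in alternative
   ALT of the FMA pattern, or -1 if no input is tied in that alternative.  */
static int
fma_tied_input (int alt)
{
  for (int op = 1; op < 4; op++)
    if (matched_operand (fma_constraints[op], alt) == 0)
      return op;
  return -1;
}
```

With these strings, alternative 0 (the FLOAT_REG flavor) ties nothing,
while the two VSX_REG alternatives tie op3 and op2 respectively, which
is why op2 and op3 are the interesting candidates for a constraint copy.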

Compared with the original patch (v1), this v3 patch takes the
following into account (this would be v2 for this mailing list,
but I bumped the number to stay consistent with the PR):

  - Excluding the case where, for one preferred register class,
    there are two or more alternatives, one of which has the
    matching constraint while another doesn't.  For the given
    operand, even if it's assigned a hardware reg which doesn't
    meet the matching constraint, it can simply use the
    alternative without the matching constraint, so no register
    move is needed.  One typical case is define_insn
    *mov<mode>_internal2 on rs6000.  So we shouldn't create a
    constraint copy for it.

  - Disabling this when a free register move is possible within
    the same register class, since the register move needed to
    meet the constraint is then considered free.

  - Making it on by default, as suggested by Segher & Vladimir;
    we hope to get rid of the parameter if the benchmarking
    results look good on major targets.

  - Tweaking the cost when either side of the matching constraint
    is a hardware register.  Before this patch, the constraint
    copy was simply treated as a real move insn for the pref and
    conflict costs with one hardware register.  After this patch,
    several input operands are allowed to respect the same
    matching constraint (but in different alternatives), so in
    some cases we should treat it like a shuffle copy to avoid
    over-preferring/disparaging.
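
The first item above can be read as a simple decision rule.  The sketch
below is only my simplified reading of that rule, not the code in the
patch: the enum, struct and function names are hypothetical, each
alternative is reduced to one register class plus the input operand it
ties to the output (if any), and class matching is plain equality
rather than GCC's subset checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not GCC's enum reg_class values.  */
enum sketch_class { SK_FLOAT_REG, SK_VSX_REG, SK_GEN_REG };

/* One alternative of an insn pattern, reduced to what the rule needs.  */
struct sketch_alt
{
  enum sketch_class cl;  /* register class this alternative uses */
  int tied_input;        /* input operand tied to the output, or -1 */
};

/* Return true if a constraint copy looks worthwhile for input operand OP
   when PREF is the preferred class: some alternative of class PREF ties
   OP to the output, and no alternative of class PREF is free of matching
   constraints (such an alternative would need no register move).  */
static bool
sketch_want_constraint_copy (const struct sketch_alt *alts, size_t n_alts,
                             enum sketch_class pref, int op)
{
  bool op_is_tied = false;

  for (size_t i = 0; i < n_alts; i++)
    {
      if (alts[i].cl != pref)
        continue;
      if (alts[i].tied_input < 0)
        return false;           /* untied alternative available: no copy */
      if (alts[i].tied_input == op)
        op_is_tied = true;
    }
  return op_is_tied;
}
```

For the FMA pattern with VSX_REG preferred, this returns true for op2
and op3 and false for op1.  For a made-up two-alternative pattern in
which one alternative ties the input and the other does not, both in
the same class, it returns false, which is the excluded case.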

Please check the PR comments for more details.

This patch bootstraps and regtests cleanly on
powerpc64le-linux-gnu P9 and x86_64-redhat-linux, but has some
"XFAIL->XPASS" failures on aarch64-linux-gnu.  The failure list
is attached in the PR, and the new assembly looks improved (as
expected).

With -Ofast and unrolling, this patch improves the SPEC2017
benchmarks 508.namd_r by 2.42% and 519.lbm_r by 2.43% on Power8,
and 508.namd_r by 3.02% and 519.lbm_r by 3.85% on Power9, without
any remarkable degradations.

This patch likely benefits x86_64 and aarch64 as well, but I don't
have performance machines for these architectures at hand.  Could
someone kindly help benchmark it if possible?

Many thanks in advance!

BTW, you can simply ignore the part about the parameter
ira-consider-dup-in-all-alts (its name/description); it's sort of
stale, and I left it as is for now since we will likely get rid of it.

BR,
Kewen
-----
gcc/ChangeLog:

	* doc/invoke.texi (ira-consider-dup-in-all-alts): Document new
	parameter.
	* ira.c (ira_get_dup_out_num): Adjust as parameter
	param_ira_consider_dup_in_all_alts.
	* params.opt (ira-consider-dup-in-all-alts): New.
	* ira-conflicts.c (process_regs_for_copy): Add one parameter
	single_input_op_has_cstr_p.
	(get_freq_for_shuffle_copy): New function.
	(add_insn_allocno_copies): Adjust as single_input_op_has_cstr_p.
	* ira-int.h (ira_get_dup_out_num): Add one bool parameter.
---
 gcc/doc/invoke.texi |   6 +++
 gcc/ira-conflicts.c |  91 +++++++++++++++++++++++++++-------
 gcc/ira-int.h       |   2 +-
 gcc/ira.c           | 118 ++++++++++++++++++++++++++++++++++++++++----
 gcc/params.opt      |   4 ++
 5 files changed, 194 insertions(+), 27 deletions(-)

Message ID	8a5fd52a-1cc9-6563-ee6c-f345b489654c@linux.ibm.com
State	New
Date	Mon, 28 Jun 2021 14:26:18 +0800
To	GCC Patches <gcc-patches@gcc.gnu.org>
Cc	Segher Boessenkool <segher@kernel.crashing.org>, Richard Sandiford <richard.sandiford@arm.com>, bergner@linux.ibm.com, Bill Schmidt <wschmidt@linux.ibm.com>
In-Reply-To	<c8ca748c-d7fd-6fb1-5ef2-567935d38722@linux.ibm.com>
Series	[RFC/PATCH,v3] ira: Support more matching constraint forms with param [PR100328]
