From patchwork Fri Apr 7 16:58:15 2017
X-Patchwork-Submitter: Pat Haugen
X-Patchwork-Id: 748409
From: Pat Haugen
Subject: [PATCH, rs6000] Update Power9 scheduling of vector and vector load insns
To: GCC Patches
Cc: Segher Boessenkool, David Edelsohn
Date: Fri, 7 Apr 2017 11:58:15 -0500
Message-Id: <38e578a5-418d-fcc5-4610-b9788738ec38@linux.vnet.ibm.com>

The following patch changes the method of scheduling vector and vector load insns.
Previously it tried to pair up like insns and interleave the pairs,
resulting in something like L1L2V1V2 (two vector loads followed by two
vector insns). The preferred scheduling is now to just interleave the
insns, resulting in L1V1L2V2. If interleaving fails, fall back to
pairing like insns.

Bootstrapped and regression-tested on powerpc64le-linux with no new
regressions. I also did a -mcpu=power9 build of CPU2006 with no errors.
OK for trunk and for backport to the GCC 6 branch?

-Pat


2017-04-07  Pat Haugen

        * rs6000/rs6000.c (vec_load_pendulum): Rename...
        (vec_pairing): ...to this.
        (power9_sched_reorder2): Rewrite code for pairing vector/vecload
        insns.
        (rs6000_sched_init): Adjust for name change.
        (struct rs6000_sched_context): Likewise.
        (rs6000_init_sched_context): Likewise.
        (rs6000_set_sched_context): Likewise.


Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c      (revision 246648)
+++ config/rs6000/rs6000.c      (working copy)
@@ -32862,7 +32862,7 @@ static int load_store_pendulum;
 static int divide_cnt;
 /* The following variable helps pair and alternate vector and vector load
    insns during scheduling.  */
-static int vec_load_pendulum;
+static int vec_pairing;


 /* Power4 load update and store update instructions are cracked into a
@@ -33854,183 +33854,115 @@ power9_sched_reorder2 (rtx_insn **ready,
         /* Last insn was the 2nd divide or not a divide, reset the counter.  */
         divide_cnt = 0;

-      /* Power9 can execute 2 vector operations and 2 vector loads in a single
-         cycle.  So try to pair up and alternate groups of vector and vector
-         load instructions.
+      /* The best dispatch throughput for vector and vector load insns can be
+         achieved by interleaving a vector and vector load such that they'll
+         dispatch to the same superslice.  If this pairing cannot be achieved
+         then it is best to pair vector insns together and vector load insns
+         together.

-         To aid this formation, a counter is maintained to keep track of
-         vec/vecload insns issued.  The value of vec_load_pendulum maintains
-         the current state with the following values:
+         To aid in this pairing, vec_pairing maintains the current state with
+         the following values:

-             0  : Initial state, no vec/vecload group has been started.
+             0  : Initial state, no vecload/vector pairing has been started.

-             -1 : 1 vector load has been issued and another has been found on
-                  the ready list and moved to the end.
-
-             -2 : 2 vector loads have been issued and a vector operation has
-                  been found and moved to the end of the ready list.
-
-             -3 : 2 vector loads and a vector insn have been issued and a
-                  vector operation has been found and moved to the end of the
-                  ready list.
-
-             1  : 1 vector insn has been issued and another has been found and
-                  moved to the end of the ready list.
-
-             2  : 2 vector insns have been issued and a vector load has been
-                  found and moved to the end of the ready list.
-
-             3  : 2 vector insns and a vector load have been issued and another
-                  vector load has been found and moved to the end of the ready
+             1  : A vecload or vector insn has been issued and a candidate for
+                  pairing has been found and moved to the end of the ready
                   list.  */
       if (type == TYPE_VECLOAD)
         {
           /* Issued a vecload.  */
-          if (vec_load_pendulum == 0)
+          if (vec_pairing == 0)
             {
-              /* We issued a single vecload, look for another and move it to
-                 the end of the ready list so it will be scheduled next.
-                 Set pendulum if found.  */
+              int vecload_pos = -1;
+              /* We issued a single vecload, look for a vector insn to pair it
+                 with.  If one isn't found, try to pair another vecload.  */
               pos = lastpos;
               while (pos >= 0)
                 {
-                  if (recog_memoized (ready[pos]) >= 0
-                      && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+                  if (recog_memoized (ready[pos]) >= 0)
                     {
-                      tmp = ready[pos];
-                      for (i = pos; i < lastpos; i++)
-                        ready[i] = ready[i + 1];
-                      ready[lastpos] = tmp;
-                      vec_load_pendulum = -1;
-                      return cached_can_issue_more;
+                      if (is_power9_pairable_vec_type (get_attr_type
+                                                       (ready[pos])))
+                        {
+                          /* Found a vector insn to pair with, move it to the
+                             end of the ready list so it is scheduled next.  */
+                          tmp = ready[pos];
+                          for (i = pos; i < lastpos; i++)
+                            ready[i] = ready[i + 1];
+                          ready[lastpos] = tmp;
+                          vec_pairing = 1;
+                          return cached_can_issue_more;
+                        }
+                      else if (get_attr_type (ready[pos]) == TYPE_VECLOAD
+                               && vecload_pos == -1)
+                        /* Remember position of first vecload seen.  */
+                        vecload_pos = pos;
                     }
                   pos--;
                 }
-            }
-          else if (vec_load_pendulum == -1)
-            {
-              /* This is the second vecload we've issued, search the ready
-                 list for a vector operation so we can try to schedule a
-                 pair of those next.  If found move to the end of the ready
-                 list so it is scheduled next and set the pendulum.  */
-              pos = lastpos;
-              while (pos >= 0)
+              if (vecload_pos >= 0)
                 {
-                  if (recog_memoized (ready[pos]) >= 0
-                      && is_power9_pairable_vec_type (
-                           get_attr_type (ready[pos])))
-                    {
-                      tmp = ready[pos];
-                      for (i = pos; i < lastpos; i++)
-                        ready[i] = ready[i + 1];
-                      ready[lastpos] = tmp;
-                      vec_load_pendulum = -2;
-                      return cached_can_issue_more;
-                    }
-                  pos--;
-                }
-            }
-          else if (vec_load_pendulum == 2)
-            {
-              /* Two vector ops have been issued and we've just issued a
-                 vecload, look for another vecload and move to end of ready
-                 list if found.  */
-              pos = lastpos;
-              while (pos >= 0)
-                {
-                  if (recog_memoized (ready[pos]) >= 0
-                      && get_attr_type (ready[pos]) == TYPE_VECLOAD)
-                    {
-                      tmp = ready[pos];
-                      for (i = pos; i < lastpos; i++)
-                        ready[i] = ready[i + 1];
-                      ready[lastpos] = tmp;
-                      /* Set pendulum so that next vecload will be seen as
-                         finishing a group, not start of one.  */
-                      vec_load_pendulum = 3;
-                      return cached_can_issue_more;
-                    }
-                  pos--;
+                  /* Didn't find a vector to pair with but did find a vecload,
+                     move it to the end of the ready list.  */
+                  tmp = ready[vecload_pos];
+                  for (i = vecload_pos; i < lastpos; i++)
+                    ready[i] = ready[i + 1];
+                  ready[lastpos] = tmp;
+                  vec_pairing = 1;
+                  return cached_can_issue_more;
                 }
             }
         }
       else if (is_power9_pairable_vec_type (type))
         {
           /* Issued a vector operation.  */
-          if (vec_load_pendulum == 0)
-            /* We issued a single vec op, look for another and move it
-               to the end of the ready list so it will be scheduled next.
-               Set pendulum if found.  */
-            {
-              pos = lastpos;
-              while (pos >= 0)
-                {
-                  if (recog_memoized (ready[pos]) >= 0
-                      && is_power9_pairable_vec_type (
-                           get_attr_type (ready[pos])))
-                    {
-                      tmp = ready[pos];
-                      for (i = pos; i < lastpos; i++)
-                        ready[i] = ready[i + 1];
-                      ready[lastpos] = tmp;
-                      vec_load_pendulum = 1;
-                      return cached_can_issue_more;
-                    }
-                  pos--;
-                }
-            }
-          else if (vec_load_pendulum == 1)
+          if (vec_pairing == 0)
             {
-              /* This is the second vec op we've issued, search the ready
-                 list for a vecload operation so we can try to schedule a
-                 pair of those next.  If found move to the end of the ready
-                 list so it is scheduled next and set the pendulum.  */
+              int vec_pos = -1;
+              /* We issued a single vector insn, look for a vecload to pair it
+                 with.  If one isn't found, try to pair another vector.  */
               pos = lastpos;
               while (pos >= 0)
                 {
-                  if (recog_memoized (ready[pos]) >= 0
-                      && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+                  if (recog_memoized (ready[pos]) >= 0)
                     {
-                      tmp = ready[pos];
-                      for (i = pos; i < lastpos; i++)
-                        ready[i] = ready[i + 1];
-                      ready[lastpos] = tmp;
-                      vec_load_pendulum = 2;
-                      return cached_can_issue_more;
+                      if (get_attr_type (ready[pos]) == TYPE_VECLOAD)
+                        {
+                          /* Found a vecload insn to pair with, move it to the
+                             end of the ready list so it is scheduled next.  */
+                          tmp = ready[pos];
+                          for (i = pos; i < lastpos; i++)
+                            ready[i] = ready[i + 1];
+                          ready[lastpos] = tmp;
+                          vec_pairing = 1;
+                          return cached_can_issue_more;
+                        }
+                      else if (is_power9_pairable_vec_type (get_attr_type
+                                                            (ready[pos]))
+                               && vec_pos == -1)
+                        /* Remember position of first vector insn seen.  */
+                        vec_pos = pos;
                     }
                   pos--;
                 }
-            }
-          else if (vec_load_pendulum == -2)
-            {
-              /* Two vecload ops have been issued and we've just issued a
-                 vec op, look for another vec op and move to end of ready
-                 list if found.  */
-              pos = lastpos;
-              while (pos >= 0)
+              if (vec_pos >= 0)
                 {
-                  if (recog_memoized (ready[pos]) >= 0
-                      && is_power9_pairable_vec_type (
-                           get_attr_type (ready[pos])))
-                    {
-                      tmp = ready[pos];
-                      for (i = pos; i < lastpos; i++)
-                        ready[i] = ready[i + 1];
-                      ready[lastpos] = tmp;
-                      /* Set pendulum so that next vec op will be seen as
-                         finishing a group, not start of one.  */
-                      vec_load_pendulum = -3;
-                      return cached_can_issue_more;
-                    }
-                  pos--;
+                  /* Didn't find a vecload to pair with but did find a vector
+                     insn, move it to the end of the ready list.  */
+                  tmp = ready[vec_pos];
+                  for (i = vec_pos; i < lastpos; i++)
+                    ready[i] = ready[i + 1];
+                  ready[lastpos] = tmp;
+                  vec_pairing = 1;
+                  return cached_can_issue_more;
                 }
             }
         }

-      /* We've either finished a vec/vecload group, couldn't find an insn to
-         continue the current group, or the last insn had nothing to do with
-         with a group.  In any case, reset the pendulum.  */
-      vec_load_pendulum = 0;
+      /* We've either finished a vec/vecload pair, couldn't find an insn to
+         continue the current pair, or the last insn had nothing to do with
+         pairing.  In any case, reset the state.  */
+      vec_pairing = 0;
     }

   return cached_can_issue_more;
@@ -34946,7 +34878,7 @@ rs6000_sched_init (FILE *dump ATTRIBUTE_
   last_scheduled_insn = NULL;
   load_store_pendulum = 0;
   divide_cnt = 0;
-  vec_load_pendulum = 0;
+  vec_pairing = 0;
 }

 /* The following function is called at the end of scheduling BB.
@@ -34993,7 +34925,7 @@ struct rs6000_sched_context
   rtx_insn *last_scheduled_insn;
   int load_store_pendulum;
   int divide_cnt;
-  int vec_load_pendulum;
+  int vec_pairing;
 };

 typedef struct rs6000_sched_context rs6000_sched_context_def;
@@ -35019,7 +34951,7 @@ rs6000_init_sched_context (void *_sc, bo
       sc->last_scheduled_insn = NULL;
       sc->load_store_pendulum = 0;
       sc->divide_cnt = 0;
-      sc->vec_load_pendulum = 0;
+      sc->vec_pairing = 0;
     }
   else
     {
@@ -35027,7 +34959,7 @@ rs6000_init_sched_context (void *_sc, bo
       sc->last_scheduled_insn = last_scheduled_insn;
       sc->load_store_pendulum = load_store_pendulum;
       sc->divide_cnt = divide_cnt;
-      sc->vec_load_pendulum = vec_load_pendulum;
+      sc->vec_pairing = vec_pairing;
     }
 }

@@ -35043,7 +34975,7 @@ rs6000_set_sched_context (void *_sc)
   last_scheduled_insn = sc->last_scheduled_insn;
   load_store_pendulum = sc->load_store_pendulum;
   divide_cnt = sc->divide_cnt;
-  vec_load_pendulum = sc->vec_load_pendulum;
+  vec_pairing = sc->vec_pairing;
 }

 /* Free _SC.  */
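
For readers who want to experiment with the selection step outside of GCC, the partner search done after a vecload is issued can be sketched as a small standalone C fragment. This is only an illustration under simplifying assumptions: the names insn_kind, move_to_end and pick_partner_for_vecload are hypothetical and do not exist in rs6000.c, and a plain enum array stands in for the rtx_insn ready list and the get_attr_type / is_power9_pairable_vec_type checks. As in the patch, the insn at the end of the ready list is the one scheduled next.

```c
#include <assert.h>

/* Hypothetical stand-ins for the insn types tested in the patch.  */
enum insn_kind { KIND_OTHER, KIND_VECLOAD, KIND_VECTOR };

/* Move ready[from] to the end of the list, shifting later entries down
   by one, so that it becomes the next insn to be scheduled.  */
static void
move_to_end (enum insn_kind *ready, int from, int lastpos)
{
  assert (from >= 0 && from <= lastpos);
  enum insn_kind tmp = ready[from];
  for (int i = from; i < lastpos; i++)
    ready[i] = ready[i + 1];
  ready[lastpos] = tmp;
}

/* A vecload was just issued.  Scan from the end of the ready list for a
   vector insn to interleave with it.  If none is found, fall back to the
   first vecload seen during the scan.  Returns 1 if a partner was moved
   to the end of the list (the vec_pairing == 1 state), otherwise 0.  */
static int
pick_partner_for_vecload (enum insn_kind *ready, int n)
{
  int lastpos = n - 1;
  int vecload_pos = -1;

  for (int pos = lastpos; pos >= 0; pos--)
    {
      if (ready[pos] == KIND_VECTOR)
        {
          move_to_end (ready, pos, lastpos);
          return 1;
        }
      if (ready[pos] == KIND_VECLOAD && vecload_pos == -1)
        vecload_pos = pos;   /* Remember first vecload seen.  */
    }

  if (vecload_pos >= 0)
    {
      move_to_end (ready, vecload_pos, lastpos);
      return 1;
    }
  return 0;
}
```

With a ready list of { vector, vecload, other }, the call moves the vector insn to the end and returns 1, which corresponds to the preferred L1V1 interleaving. With no vector insn in the list, the vecload is moved instead, which is the fallback to pairing like insns.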