From patchwork Mon Nov 18 21:24:56 2019
X-Patchwork-Submitter: Pat Haugen
X-Patchwork-Id: 1196979
From: Pat Haugen
Subject: [PATCH] rs6000: Refactor scheduling hook code
To: GCC Patches
Cc: Segher Boessenkool, Bill Schmidt, David Edelsohn
Message-ID: <36220a14-50fd-a035-9da6-f554426aac0c@linux.ibm.com>
Date: Mon, 18 Nov 2019 15:24:56 -0600

The following patch factors out some common code to its own function and
also moves the Power6-specific code to a new function.

Bootstrapped/regtested on powerpc64le with no regressions. Ok for trunk?

-Pat

2019-11-18  Pat Haugen

	* config/rs6000/rs6000.c (move_to_end_of_ready): New, factored out
	from common code.
	(power6_sched_reorder2): Factored out from rs6000_sched_reorder2,
	call new function.
	(power9_sched_reorder2): Call new function.
	(rs6000_sched_reorder2): Likewise.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 278306)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -17711,14 +17711,216 @@ get_next_active_insn (rtx_insn *insn, rt
   return insn;
 }
 
+/* Move instruction at POS to the end of the READY list.
+   */
+
+static void
+move_to_end_of_ready (rtx_insn **ready, int pos, int lastpos)
+{
+  rtx_insn *tmp;
+  int i;
+
+  tmp = ready[pos];
+  for (i = pos; i < lastpos; i++)
+    ready[i] = ready[i + 1];
+  ready[lastpos] = tmp;
+}
+
+/* Do Power6 specific sched_reorder2 reordering of ready list.  */
+
+static int
+power6_sched_reorder2 (rtx_insn **ready, int lastpos)
+{
+  /* For Power6, we need to handle some special cases to try and keep the
+     store queue from overflowing and triggering expensive flushes.
+
+     This code monitors how load and store instructions are being issued
+     and skews the ready list one way or the other to increase the
+     likelihood that a desired instruction is issued at the proper time.
+
+     A couple of things are done.  First, we maintain a
+     "load_store_pendulum" to track the current state of load/store issue.
+
+       - If the pendulum is at zero, then no loads or stores have been
+	 issued in the current cycle so we do nothing.
+
+       - If the pendulum is 1, then a single load has been issued in this
+	 cycle and we attempt to locate another load in the ready list to
+	 issue with it.
+
+       - If the pendulum is -2, then two stores have already been issued
+	 in this cycle, so we increase the priority of the first load in
+	 the ready list to increase its likelihood of being chosen first
+	 in the next cycle.
+
+       - If the pendulum is -1, then a single store has been issued in
+	 this cycle and we attempt to locate another store in the ready
+	 list to issue with it, preferring a store to an adjacent memory
+	 location to facilitate store pairing in the store queue.
+
+       - If the pendulum is 2, then two loads have already been issued
+	 in this cycle, so we increase the priority of the first store in
+	 the ready list to increase its likelihood of being chosen first
+	 in the next cycle.
+
+       - If the pendulum is < -2 or > 2, then do nothing.
+
+     Note: This code covers the most common scenarios.
+     There exist non load/store instructions which make use of the LSU
+     and which would need to be accounted for to strictly model the
+     behavior of the machine.  Those instructions are currently
+     unaccounted for to help minimize the compile time overhead of this
+     code.  */
+
+  int pos;
+  rtx load_mem, str_mem;
+
+  if (is_store_insn (last_scheduled_insn, &str_mem))
+    /* Issuing a store, swing the load_store_pendulum to the left.  */
+    load_store_pendulum--;
+  else if (is_load_insn (last_scheduled_insn, &load_mem))
+    /* Issuing a load, swing the load_store_pendulum to the right.  */
+    load_store_pendulum++;
+  else
+    return cached_can_issue_more;
+
+  /* If the pendulum is balanced, or there is only one instruction on
+     the ready list, then all is well, so return.  */
+  if ((load_store_pendulum == 0) || (lastpos <= 0))
+    return cached_can_issue_more;
+
+  if (load_store_pendulum == 1)
+    {
+      /* A load has been issued in this cycle.  Scan the ready list
+	 for another load to issue with it.  */
+      pos = lastpos;
+
+      while (pos >= 0)
+	{
+	  if (is_load_insn (ready[pos], &load_mem))
+	    {
+	      /* Found a load.  Move it to the head of the ready list,
+		 and adjust its priority so that it is more likely to
+		 stay there.  */
+	      move_to_end_of_ready (ready, pos, lastpos);
+
+	      if (!sel_sched_p ()
+		  && INSN_PRIORITY_KNOWN (ready[lastpos]))
+		INSN_PRIORITY (ready[lastpos])++;
+	      break;
+	    }
+	  pos--;
+	}
+    }
+  else if (load_store_pendulum == -2)
+    {
+      /* Two stores have been issued in this cycle.  Increase the
+	 priority of the first load in the ready list to favor it for
+	 issuing in the next cycle.  */
+      pos = lastpos;
+
+      while (pos >= 0)
+	{
+	  if (is_load_insn (ready[pos], &load_mem)
+	      && !sel_sched_p ()
+	      && INSN_PRIORITY_KNOWN (ready[pos]))
+	    {
+	      INSN_PRIORITY (ready[pos])++;
+
+	      /* Adjust the pendulum to account for the fact that a load
+		 was found and increased in priority.  This is to prevent
This is to prevent + increasing the priority of multiple loads */ + load_store_pendulum--; + + break; + } + pos--; + } + } + else if (load_store_pendulum == -1) + { + /* A store has been issued in this cycle. Scan the ready list for + another store to issue with it, preferring a store to an adjacent + memory location */ + int first_store_pos = -1; + + pos = lastpos; + + while (pos >= 0) + { + if (is_store_insn (ready[pos], &str_mem)) + { + rtx str_mem2; + /* Maintain the index of the first store found on the + list */ + if (first_store_pos == -1) + first_store_pos = pos; + + if (is_store_insn (last_scheduled_insn, &str_mem2) + && adjacent_mem_locations (str_mem, str_mem2)) + { + /* Found an adjacent store. Move it to the head of the + ready list, and adjust it's priority so that it is + more likely to stay there */ + move_to_end_of_ready (ready, pos, lastpos); + + if (!sel_sched_p () + && INSN_PRIORITY_KNOWN (ready[lastpos])) + INSN_PRIORITY (ready[lastpos])++; + + first_store_pos = -1; + + break; + }; + } + pos--; + } + + if (first_store_pos >= 0) + { + /* An adjacent store wasn't found, but a non-adjacent store was, + so move the non-adjacent store to the front of the ready + list, and adjust its priority so that it is more likely to + stay there. */ + move_to_end_of_ready (ready, first_store_pos, lastpos); + if (!sel_sched_p () + && INSN_PRIORITY_KNOWN (ready[lastpos])) + INSN_PRIORITY (ready[lastpos])++; + } + } + else if (load_store_pendulum == 2) + { + /* Two loads have been issued in this cycle. Increase the priority + of the first store in the ready list to favor it for issuing in + the next cycle. */ + pos = lastpos; + + while (pos >= 0) + { + if (is_store_insn (ready[pos], &str_mem) + && !sel_sched_p () + && INSN_PRIORITY_KNOWN (ready[pos])) + { + INSN_PRIORITY (ready[pos])++; + + /* Adjust the pendulum to account for the fact that a store + was found and increased in priority. 
This is to prevent + increasing the priority of multiple stores */ + load_store_pendulum++; + + break; + } + pos--; + } + } + + return cached_can_issue_more; +} + /* Do Power9 specific sched_reorder2 reordering of ready list. */ static int power9_sched_reorder2 (rtx_insn **ready, int lastpos) { int pos; - int i; - rtx_insn *tmp; enum attr_type type, type2; type = get_attr_type (last_scheduled_insn); @@ -17738,10 +17940,7 @@ power9_sched_reorder2 (rtx_insn **ready, if (recog_memoized (ready[pos]) >= 0 && get_attr_type (ready[pos]) == TYPE_DIV) { - tmp = ready[pos]; - for (i = pos; i < lastpos; i++) - ready[i] = ready[i + 1]; - ready[lastpos] = tmp; + move_to_end_of_ready (ready, pos, lastpos); break; } pos--; @@ -17784,10 +17983,7 @@ power9_sched_reorder2 (rtx_insn **ready, { /* Found a vector insn to pair with, move it to the end of the ready list so it is scheduled next. */ - tmp = ready[pos]; - for (i = pos; i < lastpos; i++) - ready[i] = ready[i + 1]; - ready[lastpos] = tmp; + move_to_end_of_ready (ready, pos, lastpos); vec_pairing = 1; return cached_can_issue_more; } @@ -17801,10 +17997,7 @@ power9_sched_reorder2 (rtx_insn **ready, { /* Didn't find a vector to pair with but did find a vecload, move it to the end of the ready list. */ - tmp = ready[vecload_pos]; - for (i = vecload_pos; i < lastpos; i++) - ready[i] = ready[i + 1]; - ready[lastpos] = tmp; + move_to_end_of_ready (ready, vecload_pos, lastpos); vec_pairing = 1; return cached_can_issue_more; } @@ -17828,10 +18021,7 @@ power9_sched_reorder2 (rtx_insn **ready, { /* Found a vecload insn to pair with, move it to the end of the ready list so it is scheduled next. 
 	     */
-	  tmp = ready[pos];
-	  for (i = pos; i < lastpos; i++)
-	    ready[i] = ready[i + 1];
-	  ready[lastpos] = tmp;
+	  move_to_end_of_ready (ready, pos, lastpos);
 	  vec_pairing = 1;
 	  return cached_can_issue_more;
 	}
@@ -17846,10 +18036,7 @@ power9_sched_reorder2 (rtx_insn **ready,
 	{
 	  /* Didn't find a vecload to pair with but did find a vector
 	     insn, move it to the end of the ready list.  */
-	  tmp = ready[vec_pos];
-	  for (i = vec_pos; i < lastpos; i++)
-	    ready[i] = ready[i + 1];
-	  ready[lastpos] = tmp;
+	  move_to_end_of_ready (ready, vec_pos, lastpos);
 	  vec_pairing = 1;
 	  return cached_can_issue_more;
 	}
@@ -17903,198 +18090,9 @@ rs6000_sched_reorder2 (FILE *dump, int s
   if (sched_verbose)
     fprintf (dump, "// rs6000_sched_reorder2 :\n");
 
-  /* For Power6, we need to handle some special cases to try and keep the
-     store queue from overflowing and triggering expensive flushes.
-
-     This code monitors how load and store instructions are being issued
-     and skews the ready list one way or the other to increase the likelihood
-     that a desired instruction is issued at the proper time.
-
-     A couple of things are done.  First, we maintain a "load_store_pendulum"
-     to track the current state of load/store issue.
-
-       - If the pendulum is at zero, then no loads or stores have been
-	 issued in the current cycle so we do nothing.
-
-       - If the pendulum is 1, then a single load has been issued in this
-	 cycle and we attempt to locate another load in the ready list to
-	 issue with it.
-
-       - If the pendulum is -2, then two stores have already been
-	 issued in this cycle, so we increase the priority of the first load
-	 in the ready list to increase it's likelihood of being chosen first
-	 in the next cycle.
-
-       - If the pendulum is -1, then a single store has been issued in this
-	 cycle and we attempt to locate another store in the ready list to
-	 issue with it, preferring a store to an adjacent memory location to
-	 facilitate store pairing in the store queue.
-
-       - If the pendulum is 2, then two loads have already been
-	 issued in this cycle, so we increase the priority of the first store
-	 in the ready list to increase it's likelihood of being chosen first
-	 in the next cycle.
-
-       - If the pendulum < -2 or > 2, then do nothing.
-
-     Note: This code covers the most common scenarios.  There exist non
-     load/store instructions which make use of the LSU and which
-     would need to be accounted for to strictly model the behavior
-     of the machine.  Those instructions are currently unaccounted
-     for to help minimize compile time overhead of this code.
-   */
+  /* Do Power6 dependent reordering if necessary.  */
   if (rs6000_tune == PROCESSOR_POWER6 && last_scheduled_insn)
-    {
-      int pos;
-      int i;
-      rtx_insn *tmp;
-      rtx load_mem, str_mem;
-
-      if (is_store_insn (last_scheduled_insn, &str_mem))
-	/* Issuing a store, swing the load_store_pendulum to the left */
-	load_store_pendulum--;
-      else if (is_load_insn (last_scheduled_insn, &load_mem))
-	/* Issuing a load, swing the load_store_pendulum to the right */
-	load_store_pendulum++;
-      else
-	return cached_can_issue_more;
-
-      /* If the pendulum is balanced, or there is only one instruction on
-	 the ready list, then all is well, so return. */
-      if ((load_store_pendulum == 0) || (*pn_ready <= 1))
-	return cached_can_issue_more;
-
-      if (load_store_pendulum == 1)
-	{
-	  /* A load has been issued in this cycle.  Scan the ready list
-	     for another load to issue with it */
-	  pos = *pn_ready-1;
-
-	  while (pos >= 0)
-	    {
-	      if (is_load_insn (ready[pos], &load_mem))
-		{
-		  /* Found a load.  Move it to the head of the ready list,
-		     and adjust it's priority so that it is more likely to
-		     stay there */
-		  tmp = ready[pos];
-		  for (i=pos; i<*pn_ready-1; i++)
-		    ready[i] = ready[i + 1];
-		  ready[*pn_ready-1] = tmp;
-
-		  if (!sel_sched_p () && INSN_PRIORITY_KNOWN (tmp))
-		    INSN_PRIORITY (tmp)++;
-		  break;
-		}
-	      pos--;
-	    }
-	}
-      else if (load_store_pendulum == -2)
-	{
-	  /* Two stores have been issued in this cycle.
-	     Increase the priority of the first load in the ready list
-	     to favor it for issuing in the next cycle. */
-	  pos = *pn_ready-1;
-
-	  while (pos >= 0)
-	    {
-	      if (is_load_insn (ready[pos], &load_mem)
-		  && !sel_sched_p ()
-		  && INSN_PRIORITY_KNOWN (ready[pos]))
-		{
-		  INSN_PRIORITY (ready[pos])++;
-
-		  /* Adjust the pendulum to account for the fact that a load
-		     was found and increased in priority.  This is to prevent
-		     increasing the priority of multiple loads */
-		  load_store_pendulum--;
-
-		  break;
-		}
-	      pos--;
-	    }
-	}
-      else if (load_store_pendulum == -1)
-	{
-	  /* A store has been issued in this cycle.  Scan the ready list for
-	     another store to issue with it, preferring a store to an adjacent
-	     memory location */
-	  int first_store_pos = -1;
-
-	  pos = *pn_ready-1;
-
-	  while (pos >= 0)
-	    {
-	      if (is_store_insn (ready[pos], &str_mem))
-		{
-		  rtx str_mem2;
-		  /* Maintain the index of the first store found on the
-		     list */
-		  if (first_store_pos == -1)
-		    first_store_pos = pos;
-
-		  if (is_store_insn (last_scheduled_insn, &str_mem2)
-		      && adjacent_mem_locations (str_mem, str_mem2))
-		    {
-		      /* Found an adjacent store.  Move it to the head of the
-			 ready list, and adjust it's priority so that it is
-			 more likely to stay there */
-		      tmp = ready[pos];
-		      for (i=pos; i<*pn_ready-1; i++)
-			ready[i] = ready[i + 1];
-		      ready[*pn_ready-1] = tmp;
-
-		      if (!sel_sched_p () && INSN_PRIORITY_KNOWN (tmp))
-			INSN_PRIORITY (tmp)++;
-
-		      first_store_pos = -1;
-
-		      break;
-		    };
-		}
-	      pos--;
-	    }
-
-	  if (first_store_pos >= 0)
-	    {
-	      /* An adjacent store wasn't found, but a non-adjacent store was,
-		 so move the non-adjacent store to the front of the ready
-		 list, and adjust its priority so that it is more likely to
-		 stay there.
-	       */
-	      tmp = ready[first_store_pos];
-	      for (i=first_store_pos; i<*pn_ready-1; i++)
-		ready[i] = ready[i + 1];
-	      ready[*pn_ready-1] = tmp;
-	      if (!sel_sched_p () && INSN_PRIORITY_KNOWN (tmp))
-		INSN_PRIORITY (tmp)++;
-	    }
-	}
-      else if (load_store_pendulum == 2)
-	{
-	  /* Two loads have been issued in this cycle.  Increase the priority
-	     of the first store in the ready list to favor it for issuing in
-	     the next cycle. */
-	  pos = *pn_ready-1;
-
-	  while (pos >= 0)
-	    {
-	      if (is_store_insn (ready[pos], &str_mem)
-		  && !sel_sched_p ()
-		  && INSN_PRIORITY_KNOWN (ready[pos]))
-		{
-		  INSN_PRIORITY (ready[pos])++;
-
-		  /* Adjust the pendulum to account for the fact that a store
-		     was found and increased in priority.  This is to prevent
-		     increasing the priority of multiple stores */
-		  load_store_pendulum++;
-
-		  break;
-		}
-	      pos--;
-	    }
-	}
-    }
+    return power6_sched_reorder2 (ready, *pn_ready - 1);
 
   /* Do Power9 dependent reordering if necessary.  */
   if (rs6000_tune == PROCESSOR_POWER9 && last_scheduled_insn