[0/4] sched1 improvements

Message ID

20241020194018.3051160-1-vineetg@rivosinc.com

Headers

Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ACA463858D20
From: Vineet Gupta <vineetg@rivosinc.com>
To: gcc-patches@gcc.gnu.org, Richard Sandiford <rdsandiford@googlemail.com>,
 Vladimir Makarov <vmakarov@redhat.com>,
 Michael Meissner <gnu@the-meissners.org>,
 Peter Bergner <bergner@linux.ibm.com>, Wilco Dijkstra <wdijkstr@arm.com>
Cc: Jeff Law <jeffreyalaw@gmail.com>, gnu-toolchain@rivosinc.com,
 Vineet Gupta <vineetg@rivosinc.com>
Subject: [PATCH 0/4] sched1 improvements
Date: Sun, 20 Oct 2024 12:40:14 -0700
Message-ID: <20241020194018.3051160-1-vineetg@rivosinc.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: list
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Series

sched1 improvements
  [0/4] sched1 improvements
  [1/4] sched1: hookize pressure scheduling spilling agressiveness
  [2/4] RISC-V: Implement TARGET_SCHED_PRESSURE_PREFER_NARROW [PR/114729]
  [3/4] sched1: model: only promote true dependecies in predecessor promotion
  [4/4] sched1: model: ICE on infinite loops in predecessor promotion (Not for Merge)

Message

Vineet Gupta Oct. 20, 2024, 7:40 p.m. UTC

Hi,

Please find attached a patch series which improves sched1 spilling. This
all started with SPEC2017 507.Cactu dynamic icounts on RISC-V being roughly
double those of aarch64 (~2.6 trillion vs. ~1.4 trillion). Robin/Jeff hinted
that the issue could be sched1, which it turned out to be.

Essentially there are 2 fixes, plus a debug aid:

  - Patch 1/4 improves the main list scheduler outcomes by not
    watering down a negative pressure change to zero. It implements
    a target hook, which is separately enabled in patch 2/4 for RISC-V.

  - Patch 3/4 improves the model schedule so that it does not increase
    register pressure in certain cases.

  - Patch 4/4 is just a debug hack (not for merge) which I would like any
    testers to apply, as it helped me a lot during development of patch 3/4.

More details can be found in individual patches.
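
To make the "watering down" point concrete, here is a minimal toy sketch in C. It is not GCC code: the names pressure_delta, clamp_to_zero and pick_lowest are hypothetical, and the assumption is simply that a scheduler prefers the candidate whose register pressure change is lowest. The actual logic lives in the individual patches.

```c
/* Hypothetical toy model, not taken from haifa-sched.cc. */

/* Net change in live registers if an insn is scheduled next:
   registers it makes live minus registers whose last use it is. */
int pressure_delta(int births, int deaths)
{
    return births - deaths;
}

/* Assumed "watered down" variant: a negative change is reported as zero. */
int clamp_to_zero(int delta)
{
    return delta < 0 ? 0 : delta;
}

/* Return the index of the candidate with the smallest cost.  Ties keep
   the earliest candidate, so equal costs mean the cost had no say. */
int pick_lowest(const int *cost, int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (cost[i] < cost[best])
            best = i;
    return best;
}
```

With two candidates, one pressure-neutral (delta 0) and one that frees two registers (delta -2), the raw deltas let pick_lowest choose the second. Once both are clamped to zero they tie, and the pressure-reducing candidate loses its advantage.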

Results on RISC-V hardware BPI-F3 (perf stat instructions/cycles) and
on aarch64 (I could only get QEMU dynamic icounts).

RISC-V BPI-F3 (-Ofast -march=rv64gcv_zba_zbb_zbs)

  baseline  | 7,631,707,552,979      cycles:u                         #    1.600 GHz
            | 2,630,225,489,010      instructions:u                   #    0.34  insn per cycle
            |
  all       | 6,736,337,207,427      cycles:u           (12% faster)  #    1.600 GHz
  patches   | 2,078,712,047,604      instructions:u     (21% fewer)   #    0.31  insn per cycle

aarch64 (-Ofast -march=armv9-a+sve2) + implement TARGET_SCHED_PRESSURE_PREFER_NARROW=true

  baseline  | 1,382,403,783,566
            |
  all       | 1,113,896,471,282                         (19.4% fewer)
  patches   |
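
For anyone re-running the numbers on other hardware, the percentages above can be reproduced with a small C helper. This is only a sketch: pct_reduction and insn_per_cycle are hypothetical names, and the inputs are the raw counts from the tables above.

```c
/* Percentage drop from baseline to patched, e.g. 21.0 means 21% fewer. */
double pct_reduction(double baseline, double patched)
{
    return 100.0 * (baseline - patched) / baseline;
}

/* Instructions retired per cycle, as reported by perf stat. */
double insn_per_cycle(double insns, double cycles)
{
    return insns / cycles;
}
```

Fed the BPI-F3 counts, pct_reduction gives about 11.7 for cycles (the "12% faster" figure, strictly 12% fewer cycles) and about 21.0 for instructions. The aarch64 icounts give about 19.4.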

As a follow-up to discussions at Cauldron last month, I'm CC'ing some of
the aarch64 and power folks to test this on real hardware and get the
results. Please don't forget to add the equivalent of patch 2/4 for your
respective backends, i.e.:

+#undef  TARGET_SCHED_PRESSURE_PREFER_NARROW
+#define TARGET_SCHED_PRESSURE_PREFER_NARROW hook_bool_void_true

Thx,
-Vineet

Vineet Gupta (4):
  sched1: hookize pressure scheduling spilling agressiveness
  RISC-V: Implement TARGET_SCHED_PRESSURE_PREFER_NARROW [PR/114729]
  sched1: model: only promote true dependecies in predecessor promotion
  sched1: model: ICE on infinite loops in predecessor promotion (Not for
    Merge)

 gcc/config/riscv/riscv.cc                     |   3 +
 gcc/doc/tm.texi                               |  11 ++
 gcc/doc/tm.texi.in                            |   2 +
 gcc/haifa-sched.cc                            | 109 ++++++++++++++----
 gcc/sched-rgn.cc                              |  14 ++-
 gcc/target.def                                |  13 +++
 gcc/testsuite/gcc.target/riscv/riscv.exp      |   2 +
 .../gcc.target/riscv/sched1-spills/hang1.c    |  32 +++++
 .../gcc.target/riscv/sched1-spills/hang5.c    |  60 ++++++++++
 .../gcc.target/riscv/sched1-spills/spill1.cpp |  31 +++++
 .../gcc.target/riscv/sched1-spills/spill2.cpp |  37 ++++++
 11 files changed, 289 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/hang1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/hang5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/spill1.cpp
 create mode 100644 gcc/testsuite/gcc.target/riscv/sched1-spills/spill2.cpp

--
2.43.0

Comments

Vineet Gupta Oct. 28, 2024, 10:24 p.m. UTC | #1

Ping !


Jeff Law Oct. 28, 2024, 10:53 p.m. UTC | #2

On 10/28/24 4:24 PM, Vineet Gupta wrote:
> Ping !
Pong.  I've got a response to the first patch partially written :-) 
Exec summary is I don't have a problem with functionality in that patch, 
just naming/comments stuff.  Still trying to figure out how to express 
it clearly.

jeff