[0/5] Address std::find regression with RTL unrolling [PR116140]

Message ID ZrXgjETQj04/pDEf@arm.com

Alex Coplan Aug. 9, 2024, 9:25 a.m. UTC
This patch series aims to address PR116140, a regression in xalancbmk
(in both SPEC CPU 2006 and SPEC CPU 2017) that appeared when the
hand-unrolling in std::__find_if was removed from libstdc++.

Keeping the loop re-rolled in the source is desirable as it allows the
function to be vectorized with WIP vectorizer enhancements (peeling read
data references (DRs) for alignment in early break loops).

In theory this should have just been a single patch adding:
  #pragma GCC unroll 4
to the std::__find_if loop in libstdc++.
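
For illustration, here is a minimal sketch of what a re-rolled find loop
annotated with the pragma might look like.  find_if_sketch is a
hypothetical stand-in written for this example; it is not the actual
std::__find_if code from libstdc++, and the loop shape is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the libstdc++ helper (an assumption for
// illustration only, not the real std::__find_if).
template <typename Iter, typename Pred>
Iter find_if_sketch(Iter first, Iter last, Pred pred)
{
  // Request that GCC unroll the following loop by a factor of 4.
  // The pragma must immediately precede the loop it applies to.
#pragma GCC unroll 4
  while (first != last && !pred(*first))
    ++first;
  return first;
}
```

Calling find_if_sketch(v.begin(), v.end(), pred) returns an iterator to
the first element satisfying pred, or v.end() if there is none.  The
pragma only affects code generation, not the result.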

However, there were a couple of snags (see the PR for details).  The
series is structured as follows:
 - 1/5 fixes a bug in the C++ frontend causing the #pragma to get
   dropped under certain conditions.
 - 2/5 and 3/5 are preparatory testsuite patches for 4/5, which adds
   an LTO test that needs to scan an ltrans RTL dumpfile.
 - 4/5 fixes a bug where the has_unroll flag on functions isn't
   streamed during LTO.
 - 5/5 then finally adds the #pragma to std::__find_if.

The following table shows the performance effect of the patch series on
xalancbmk (from both SPEC CPU 2017 and SPEC CPU 2006), measured on
Neoverse V1 with LTO.

+---------------------------+---------------+---------------+
|      Benchmark Suite      | SPEC CPU 2017 | SPEC CPU 2006 |
+---------------------------+---------------+---------------+
| Regression in PR          | 5.83%         | 11.12%        |
| Regression after patches  | 1.68%         | 3.16%         |
| % of regression recovered | 71.24%        | 71.11%        |
+---------------------------+---------------+---------------+

Bootstrapped/regtested as a series on aarch64-linux-gnu, no regressions.

Alex Coplan (5):
  cp: Ensure ANNOTATE_EXPRs remain outermost expressions in conditions [PR116140]
  testsuite: Add scan-ltrans-rtl for use in dg-final [PR116140]
  testsuite: Ensure ltrans dump files get cleaned up properly [PR116140]
  lto: Set has_unroll flag when streaming in for LTO [PR116140]
  libstdc++: Restore unrolling in std::find using pragma [PR116140]

 gcc/cp/semantics.cc                           |  26 ++--
 gcc/doc/sourcebuild.texi                      |   4 +-
 gcc/lto-streamer-in.cc                        |   2 +
 .../g++.dg/ext/pragma-unroll-lambda-lto.C     |  32 +++++
 .../g++.dg/ext/pragma-unroll-lambda.C         |  17 +++
 gcc/testsuite/lib/gcc-dg.exp                  |   4 +-
 gcc/testsuite/lib/scanltranstree.exp          | 123 ++++++++++++++++++
 libstdc++-v3/include/bits/stl_algobase.h      |   1 +
 8 files changed, 196 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
 create mode 100644 gcc/testsuite/g++.dg/ext/pragma-unroll-lambda.C