Message ID | 20191230131547.GB99729@kam.mff.cuni.cz |
---|---|
State | New |
Series | [wwwdocs] Add GCC10 IPA/LTO changes |
> Hi,
> here are some of changes of LTO/IPA done in GCC10. There is also
> recursive cloning and some other stuff I will add incrementally as well
> as some data on overall compile time/memory use improvements as we
> reported in past years. I am still running tests and fixing bugs in this
> area.
>
> Honza

Ping...

Honza

>
> diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
> index aca76825..0f0fce18 100644
> --- a/htdocs/gcc-10/changes.html
> +++ b/htdocs/gcc-10/changes.html
> @@ -50,12 +50,46 @@ a work-in-progress.</p>
>  <!-- .................................................................. -->
>  <h2 id="general">General Improvements</h2>
>
> +<p>The following GCC command line options have been introduced or improved.</p>
> +<ul>
> +  <li><a href="https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-fprofile-partial-training"><code>-fprofile-partial-training</code></a>
> +  can now be used to inform compiler that code paths not covered by the
> +  train run should not be optimized for size.</li>
> +</ul>
>  <p>The following built-in functions have been introduced.</p>
>  <ul>
>    <li><code>__builtin_roundeven</code> for the corresponding function from
>    ISO/IEC TS 18661.
>    </li>
>  </ul>
> +<p>A large number of improvements to code generation have been made, including
> +  but not limited to the following.</p>
> +<ul>
> +  <li>Inter-procedural optimization improvements:
> +    <ul>
> +      <li>Inter-procedural scalar replacement for aggregates (IPA-SRA) pass was re-implemented to work at link-time.
> +      </li>
> +      <li><a href="https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-finline-functions"><code>-finline-functions</code></a>
> +      is now enabled at <code>-O2</code> and was retuned for better code size
> +      versus runtime performance tradeofs. Inliner heuristics was also
> +      significantly sped up to avoid negativive impact to <code>-flto
> +      -O2</code> compile times.
> +      </li>
> +      <li>Inliner heuristics and function clonning can now use value-range
> +      information to predict effectivity of individual transformations.</li>
> +      <li>Selected <code>--param</code> values can now be specified at
> +      translation unit granuality. This includes all parameters controlling
> +      inliner.</li>
> +      <li>During link-time optimization the C++ One Definition Rule is used to
> +      increase precision of type based alias analysis.</li>
> +    </ul>
> +  </li>
> +  <li>Profile driven optimization improvements:
> +    <ul>
> +      <li>Profile maintenance during compilation was improved and hot/cold code partitioning improved.</li>
> +    </ul>
> +  </li>
> +</ul>
>
>  <!-- .................................................................. -->
>  <h2 id="languages">New Languages and Language specific improvements</h2>
On Mon, 30 Dec 2019, Jan Hubicka wrote:
> here are some of changes of LTO/IPA done in GCC10.

Quite a bit! :-)

> +<p>The following GCC command line options have been introduced or improved.</p>

...command-line...

> +  <li><a href="https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-fprofile-partial-training"><code>-fprofile-partial-training</code></a>

I suggest using the mainline version of the docs, unless you believe
there are going to be significant changes (removals) in the future?

  https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

> +  can now be used to inform compiler that code paths not covered by the
> +  train run should not be optimized for size.</li>

...the compiler... (Is it "train run" or "training run"?)

> +<p>A large number of improvements to code generation have been made, including
> +  but not limited to the following.</p>

Can you format for a smaller width here? Otherwise patches run a
little wide.

> +      <li>Inter-procedural scalar replacement for aggregates (IPA-SRA) pass was re-implemented to work at link-time.

The inter-procedural...

> +      <li><a href="https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-finline-functions"><code>-finline-functions</code></a>
> +      is now enabled at <code>-O2</code> and was retuned for better code size
> +      versus runtime performance tradeofs. Inliner heuristics was also

...trade-offs... (dash plus double f)

> +      <li>Selected <code>--param</code> values can now be specified at
> +      translation unit granuality. This includes all parameters controlling
> +      inliner.</li>

...the inliner....

> +      <li>Profile maintenance during compilation was improved and hot/cold

How about "Profile maintenance during compilation and hot/cold code
partitioning have been improved"?

Okay with those changes.

Thank you,
Gerald
diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index aca76825..0f0fce18 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -50,12 +50,46 @@ a work-in-progress.</p>
 <!-- .................................................................. -->
 <h2 id="general">General Improvements</h2>
 
+<p>The following GCC command line options have been introduced or improved.</p>
+<ul>
+  <li><a href="https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-fprofile-partial-training"><code>-fprofile-partial-training</code></a>
+  can now be used to inform compiler that code paths not covered by the
+  train run should not be optimized for size.</li>
+</ul>
 <p>The following built-in functions have been introduced.</p>
 <ul>
   <li><code>__builtin_roundeven</code> for the corresponding function from
   ISO/IEC TS 18661.
   </li>
 </ul>
+<p>A large number of improvements to code generation have been made, including
+  but not limited to the following.</p>
+<ul>
+  <li>Inter-procedural optimization improvements:
+    <ul>
+      <li>Inter-procedural scalar replacement for aggregates (IPA-SRA) pass was re-implemented to work at link-time.
+      </li>
+      <li><a href="https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-finline-functions"><code>-finline-functions</code></a>
+      is now enabled at <code>-O2</code> and was retuned for better code size
+      versus runtime performance tradeofs. Inliner heuristics was also
+      significantly sped up to avoid negativive impact to <code>-flto
+      -O2</code> compile times.
+      </li>
+      <li>Inliner heuristics and function clonning can now use value-range
+      information to predict effectivity of individual transformations.</li>
+      <li>Selected <code>--param</code> values can now be specified at
+      translation unit granuality. This includes all parameters controlling
+      inliner.</li>
+      <li>During link-time optimization the C++ One Definition Rule is used to
+      increase precision of type based alias analysis.</li>
+    </ul>
+  </li>
+  <li>Profile driven optimization improvements:
+    <ul>
+      <li>Profile maintenance during compilation was improved and hot/cold code partitioning improved.</li>
+    </ul>
+  </li>
+</ul>
 
 <!-- .................................................................. -->
 <h2 id="languages">New Languages and Language specific improvements</h2>