Message ID | D4C76825A6780047854A11E93CDE84D005980DC71A@SAUSEXMBP01.amd.com |
---|---|
State | New |
On Tue, Jun 28, 2011 at 12:33 AM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:
> Hi,
>
> Attached are the patches we propose to backport to gcc 4.6 branch which are related to avx256 unaligned load/store splitting.
> As we mentioned before, the combined effect of these patches is positive on both AMD and Intel CPUs on cpu2006 and
> polyhedron 2005.
>
> 0001-Split-32-byte-AVX-unaligned-load-store.patch
> Initial patch that implements unaligned load/store splitting
>
> 0001-Don-t-assert-unaligned-256bit-load-store.patch
> Remove the assert.
>
> 0001-Fix-a-typo-in-mavx256-split-unaligned-store.patch
> Fix a typo.
>
> 0002-pr49089-enable-avx256-splitting-unaligned-load-store.patch
> Disable unaligned load splitting for bdver1.
>
> All these patches are in 4.7 trunk.
>
> Bootstrap and tests are on-going in gcc 4.6 branch.
>
> Is it OK to commit to 4.6 branch as long as the tests pass?

Yes, if they have been approved and checked in for trunk.

Thanks,
Richard.

> Thanks,
>
> Changpeng
>
> ________________________________________
> From: Jagasia, Harsha
> Sent: Monday, June 20, 2011 12:03 PM
> To: 'H.J. Lu'
> Cc: 'gcc-patches@gcc.gnu.org'; 'hubicka@ucw.cz'; 'ubizjak@gmail.com'; 'hongjiu.lu@intel.com'; Fang, Changpeng
> Subject: RE: Backport AVX256 load/store split patches to gcc 4.6 for performance boost on latest AMD/Intel hardware.
>
>> On Mon, Jun 20, 2011 at 9:58 AM, <harsha.jagasia@amd.com> wrote:
>> > Is it ok to backport patches, with Changelogs below, already in trunk to gcc
>> > 4.6? These patches are for AVX-256bit load store splitting. These patches
>> > make a significant performance difference (>=3%) to several CPU2006 and
>> > Polyhedron benchmarks on latest AMD and Intel hardware. If ok, I will post
>> > backported patches for commit approval.
>> >
>> > AMD plans to submit additional patches on AVX-256 load/store splitting to
>> > trunk. We will send additional backport requests for those later once they
>> > are accepted/committed to trunk.
>> >
>>
>> Since we will make some changes on trunk, I would prefer to do
>> the backport after trunk change is finished.
>
> Ok, thanks. Adding Changpeng who is working on the trunk changes.
>
> Harsha
>
From 50310fc367348b406fc88d54c3ab54d1a304ad52 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@huainan.(none)>
Date: Mon, 13 Jun 2011 13:13:32 -0700
Subject: [PATCH 2/2] pr49089: enable avx256 splitting unaligned load/store only when beneficial

    * config/i386/i386.c (avx256_split_unaligned_load): New definition.
    (avx256_split_unaligned_store): New definition.
    (ix86_option_override_internal): Enable avx256 unaligned load(store)
    splitting only when avx256_split_unaligned_load(store) is set.
---
 gcc/config/i386/i386.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7b266b9..3bc0b53 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2121,6 +2121,12 @@ static const unsigned int x86_arch_always_fancy_math_387
   = m_PENT | m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_PENT4
     | m_NOCONA | m_CORE2I7 | m_GENERIC;
 
+static const unsigned int x86_avx256_split_unaligned_load
+  = m_COREI7 | m_GENERIC;
+
+static const unsigned int x86_avx256_split_unaligned_store
+  = m_COREI7 | m_BDVER1 | m_GENERIC;
+
 /* In case the average insn count for single function invocation is
    lower than this constant, emit fast (but longer) prologue and
    epilogue code.  */
@@ -4194,9 +4200,11 @@ ix86_option_override_internal (bool main_args_p)
           if (flag_expensive_optimizations
               && !(target_flags_explicit & MASK_VZEROUPPER))
             target_flags |= MASK_VZEROUPPER;
-          if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
+          if ((x86_avx256_split_unaligned_load & ix86_tune_mask)
+              && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
             target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
-          if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
+          if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
+              && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
             target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
         }
     }
--
1.7.0.4
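
For readers who want to see the gating logic of this patch in isolation, here is a minimal standalone sketch. It is not code from i386.c: the `TUNE_*` and `FLAG_*` constants and the function `apply_split_defaults` are hypothetical names with made-up bit values, chosen only to mirror the shape of the two new tune masks and the two rewritten `if` conditions.

```c
#include <assert.h>

/* Hypothetical one-bit-per-CPU tuning masks. The values are illustrative
   and are NOT GCC's actual m_* definitions. */
#define TUNE_COREI7  (1u << 0)
#define TUNE_BDVER1  (1u << 1)
#define TUNE_GENERIC (1u << 2)
#define TUNE_OTHER   (1u << 3)

/* Hypothetical stand-ins for MASK_AVX256_SPLIT_UNALIGNED_LOAD/STORE. */
#define FLAG_SPLIT_LOAD  (1u << 0)
#define FLAG_SPLIT_STORE (1u << 1)

/* Tunings for which each kind of splitting is assumed beneficial,
   mirroring the two masks the patch adds. */
static const unsigned split_load_tunes  = TUNE_COREI7 | TUNE_GENERIC;
static const unsigned split_store_tunes = TUNE_COREI7 | TUNE_BDVER1 | TUNE_GENERIC;

/* Return FLAGS with the split defaults applied for the selected tuning.
   A bit set in EXPLICIT_FLAGS means the user chose that option on the
   command line, so its default is left untouched. */
unsigned
apply_split_defaults (unsigned tune_mask, unsigned flags,
                      unsigned explicit_flags)
{
  assert (tune_mask != 0);  /* exactly one tuning is expected to be active */

  if ((split_load_tunes & tune_mask)
      && !(explicit_flags & FLAG_SPLIT_LOAD))
    flags |= FLAG_SPLIT_LOAD;

  if ((split_store_tunes & tune_mask)
      && !(explicit_flags & FLAG_SPLIT_STORE))
    flags |= FLAG_SPLIT_STORE;

  return flags;
}
```

Under these assumptions, a bdver1 tuning ends up with only store splitting enabled by default, which matches the "Disable unaligned load splitting for bdver1" summary in the thread, while an explicitly chosen option is never overridden.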