Message ID | D4C76825A6780047854A11E93CDE84D02F7757@SAUSEXMBP01.amd.com |
---|---|
State | New |
On Tue, Jun 29, 2010 at 2:01 AM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:
> Hi,
>
> Attached is the patch that partially fixes bug 44576: testsuite/gfortran.dg/zero_sized_1.f90 with huge compile
> time on prefetching + peeling.
>
> This patch avoids useless computation of the miss rate: if delta (the address difference) is greater than or
> equal to the cache line size, the two references will never hit the same cache line, so every access misses.
>
> This patch reduces the compile time of the test case from 5m30'' to 1m20'' on an amd-linux64 system.
> Note that without -fprefetch-loop-arrays, the compile time on the same system is 30'', and I am still
> working on reducing the complexity of reuse analysis and miss rate computation.
>
> The patch passed bootstrapping and regression tests.
>
> Is this patch OK to commit?

Ok.

Thanks,
Richard.

> Thanks,
>
> Changpeng
Hi,
> Is this patch OK to commit?
yes,
Zdenek
On Tue, Jun 29, 2010 at 04:33, Richard Guenther <richard.guenther@gmail.com> wrote:
> On Tue, Jun 29, 2010 at 2:01 AM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:
>> Hi,
>>
>> Attached is the patch that partially fixes bug 44576: testsuite/gfortran.dg/zero_sized_1.f90 with huge compile
>> time on prefetching + peeling.
>>
>> This patch avoids useless computation of the miss rate: if delta (the address difference) is greater than or
>> equal to the cache line size, the two references will never hit the same cache line, so every access misses.
>>
>> This patch reduces the compile time of the test case from 5m30'' to 1m20'' on an amd-linux64 system.
>> Note that without -fprefetch-loop-arrays, the compile time on the same system is 30'', and I am still
>> working on reducing the complexity of reuse analysis and miss rate computation.
>>
>> The patch passed bootstrapping and regression tests.
>>
>> Is this patch OK to commit?
>
> Ok.
>

Committed r161727
From b29f8edf2b1a068ab7271746e8c621446e342dc1 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@pathscale.(none)>
Date: Mon, 28 Jun 2010 10:23:36 -0700
Subject: [PATCH 4/4] pr 44576: miss rate computation improvement for prefetching loop arrays.

	* tree-ssa-loop-prefetch.c (compute_miss_rate): Return 1000 (out of 1000)
	for miss rate if the address difference is greater than or equal to the
	cache line size (the two references will never hit the same cache line).
---
 gcc/tree-ssa-loop-prefetch.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 548c3e4..27e2b42 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -654,6 +654,11 @@ compute_miss_rate (unsigned HOST_WIDE_INT cache_line_size,
   int total_positions, miss_positions, miss_rate;
   int address1, address2, cache_line1, cache_line2;
 
+  /* It always misses if delta is greater than or equal to the cache
+     line size.  */
+  if (delta >= cache_line_size)
+    return 1000;
+
   total_positions = 0;
   miss_positions = 0;
 
-- 
1.6.3.3
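
To illustrate why the early return is safe, here is a minimal standalone sketch. It is not GCC's code: the function names, the use of plain `unsigned long`, and the restriction to non-negative step and delta are all assumptions made for this example. The first function counts misses by enumerating every alignment and iteration; the second adds the shortcut from the patch. For non-negative addresses the two should agree whenever delta is at least the cache line size, since the second address then always lands at least one line further on.

```c
#include <assert.h>

/* Hypothetical brute-force model: for each alignment of the first
   reference within its cache line, and each iteration, check whether
   the two references fall in different cache lines.  The result is
   the miss rate expressed in parts per thousand.  */
int
miss_rate_brute_force (unsigned long line_size, unsigned long step,
                       unsigned long delta, unsigned long distinct_iters,
                       unsigned long align_unit)
{
  unsigned long total = 0, misses = 0;

  /* Guard against an empty loop and a division by zero below.  */
  assert (line_size > 0 && distinct_iters > 0 && align_unit > 0);

  for (unsigned long align = 0; align < line_size; align += align_unit)
    for (unsigned long iter = 0; iter < distinct_iters; iter++)
      {
        unsigned long address1 = align + step * iter;
        unsigned long address2 = address1 + delta;

        total++;
        if (address1 / line_size != address2 / line_size)
          misses++;
      }

  return (int) (1000 * misses / total);
}

/* Same model with the shortcut from the patch: when delta is at least
   one cache line, every position is a miss, so skip the enumeration.  */
int
miss_rate_with_early_exit (unsigned long line_size, unsigned long step,
                           unsigned long delta, unsigned long distinct_iters,
                           unsigned long align_unit)
{
  if (delta >= line_size)
    return 1000;

  return miss_rate_brute_force (line_size, step, delta, distinct_iters,
                                align_unit);
}
```

In this sketch the shortcut replaces a loop of roughly `line_size / align_unit * distinct_iters` iterations with a single comparison, which is where the compile-time saving would come from when many reference pairs have a large delta.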