Message ID | d2de3c57-01ab-3e42-97d4-80ad552eaac8@linux.ibm.com |
---|---|
State | New |
Series | rs6000/test: Add emulated gather test case |
On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> This patch is to add a test case similar to the one in i386
> to add testing coverage for 510.parest_r hotspots.
>
> As evaluated, the emulated gather capability of the vectorizer
> (r12-2733) can help to speed up SPEC2017 510.parest_r on
> Power8/9/10 by 5% to 9% with the option sets Ofast unroll and
> Ofast lto. But since rs6000 lacked unpacking support for
> unsigned int before, it could not vectorize the hotspots
> until r12-3134.
>
> While checking why r12-2733 didn't immediately show its impact
> on SPEC2017 510.parest_r even though the associated test case
> could already be vectorized on rs6000 at that time, I realized
> that the associated test case uses int as INDEXTYPE while the
> hotspots actually use unsigned int. So, unlike the one
> in i386, this patch uses unsigned int as INDEXTYPE, since the
> unpack support for unsigned int (r12-3134) also matters for
> vectorizing the hotspots. Not sure if it's worth updating
> the one in i386 as well?

It looks like the same testcase added in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88531

>
> Tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/powerpc/vect-gather-1.c: New test.
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
> new file mode 100644
> index 00000000000..bf98045ab03
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* Profitable from Power8 since it supports efficient unaligned load. */
> +/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
> +
> +#ifndef INDEXTYPE
> +#define INDEXTYPE unsigned int
> +#endif
> +double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
> +            double *luval, double *dst)
> +{
> +  double res = 0;
> +  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
> +    res += *luval * dst[*col];
> +  return res;
> +}
> +
> +/* With gather emulation this should be profitable to vectorize from Power8. */
> +/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
> +/* The index vector loads and promotions should be scalar after forwprop. */
> +/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */
> --
> 2.25.1
>
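The INDEXTYPE remark above can be illustrated in plain C. Addressing `dst[*col]` on a 64-bit target means widening each 32-bit index to a 64-bit offset, and that widening is a sign extension for `int` but a zero extension for `unsigned int`, which is presumably why the two index types depend on different unpack support in the backend. The following is only a sketch of that distinction: the helper names `widen_signed`, `widen_unsigned` and `widenings_agree` are hypothetical and are not GCC internals.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (not GCC internals): widen a 32-bit index to the
   64-bit offset needed to address dst[idx] on a 64-bit target.  */

/* int index: sign extension, the top bit is replicated upwards.  */
int64_t widen_signed(int32_t idx)
{
  return (int64_t)idx;
}

/* unsigned int index: zero extension, the upper 32 bits are cleared.  */
uint64_t widen_unsigned(uint32_t idx)
{
  return (uint64_t)idx;
}

/* Return 1 when both widenings of the same 32 raw bits produce the same
   64-bit pattern.  That holds only while the top bit is clear.  */
int widenings_agree(uint32_t bits)
{
  int32_t as_signed;
  memcpy(&as_signed, &bits, sizeof bits);  /* reinterpret the same bits */
  return (uint64_t)widen_signed(as_signed) == widen_unsigned(bits);
}
```

For small indices the two widenings coincide, so a test written with `int` can pass while code using `unsigned int` still needs the unsigned widening to be available.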
on 2021/11/25 1:17 PM, Hongtao Liu wrote:
> On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> While checking why r12-2733 didn't immediately show its impact
>> on SPEC2017 510.parest_r even though the associated test case
>> could already be vectorized on rs6000 at that time, I realized
>> that the associated test case uses int as INDEXTYPE while the
>> hotspots actually use unsigned int. So, unlike the one
>> in i386, this patch uses unsigned int as INDEXTYPE, since the
>> unpack support for unsigned int (r12-3134) also matters for
>> vectorizing the hotspots. Not sure if it's worth updating
>> the one in i386 as well?
> It looks like the same testcase added in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88531

Thanks for the information! Good to know that there are already
some cases to cover. :)

BR,
Kewen
Hi!

On Thu, Nov 25, 2021 at 11:20:57AM +0800, Kewen.Lin wrote:
> This patch is to add a test case similar to the one in i386
> to add testing coverage for 510.parest_r hotspots.

> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vect-gather-1.c: New test.

This is okay for trunk. Thanks!

Segher
on 2021/11/27 12:24 AM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Nov 25, 2021 at 11:20:57AM +0800, Kewen.Lin wrote:
>> This patch is to add a test case similar to the one in i386
>> to add testing coverage for 510.parest_r hotspots.
>
>> gcc/testsuite/ChangeLog:
>> 	* gcc.target/powerpc/vect-gather-1.c: New test.
>
> This is okay for trunk. Thanks!
>

Thanks Segher! Committed as r12-5569.

BR,
Kewen
diff --git a/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
new file mode 100644
index 00000000000..bf98045ab03
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* Profitable from Power8 since it supports efficient unaligned load. */
+/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
+
+#ifndef INDEXTYPE
+#define INDEXTYPE unsigned int
+#endif
+double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
+            double *luval, double *dst)
+{
+  double res = 0;
+  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
+    res += *luval * dst[*col];
+  return res;
+}
+
+/* With gather emulation this should be profitable to vectorize from Power8. */
+/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
+/* The index vector loads and promotions should be scalar after forwprop. */
+/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */
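For readers unfamiliar with the term, the following hand-written C sketch shows roughly what "emulated gather" means for the loop in this test: the contiguous `luval` stream is processed in vector-sized groups, while the indexed `dst` accesses are done with one scalar load per lane. It is an illustration only, not the code GCC generates. The two-lane width (two doubles per 128-bit vector) and the function names `vmul_scalar` and `vmul_emulated_gather` are assumptions made for the example, and the vectorization factor GCC actually picks may differ.

```c
#include <stddef.h>

/* Scalar reference: the same loop as in the test case, with
   INDEXTYPE fixed to unsigned int.  */
double vmul_scalar(const unsigned int *rowstart, const unsigned int *rowend,
                   const double *luval, const double *dst)
{
  double res = 0;
  for (const unsigned int *col = rowstart; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}

/* Sketch of the emulated-gather shape, assuming two lanes per step.  */
double vmul_emulated_gather(const unsigned int *rowstart,
                            const unsigned int *rowend,
                            const double *luval, const double *dst)
{
  size_t n = (size_t)(rowend - rowstart);
  double acc[2] = { 0.0, 0.0 };  /* one partial sum per lane */
  size_t i = 0;

  for (; i + 2 <= n; i += 2)
    {
      /* Contiguous data: this part maps to a real vector load.  */
      double lu0 = luval[i], lu1 = luval[i + 1];
      /* Indices are loaded and zero-extended as scalars.  */
      size_t off0 = rowstart[i], off1 = rowstart[i + 1];
      /* The "gather": one scalar load per lane instead of a gather insn.  */
      double g0 = dst[off0], g1 = dst[off1];
      acc[0] += lu0 * g0;
      acc[1] += lu1 * g1;
    }

  /* Reduce the lanes, then handle a leftover element if n is odd.  */
  double res = acc[0] + acc[1];
  for (; i < n; ++i)
    res += luval[i] * dst[rowstart[i]];
  return res;
}
```

Keeping one partial sum per lane reassociates the floating-point additions, which is permitted under the `-Ofast` used by the test; with strict IEEE semantics the two functions could differ in the last bits for general inputs.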