Message ID | ZtdGrRJ2I6T27li+@arm.com |
---|---|
State | New |
Headers | show |
Series | [1/4] middle-end: have vect_recog_cond_store_pattern use pattern statement for cond if available | expand |
Tamar Christina <tamar.christina@arm.com> writes: > Hi All, > > This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 > so we simplify vector comparisons in AArch64. > > With this enabled > > res: > movi v0.4s, 0 > cmeq v0.4s, v0.4s, v0.4s > ret > > is simplified to: > > res: > mvni v0.4s, 0 > ret > > NOTE: I don't really like the testcase as it depends on an > uninitialised value to hide the constant from GIMPLE. > > Happy to go with something else if there are any suggestions. > I thought about an RTL testcase, but those seem painful. Like you say, I think an RTL testcase would be better. Could you use the attached (for gcc.dg/rtl/aarch64)? > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64.h (VECTOR_STORE_FLAG_VALUE): New. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/vector-cmp-rtl-elim.c: New test. > > --- > > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h > index 2dfb999bea53414498a2355bb30db938f6b94100..b99f69103ab7e1d44e5e41ee89fb9a74450c57ca 100644 > --- a/gcc/config/aarch64/aarch64.h > +++ b/gcc/config/aarch64/aarch64.h > @@ -156,6 +156,8 @@ > > #define PCC_BITFIELD_TYPE_MATTERS 1 > > +#define VECTOR_STORE_FLAG_VALUE(MODE) CONST1_RTX (GET_MODE_INNER (MODE)) > + I think it'd be useful to capture the reasons we discussed internally for preferring this choice. /* Use the same RTL truth representation for vector elements as we do for scalars. This maintains the property that a comparison like eq:V4SI is a composition of 4 individual eq:SIs, just like plus:V4SI is a composition of 4 individual plus:SIs. This means that Advanced SIMD comparisons are represented in RTL as (neg (op ...)). */ OK with those changes, thanks. Richard > #ifndef USED_FOR_TARGET > > /* Define an enum of all features (ISA modes, architectures and extensions). > diff --git a/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c > new file mode 100644 > index 0000000000000000000000000000000000000000..d67baa216d8332a26bdc64350402b77d87379f28 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c > @@ -0,0 +1,18 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3" } */ > +/* { dg-final { check-function-bodies "**" "" "" } } */ > + > +#include <arm_neon.h> > + > +/* > +** res: > +** mvni v0.4s, 0 > +** ret > +*/ > +uint32x4_t res () > +{ > + uint32x4_t a; > + uint32x4_t b = {0, 0, 0, 0}; > + return vceqq_u32 (a, b); > +} > +
> -----Original Message----- > From: Richard Sandiford <richard.sandiford@arm.com> > Sent: Monday, September 9, 2024 9:29 AM > To: Tamar Christina <Tamar.Christina@arm.com> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Earnshaw > <Richard.Earnshaw@arm.com>; Marcus Shawcroft > <Marcus.Shawcroft@arm.com>; ktkachov@gcc.gnu.org > Subject: Re: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE. > > Tamar Christina <tamar.christina@arm.com> writes: > > Hi All, > > > > This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 > > so we simplify vector comparisons in AArch64. > > > > With this enabled > > > > res: > > movi v0.4s, 0 > > cmeq v0.4s, v0.4s, v0.4s > > ret > > > > is simplified to: > > > > res: > > mvni v0.4s, 0 > > ret > > > > NOTE: I don't really like the testcase as it depends on an > > uninitialised value to hide the constant from GIMPLE. > > > > Happy to go with something else if there are any suggestions. > > I thought about an RTL testcase, but those seem painful. > > Like you say, I think an RTL testcase would be better. Could you use > the attached (for gcc.dg/rtl/aarch64)? > Thanks, do you have any tips for writing these? If there a way to dump a skeleton like with the gimple tests? Thanks, Tamar > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > > > Ok for master? > > > > Thanks, > > Tamar > > > > gcc/ChangeLog: > > > > * config/aarch64/aarch64.h (VECTOR_STORE_FLAG_VALUE): New. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/aarch64/vector-cmp-rtl-elim.c: New test. > > > > --- > > > > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h > > index > 2dfb999bea53414498a2355bb30db938f6b94100..b99f69103ab7e1d44e5e41e > e89fb9a74450c57ca 100644 > > --- a/gcc/config/aarch64/aarch64.h > > +++ b/gcc/config/aarch64/aarch64.h > > @@ -156,6 +156,8 @@ > > > > #define PCC_BITFIELD_TYPE_MATTERS 1 > > > > +#define VECTOR_STORE_FLAG_VALUE(MODE) CONST1_RTX > (GET_MODE_INNER (MODE)) > > + > > I think it'd be useful to capture the reasons we discussed internally > for preferring this choice. > > /* Use the same RTL truth representation for vector elements as we do > for scalars. This maintains the property that a comparison like > eq:V4SI is a composition of 4 individual eq:SIs, just like plus:V4SI > is a composition of 4 individual plus:SIs. > > This means that Advanced SIMD comparisons are represented in RTL as > (neg (op ...)). */ > > OK with those changes, thanks. > > Richard > > > #ifndef USED_FOR_TARGET > > > > /* Define an enum of all features (ISA modes, architectures and extensions). > > diff --git a/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c > b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c > > new file mode 100644 > > index > 0000000000000000000000000000000000000000..d67baa216d8332a26bdc6 > 4350402b77d87379f28 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c > > @@ -0,0 +1,18 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O3" } */ > > +/* { dg-final { check-function-bodies "**" "" "" } } */ > > + > > +#include <arm_neon.h> > > + > > +/* > > +** res: > > +** mvni v0.4s, 0 > > +** ret > > +*/ > > +uint32x4_t res () > > +{ > > + uint32x4_t a; > > + uint32x4_t b = {0, 0, 0, 0}; > > + return vceqq_u32 (a, b); > > +} > > +
> On 9 Sep 2024, at 11:06, Tamar Christina <tamar.christina@arm.com> wrote: > > External email: Use caution opening links or attachments > > >> -----Original Message----- >> From: Richard Sandiford <richard.sandiford@arm.com> >> Sent: Monday, September 9, 2024 9:29 AM >> To: Tamar Christina <Tamar.Christina@arm.com> >> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Earnshaw >> <Richard.Earnshaw@arm.com>; Marcus Shawcroft >> <Marcus.Shawcroft@arm.com>; ktkachov@gcc.gnu.org >> Subject: Re: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE. >> >> Tamar Christina <tamar.christina@arm.com> writes: >>> Hi All, >>> >>> This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 >>> so we simplify vector comparisons in AArch64. >>> >>> With this enabled >>> >>> res: >>> movi v0.4s, 0 >>> cmeq v0.4s, v0.4s, v0.4s >>> ret >>> >>> is simplified to: >>> >>> res: >>> mvni v0.4s, 0 >>> ret >>> >>> NOTE: I don't really like the testcase as it depends on an >>> uninitialised value to hide the constant from GIMPLE. >>> >>> Happy to go with something else if there are any suggestions. >>> I thought about an RTL testcase, but those seem painful. >> >> Like you say, I think an RTL testcase would be better. Could you use >> the attached (for gcc.dg/rtl/aarch64)? >> > > Thanks, do you have any tips for writing these? If there a way to dump a > skeleton like with the gimple tests? As a tangent, I wonder if the RTL dump logic can be extended to have a dump-for-rtl-testcase mode, under the reasoning that creating RTL test cases for ICE fixes is a common action. It could even be used in the EMERGENCY DUMP case when dumping during an ICE. Thanks, Kyrill > > Thanks, > Tamar > >>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. >>> >>> Ok for master? >>> >>> Thanks, >>> Tamar >>> >>> gcc/ChangeLog: >>> >>> * config/aarch64/aarch64.h (VECTOR_STORE_FLAG_VALUE): New. >>> >>> gcc/testsuite/ChangeLog: >>> >>> * gcc.target/aarch64/vector-cmp-rtl-elim.c: New test. >>> >>> --- >>> >>> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h >>> index >> 2dfb999bea53414498a2355bb30db938f6b94100..b99f69103ab7e1d44e5e41e >> e89fb9a74450c57ca 100644 >>> --- a/gcc/config/aarch64/aarch64.h >>> +++ b/gcc/config/aarch64/aarch64.h >>> @@ -156,6 +156,8 @@ >>> >>> #define PCC_BITFIELD_TYPE_MATTERS 1 >>> >>> +#define VECTOR_STORE_FLAG_VALUE(MODE) CONST1_RTX >> (GET_MODE_INNER (MODE)) >>> + >> >> I think it'd be useful to capture the reasons we discussed internally >> for preferring this choice. >> >> /* Use the same RTL truth representation for vector elements as we do >> for scalars. This maintains the property that a comparison like >> eq:V4SI is a composition of 4 individual eq:SIs, just like plus:V4SI >> is a composition of 4 individual plus:SIs. >> >> This means that Advanced SIMD comparisons are represented in RTL as >> (neg (op ...)). */ >> >> OK with those changes, thanks. >> >> Richard >> >>> #ifndef USED_FOR_TARGET >>> >>> /* Define an enum of all features (ISA modes, architectures and extensions). >>> diff --git a/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c >> b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c >>> new file mode 100644 >>> index >> 0000000000000000000000000000000000000000..d67baa216d8332a26bdc6 >> 4350402b77d87379f28 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c >>> @@ -0,0 +1,18 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O3" } */ >>> +/* { dg-final { check-function-bodies "**" "" "" } } */ >>> + >>> +#include <arm_neon.h> >>> + >>> +/* >>> +** res: >>> +** mvni v0.4s, 0 >>> +** ret >>> +*/ >>> +uint32x4_t res () >>> +{ >>> + uint32x4_t a; >>> + uint32x4_t b = {0, 0, 0, 0}; >>> + return vceqq_u32 (a, b); >>> +} >>> + >
Kyrylo Tkachov <ktkachov@nvidia.com> writes: >> On 9 Sep 2024, at 11:06, Tamar Christina <tamar.christina@arm.com> wrote: >> >> External email: Use caution opening links or attachments >> >> >>> -----Original Message----- >>> From: Richard Sandiford <richard.sandiford@arm.com> >>> Sent: Monday, September 9, 2024 9:29 AM >>> To: Tamar Christina <Tamar.Christina@arm.com> >>> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Earnshaw >>> <Richard.Earnshaw@arm.com>; Marcus Shawcroft >>> <Marcus.Shawcroft@arm.com>; ktkachov@gcc.gnu.org >>> Subject: Re: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE. >>> >>> Tamar Christina <tamar.christina@arm.com> writes: >>>> Hi All, >>>> >>>> This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 >>>> so we simplify vector comparisons in AArch64. >>>> >>>> With this enabled >>>> >>>> res: >>>> movi v0.4s, 0 >>>> cmeq v0.4s, v0.4s, v0.4s >>>> ret >>>> >>>> is simplified to: >>>> >>>> res: >>>> mvni v0.4s, 0 >>>> ret >>>> >>>> NOTE: I don't really like the testcase as it depends on an >>>> uninitialised value to hide the constant from GIMPLE. >>>> >>>> Happy to go with something else if there are any suggestions. >>>> I thought about an RTL testcase, but those seem painful. >>> >>> Like you say, I think an RTL testcase would be better. Could you use >>> the attached (for gcc.dg/rtl/aarch64)? >>> >> >> Thanks, do you have any tips for writing these? If there a way to dump a >> skeleton like with the gimple tests? TBH I just take one I wrote earlier that looks relatively close, then adapt it. They're not something I could easily write from scratch. > As a tangent, I wonder if the RTL dump logic can be extended to have a dump-for-rtl-testcase mode, under the reasoning that creating RTL test cases for ICE fixes is a common action. It could even be used in the EMERGENCY DUMP case when dumping during an ICE. Yeah, perhaps. But I think there is an advantage in writing the test by hand, to make sure that everything in the test is necessary and relatively future-proof. Often, the reason for using an RTL testcase is that it isn't easy to create a minimal RTL reproducer from gimple. Most EMERGENCY DUMPs from non-RTL testcases are therefore likely to contain stuff that isn't directly relevant. Thanks, Richard
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 2dfb999bea53414498a2355bb30db938f6b94100..b99f69103ab7e1d44e5e41ee89fb9a74450c57ca 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -156,6 +156,8 @@ #define PCC_BITFIELD_TYPE_MATTERS 1 +#define VECTOR_STORE_FLAG_VALUE(MODE) CONST1_RTX (GET_MODE_INNER (MODE)) + #ifndef USED_FOR_TARGET /* Define an enum of all features (ISA modes, architectures and extensions). diff --git a/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c new file mode 100644 index 0000000000000000000000000000000000000000..d67baa216d8332a26bdc64350402b77d87379f28 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vector-cmp-rtl-elim.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O3" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include <arm_neon.h> + +/* +** res: +** mvni v0.4s, 0 +** ret +*/ +uint32x4_t res () +{ + uint32x4_t a; + uint32x4_t b = {0, 0, 0, 0}; + return vceqq_u32 (a, b); +} +