Message ID | mptiku1wks5.fsf@arm.com |
---|---|
State | New |
Headers | show |
Series | aarch64: Fix folding of degenerate svwhilele case [PR117045] | expand |
Hi Richard, > -----Original Message----- > From: Richard Sandiford <richard.sandiford@arm.com> > Sent: Wednesday, October 9, 2024 12:58 PM > To: gcc-patches@gcc.gnu.org > Cc: ktkachov@nvidia.com; Richard Earnshaw <Richard.Earnshaw@arm.com>; > Tamar Christina <Tamar.Christina@arm.com> > Subject: [PATCH] aarch64: Fix folding of degenerate svwhilele case [PR117045] > > The svwhilele folder mishandled the degenerate case in which > the second argument is the maximum integer. In that case, > the result is all-true regardless of the first parameter: > > If the second scalar operand is equal to the maximum signed integer > value then a condition which includes an equality test can never fail > and the result will be an all-true predicate. > > This is because the conceptual "increment the first operand > by 1 after each element" is done modulo the range of the operand. > The GCC code was instead treating it as infinite precision. > whilele_5.c even had a test for the incorrect behaviour. > > The easiest fix seemed to be to handle that case specially before > doing constant folding. This also copes with variable first operands. > > Tested on aarch64-linux-gnu. I'll push on Friday if there are no > comments before then. Since it's a wrong-code bug, I'd also like > to backport to release branches. > > Thanks, > Richard > > > gcc/ > PR target/116999 > PR target/117045 > * config/aarch64/aarch64-sve-builtins-base.cc > (svwhilelx_impl::fold): Check for WHILELTs of the minimum value > and WHILELEs of the maximum value. Fold them to all-false and > all-true respectively. > > gcc/testsuite/ > PR target/116999 > PR target/117045 > * gcc.target/aarch64/sve/acle/general/whilele_5.c: Fix bogus > expected result. > * gcc.target/aarch64/sve/acle/general/whilele_11.c: New test. > * gcc.target/aarch64/sve/acle/general/whilele_12.c: Likewise. > --- > .../aarch64/aarch64-sve-builtins-base.cc | 11 +++++- > .../aarch64/sve/acle/general/whilele_11.c | 31 +++++++++++++++++ > .../aarch64/sve/acle/general/whilele_12.c | 34 +++++++++++++++++++ > .../aarch64/sve/acle/general/whilele_5.c | 2 +- > 4 files changed, 76 insertions(+), 2 deletions(-) > create mode 100644 > gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > create mode 100644 > gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc > b/gcc/config/aarch64/aarch64-sve-builtins-base.cc > index 4b33585d981..3d0975e4294 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc > +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc > @@ -2945,7 +2945,9 @@ public: > : while_comparison (unspec_for_sint, unspec_for_uint), m_eq_p (eq_p) > {} > > - /* Try to fold a call by treating its arguments as constants of type T. */ > + /* Try to fold a call by treating its arguments as constants of type T. > + We have already filtered out the degenerate cases of X .LT. MIN > + and X .LE. MAX. */ > template<typename T> > gimple * > fold_type (gimple_folder &f) const > @@ -3001,6 +3003,13 @@ public: > if (f.vectors_per_tuple () > 1) > return nullptr; > > + /* Filter out cases where the condition is always true or always false. */ > + tree arg1 = gimple_call_arg (f.call, 1); > + if (!m_eq_p && operand_equal_p (arg1, TYPE_MIN_VALUE (TREE_TYPE > (arg1)))) > + return f.fold_to_pfalse (); Just a quick question for my own understanding, I assume the reason MIN is handled here is because fold_type will decrement the value at some point? Otherwise wouldn't MIN + 1 still fit inside the type's precision? FWIW patch looks good to me, just wondering why the MIN case is needed :) Cheers, Tamar > + if (m_eq_p && operand_equal_p (arg1, TYPE_MAX_VALUE (TREE_TYPE > (arg1)))) > + return f.fold_to_ptrue (); > + > if (f.type_suffix (1).unsigned_p) > return fold_type<poly_uint64> (f); > else > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > new file mode 100644 > index 00000000000..2be9dc5c534 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > @@ -0,0 +1,31 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +#include <arm_sve.h> > +#include <limits.h> > + > +svbool_t > +f1 (volatile int32_t *ptr) > +{ > + return svwhilelt_b8_s32 (*ptr, INT32_MIN); > +} > + > +svbool_t > +f2 (volatile uint32_t *ptr) > +{ > + return svwhilelt_b16_u32 (*ptr, 0); > +} > + > +svbool_t > +f3 (volatile int64_t *ptr) > +{ > + return svwhilelt_b32_s64 (*ptr, INT64_MIN); > +} > + > +svbool_t > +f4 (volatile uint64_t *ptr) > +{ > + return svwhilelt_b64_u64 (*ptr, 0); > +} > + > +/* { dg-final { scan-assembler-times {\tpfalse\tp[0-9]+\.b\n} 4 } } */ > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > new file mode 100644 > index 00000000000..713065c3145 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > @@ -0,0 +1,34 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +#include <arm_sve.h> > +#include <limits.h> > + > +svbool_t > +f1 (volatile int32_t *ptr) > +{ > + return svwhilele_b8_s32 (*ptr, INT32_MAX); > +} > + > +svbool_t > +f2 (volatile uint32_t *ptr) > +{ > + return svwhilele_b16_u32 (*ptr, UINT32_MAX); > +} > + > +svbool_t > +f3 (volatile int64_t *ptr) > +{ > + return svwhilele_b32_s64 (*ptr, INT64_MAX); > +} > + > +svbool_t > +f4 (volatile uint64_t *ptr) > +{ > + return svwhilele_b64_u64 (*ptr, UINT64_MAX); > +} > + > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.b(?:, all)\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.s(?:, all)\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.d(?:, all)\n} } } */ > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > index 7c73aa5926b..79f0e64b2ae 100644 > --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > @@ -28,7 +28,7 @@ test3 (svbool_t *ptr) > *ptr = svwhilele_b16_s32 (0x7ffffffb, 0x7fffffff); > } > > -/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h, vl5\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ > > void > test4 (svbool_t *ptr) > -- > 2.25.1
Tamar Christina <Tamar.Christina@arm.com> writes: > Hi Richard, > >> -----Original Message----- >> From: Richard Sandiford <richard.sandiford@arm.com> >> Sent: Wednesday, October 9, 2024 12:58 PM >> To: gcc-patches@gcc.gnu.org >> Cc: ktkachov@nvidia.com; Richard Earnshaw <Richard.Earnshaw@arm.com>; >> Tamar Christina <Tamar.Christina@arm.com> >> Subject: [PATCH] aarch64: Fix folding of degenerate svwhilele case [PR117045] >> >> The svwhilele folder mishandled the degenerate case in which >> the second argument is the maximum integer. In that case, >> the result is all-true regardless of the first parameter: >> >> If the second scalar operand is equal to the maximum signed integer >> value then a condition which includes an equality test can never fail >> and the result will be an all-true predicate. >> >> This is because the conceptual "increment the first operand >> by 1 after each element" is done modulo the range of the operand. >> The GCC code was instead treating it as infinite precision. >> whilele_5.c even had a test for the incorrect behaviour. >> >> The easiest fix seemed to be to handle that case specially before >> doing constant folding. This also copes with variable first operands. >> >> Tested on aarch64-linux-gnu. I'll push on Friday if there are no >> comments before then. Since it's a wrong-code bug, I'd also like >> to backport to release branches. >> >> Thanks, >> Richard >> >> >> gcc/ >> PR target/116999 >> PR target/117045 >> * config/aarch64/aarch64-sve-builtins-base.cc >> (svwhilelx_impl::fold): Check for WHILELTs of the minimum value >> and WHILELEs of the maximum value. Fold them to all-false and >> all-true respectively. >> >> gcc/testsuite/ >> PR target/116999 >> PR target/117045 >> * gcc.target/aarch64/sve/acle/general/whilele_5.c: Fix bogus >> expected result. >> * gcc.target/aarch64/sve/acle/general/whilele_11.c: New test. >> * gcc.target/aarch64/sve/acle/general/whilele_12.c: Likewise. >> --- >> .../aarch64/aarch64-sve-builtins-base.cc | 11 +++++- >> .../aarch64/sve/acle/general/whilele_11.c | 31 +++++++++++++++++ >> .../aarch64/sve/acle/general/whilele_12.c | 34 +++++++++++++++++++ >> .../aarch64/sve/acle/general/whilele_5.c | 2 +- >> 4 files changed, 76 insertions(+), 2 deletions(-) >> create mode 100644 >> gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >> create mode 100644 >> gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >> >> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc >> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc >> index 4b33585d981..3d0975e4294 100644 >> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc >> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc >> @@ -2945,7 +2945,9 @@ public: >> : while_comparison (unspec_for_sint, unspec_for_uint), m_eq_p (eq_p) >> {} >> >> - /* Try to fold a call by treating its arguments as constants of type T. */ >> + /* Try to fold a call by treating its arguments as constants of type T. >> + We have already filtered out the degenerate cases of X .LT. MIN >> + and X .LE. MAX. */ >> template<typename T> >> gimple * >> fold_type (gimple_folder &f) const >> @@ -3001,6 +3003,13 @@ public: >> if (f.vectors_per_tuple () > 1) >> return nullptr; >> >> + /* Filter out cases where the condition is always true or always false. */ >> + tree arg1 = gimple_call_arg (f.call, 1); >> + if (!m_eq_p && operand_equal_p (arg1, TYPE_MIN_VALUE (TREE_TYPE >> (arg1)))) >> + return f.fold_to_pfalse (); > > Just a quick question for my own understanding, I assume the reason MIN > is handled here is because fold_type will decrement the value at some point? > > Otherwise wouldn't MIN + 1 still fit inside the type's precision? > > FWIW patch looks good to me, just wondering why the MIN case is needed :) I admit it probably isn't needed to fix the bug. I just though it would look strange if we handled the arg1 extremity for m_eq_p without also handling it for !m_eq_p. Thanks, Richard > > Cheers, > Tamar > >> + if (m_eq_p && operand_equal_p (arg1, TYPE_MAX_VALUE (TREE_TYPE >> (arg1)))) >> + return f.fold_to_ptrue (); >> + >> if (f.type_suffix (1).unsigned_p) >> return fold_type<poly_uint64> (f); >> else >> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >> new file mode 100644 >> index 00000000000..2be9dc5c534 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >> @@ -0,0 +1,31 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2" } */ >> + >> +#include <arm_sve.h> >> +#include <limits.h> >> + >> +svbool_t >> +f1 (volatile int32_t *ptr) >> +{ >> + return svwhilelt_b8_s32 (*ptr, INT32_MIN); >> +} >> + >> +svbool_t >> +f2 (volatile uint32_t *ptr) >> +{ >> + return svwhilelt_b16_u32 (*ptr, 0); >> +} >> + >> +svbool_t >> +f3 (volatile int64_t *ptr) >> +{ >> + return svwhilelt_b32_s64 (*ptr, INT64_MIN); >> +} >> + >> +svbool_t >> +f4 (volatile uint64_t *ptr) >> +{ >> + return svwhilelt_b64_u64 (*ptr, 0); >> +} >> + >> +/* { dg-final { scan-assembler-times {\tpfalse\tp[0-9]+\.b\n} 4 } } */ >> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >> new file mode 100644 >> index 00000000000..713065c3145 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >> @@ -0,0 +1,34 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2" } */ >> + >> +#include <arm_sve.h> >> +#include <limits.h> >> + >> +svbool_t >> +f1 (volatile int32_t *ptr) >> +{ >> + return svwhilele_b8_s32 (*ptr, INT32_MAX); >> +} >> + >> +svbool_t >> +f2 (volatile uint32_t *ptr) >> +{ >> + return svwhilele_b16_u32 (*ptr, UINT32_MAX); >> +} >> + >> +svbool_t >> +f3 (volatile int64_t *ptr) >> +{ >> + return svwhilele_b32_s64 (*ptr, INT64_MAX); >> +} >> + >> +svbool_t >> +f4 (volatile uint64_t *ptr) >> +{ >> + return svwhilele_b64_u64 (*ptr, UINT64_MAX); >> +} >> + >> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.b(?:, all)\n} } } */ >> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ >> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.s(?:, all)\n} } } */ >> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.d(?:, all)\n} } } */ >> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >> index 7c73aa5926b..79f0e64b2ae 100644 >> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >> @@ -28,7 +28,7 @@ test3 (svbool_t *ptr) >> *ptr = svwhilele_b16_s32 (0x7ffffffb, 0x7fffffff); >> } >> >> -/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h, vl5\n} } } */ >> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ >> >> void >> test4 (svbool_t *ptr) >> -- >> 2.25.1
> On 9 Oct 2024, at 17:35, Richard Sandiford <richard.sandiford@arm.com> wrote: > > External email: Use caution opening links or attachments > > > Tamar Christina <Tamar.Christina@arm.com> writes: >> Hi Richard, >> >>> -----Original Message----- >>> From: Richard Sandiford <richard.sandiford@arm.com> >>> Sent: Wednesday, October 9, 2024 12:58 PM >>> To: gcc-patches@gcc.gnu.org >>> Cc: ktkachov@nvidia.com; Richard Earnshaw <Richard.Earnshaw@arm.com>; >>> Tamar Christina <Tamar.Christina@arm.com> >>> Subject: [PATCH] aarch64: Fix folding of degenerate svwhilele case [PR117045] >>> >>> The svwhilele folder mishandled the degenerate case in which >>> the second argument is the maximum integer. In that case, >>> the result is all-true regardless of the first parameter: >>> >>> If the second scalar operand is equal to the maximum signed integer >>> value then a condition which includes an equality test can never fail >>> and the result will be an all-true predicate. >>> >>> This is because the conceptual "increment the first operand >>> by 1 after each element" is done modulo the range of the operand. >>> The GCC code was instead treating it as infinite precision. >>> whilele_5.c even had a test for the incorrect behaviour. >>> >>> The easiest fix seemed to be to handle that case specially before >>> doing constant folding. This also copes with variable first operands. >>> >>> Tested on aarch64-linux-gnu. I'll push on Friday if there are no >>> comments before then. Since it's a wrong-code bug, I'd also like >>> to backport to release branches. >>> >>> Thanks, >>> Richard >>> >>> >>> gcc/ >>> PR target/116999 >>> PR target/117045 >>> * config/aarch64/aarch64-sve-builtins-base.cc >>> (svwhilelx_impl::fold): Check for WHILELTs of the minimum value >>> and WHILELEs of the maximum value. Fold them to all-false and >>> all-true respectively. >>> >>> gcc/testsuite/ >>> PR target/116999 >>> PR target/117045 >>> * gcc.target/aarch64/sve/acle/general/whilele_5.c: Fix bogus >>> expected result. >>> * gcc.target/aarch64/sve/acle/general/whilele_11.c: New test. >>> * gcc.target/aarch64/sve/acle/general/whilele_12.c: Likewise. >>> --- >>> .../aarch64/aarch64-sve-builtins-base.cc | 11 +++++- >>> .../aarch64/sve/acle/general/whilele_11.c | 31 +++++++++++++++++ >>> .../aarch64/sve/acle/general/whilele_12.c | 34 +++++++++++++++++++ >>> .../aarch64/sve/acle/general/whilele_5.c | 2 +- >>> 4 files changed, 76 insertions(+), 2 deletions(-) >>> create mode 100644 >>> gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >>> create mode 100644 >>> gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >>> >>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc >>> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc >>> index 4b33585d981..3d0975e4294 100644 >>> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc >>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc >>> @@ -2945,7 +2945,9 @@ public: >>> : while_comparison (unspec_for_sint, unspec_for_uint), m_eq_p (eq_p) >>> {} >>> >>> - /* Try to fold a call by treating its arguments as constants of type T. */ >>> + /* Try to fold a call by treating its arguments as constants of type T. >>> + We have already filtered out the degenerate cases of X .LT. MIN >>> + and X .LE. MAX. */ >>> template<typename T> >>> gimple * >>> fold_type (gimple_folder &f) const >>> @@ -3001,6 +3003,13 @@ public: >>> if (f.vectors_per_tuple () > 1) >>> return nullptr; >>> >>> + /* Filter out cases where the condition is always true or always false. */ >>> + tree arg1 = gimple_call_arg (f.call, 1); >>> + if (!m_eq_p && operand_equal_p (arg1, TYPE_MIN_VALUE (TREE_TYPE >>> (arg1)))) >>> + return f.fold_to_pfalse (); >> >> Just a quick question for my own understanding, I assume the reason MIN >> is handled here is because fold_type will decrement the value at some point? >> >> Otherwise wouldn't MIN + 1 still fit inside the type's precision? >> >> FWIW patch looks good to me, just wondering why the MIN case is needed :) > > I admit it probably isn't needed to fix the bug. I just though it would > look strange if we handled the arg1 extremity for m_eq_p without also > handling it for !m_eq_p. The patch LGTM. Thanks, Kyrill > > Thanks, > Richard > >> >> Cheers, >> Tamar >> >>> + if (m_eq_p && operand_equal_p (arg1, TYPE_MAX_VALUE (TREE_TYPE >>> (arg1)))) >>> + return f.fold_to_ptrue (); >>> + >>> if (f.type_suffix (1).unsigned_p) >>> return fold_type<poly_uint64> (f); >>> else >>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >>> new file mode 100644 >>> index 00000000000..2be9dc5c534 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c >>> @@ -0,0 +1,31 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2" } */ >>> + >>> +#include <arm_sve.h> >>> +#include <limits.h> >>> + >>> +svbool_t >>> +f1 (volatile int32_t *ptr) >>> +{ >>> + return svwhilelt_b8_s32 (*ptr, INT32_MIN); >>> +} >>> + >>> +svbool_t >>> +f2 (volatile uint32_t *ptr) >>> +{ >>> + return svwhilelt_b16_u32 (*ptr, 0); >>> +} >>> + >>> +svbool_t >>> +f3 (volatile int64_t *ptr) >>> +{ >>> + return svwhilelt_b32_s64 (*ptr, INT64_MIN); >>> +} >>> + >>> +svbool_t >>> +f4 (volatile uint64_t *ptr) >>> +{ >>> + return svwhilelt_b64_u64 (*ptr, 0); >>> +} >>> + >>> +/* { dg-final { scan-assembler-times {\tpfalse\tp[0-9]+\.b\n} 4 } } */ >>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >>> new file mode 100644 >>> index 00000000000..713065c3145 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c >>> @@ -0,0 +1,34 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2" } */ >>> + >>> +#include <arm_sve.h> >>> +#include <limits.h> >>> + >>> +svbool_t >>> +f1 (volatile int32_t *ptr) >>> +{ >>> + return svwhilele_b8_s32 (*ptr, INT32_MAX); >>> +} >>> + >>> +svbool_t >>> +f2 (volatile uint32_t *ptr) >>> +{ >>> + return svwhilele_b16_u32 (*ptr, UINT32_MAX); >>> +} >>> + >>> +svbool_t >>> +f3 (volatile int64_t *ptr) >>> +{ >>> + return svwhilele_b32_s64 (*ptr, INT64_MAX); >>> +} >>> + >>> +svbool_t >>> +f4 (volatile uint64_t *ptr) >>> +{ >>> + return svwhilele_b64_u64 (*ptr, UINT64_MAX); >>> +} >>> + >>> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.b(?:, all)\n} } } */ >>> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ >>> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.s(?:, all)\n} } } */ >>> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.d(?:, all)\n} } } */ >>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >>> index 7c73aa5926b..79f0e64b2ae 100644 >>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c >>> @@ -28,7 +28,7 @@ test3 (svbool_t *ptr) >>> *ptr = svwhilele_b16_s32 (0x7ffffffb, 0x7fffffff); >>> } >>> >>> -/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h, vl5\n} } } */ >>> +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ >>> >>> void >>> test4 (svbool_t *ptr) >>> -- >>> 2.25.1
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 4b33585d981..3d0975e4294 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -2945,7 +2945,9 @@ public: : while_comparison (unspec_for_sint, unspec_for_uint), m_eq_p (eq_p) {} - /* Try to fold a call by treating its arguments as constants of type T. */ + /* Try to fold a call by treating its arguments as constants of type T. + We have already filtered out the degenerate cases of X .LT. MIN + and X .LE. MAX. */ template<typename T> gimple * fold_type (gimple_folder &f) const @@ -3001,6 +3003,13 @@ public: if (f.vectors_per_tuple () > 1) return nullptr; + /* Filter out cases where the condition is always true or always false. */ + tree arg1 = gimple_call_arg (f.call, 1); + if (!m_eq_p && operand_equal_p (arg1, TYPE_MIN_VALUE (TREE_TYPE (arg1)))) + return f.fold_to_pfalse (); + if (m_eq_p && operand_equal_p (arg1, TYPE_MAX_VALUE (TREE_TYPE (arg1)))) + return f.fold_to_ptrue (); + if (f.type_suffix (1).unsigned_p) return fold_type<poly_uint64> (f); else diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c new file mode 100644 index 00000000000..2be9dc5c534 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> +#include <limits.h> + +svbool_t +f1 (volatile int32_t *ptr) +{ + return svwhilelt_b8_s32 (*ptr, INT32_MIN); +} + +svbool_t +f2 (volatile uint32_t *ptr) +{ + return svwhilelt_b16_u32 (*ptr, 0); +} + +svbool_t +f3 (volatile int64_t *ptr) +{ + return svwhilelt_b32_s64 (*ptr, INT64_MIN); +} + +svbool_t +f4 (volatile uint64_t *ptr) +{ + return svwhilelt_b64_u64 (*ptr, 0); +} + +/* { dg-final { scan-assembler-times {\tpfalse\tp[0-9]+\.b\n} 4 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c new file mode 100644 index 00000000000..713065c3145 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c @@ -0,0 +1,34 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include <arm_sve.h> +#include <limits.h> + +svbool_t +f1 (volatile int32_t *ptr) +{ + return svwhilele_b8_s32 (*ptr, INT32_MAX); +} + +svbool_t +f2 (volatile uint32_t *ptr) +{ + return svwhilele_b16_u32 (*ptr, UINT32_MAX); +} + +svbool_t +f3 (volatile int64_t *ptr) +{ + return svwhilele_b32_s64 (*ptr, INT64_MAX); +} + +svbool_t +f4 (volatile uint64_t *ptr) +{ + return svwhilele_b64_u64 (*ptr, UINT64_MAX); +} + +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.b(?:, all)\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.s(?:, all)\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.d(?:, all)\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c index 7c73aa5926b..79f0e64b2ae 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c @@ -28,7 +28,7 @@ test3 (svbool_t *ptr) *ptr = svwhilele_b16_s32 (0x7ffffffb, 0x7fffffff); } -/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h, vl5\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ void test4 (svbool_t *ptr)