[0/3,APX,CFCMOV] Support APX CFCMOV

Message ID DM4PR11MB5487233633CD3F1E91C72A84ECC22@DM4PR11MB5487.namprd11.prod.outlook.com

Message

Kong, Lingling June 14, 2024, 3:11 a.m. UTC
The APX CFCMOV[1] feature implements conditional faulting: when the condition code evaluates to false,
all memory faults are suppressed for a load or store of a memory operand. A conditional move can therefore
now load or store a memory operand that may trap or fault.

In the middle-end, we currently do not support a conditional move if we know that a load from A or B could trap or fault.
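
As an illustration, here is a minimal sketch of the kind of source this restriction applies to. It is a hypothetical example, not taken from the patch or its testsuite, and the function name is made up. An ordinary cmov with a memory source performs the load regardless of the condition, so the branch cannot simply be replaced by one when the pointer may be invalid.

```c
#include <stddef.h>

/* Hypothetical example: the load from *p is only reached when p is
   non-null, so the branch is what protects against a faulting access.
   Converting this to a conditional move requires that the load cannot
   fault when the condition is false. */
int select_guarded(const int *p, int fallback)
{
    int x = fallback;
    if (p != NULL)
        x = *p;   /* may trap if executed unconditionally */
    return x;
}
```

With fault suppression on the false condition, a load like this could in principle become a single conditional move from memory.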

To enable CFCMOV, we add a target hook, TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP,
to the if-conversion pass to allow the conversion to cmov.

All the changes passed bootstrap & regtest on x86-64-pc-linux-gnu.
We also tested SPEC with SDE, and it passed the runtime test.

Ok for trunk?

[1] https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html

Lingling Kong (3):
  [APX CFCMOV] Add a new target hook: TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
  [APX CFCMOV] Support APX CFCMOV in if_convert pass
  [APX CFCMOV] Support APX CFCMOV in backend

 gcc/config/i386/i386-expand.cc               |  63 +++++
 gcc/config/i386/i386-opts.h                  |   4 +-
 gcc/config/i386/i386.cc                      |  33 ++-
 gcc/config/i386/i386.h                       |   1 +
 gcc/config/i386/i386.md                      |  53 +++-
 gcc/config/i386/i386.opt                     |   3 +
 gcc/config/i386/predicates.md                |   7 +
 gcc/doc/tm.texi                              |   6 +
 gcc/doc/tm.texi.in                           |   2 +
 gcc/ifcvt.cc                                 | 247 ++++++++++++++++++-
 gcc/target.def                               |  11 +
 gcc/targhooks.cc                             |   8 +
 gcc/targhooks.h                              |   1 +
 gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c |  73 ++++++
 gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c |  40 +++
 15 files changed, 539 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c

Comments

Richard Biener June 14, 2024, 6:52 a.m. UTC | #1
On Fri, Jun 14, 2024 at 5:12 AM Kong, Lingling <lingling.kong@intel.com> wrote:
>
> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
> memory operand may trap or fault for conditional move.
>
> In middle-end, now we don't support a conditional move if we knew that a load from A or B could trap or fault.

What's the cost of suppressing a fault?  ISTR that for example fault
suppression for vector masked load/store is quite expensive, so when this
is for example done in a loop where there's always a fault that's
suppressed you can see 1000-fold slowdown.  I would suspect this is
similar for cfcmov?  So how is this reflected in the decision to
if-convert?
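
A hypothetical loop of the kind this question is about, given as an illustrative sketch only (the function name and data layout are assumptions, not part of the patch): if the guard is false on every iteration and the load is if-converted, each iteration would issue a memory access whose fault has to be suppressed.

```c
#include <stddef.h>

/* Hypothetical example: sums the values behind the non-null pointers.
   When every entry is NULL, the branch version never touches memory
   through p, while an if-converted version would rely on fault
   suppression for the load on each iteration. */
long sum_guarded(const long *const *ptrs, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        const long *p = ptrs[i];
        if (p != NULL)
            sum += *p;   /* guarded load, candidate for if-conversion */
    }
    return sum;
}
```

Whether such suppressed accesses are expensive on APX hardware is the open point; the code only shows the shape of the loop.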

Liu, Hongtao June 14, 2024, 6:58 a.m. UTC | #2
> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Friday, June 14, 2024 2:52 PM
> To: Kong, Lingling <lingling.kong@intel.com>
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>; Uros
> Bizjak <ubizjak@gmail.com>
> Subject: Re: [PATCH 0/3] [APX CFCMOV] Support APX CFCMOV
> 
> On Fri, Jun 14, 2024 at 5:12 AM Kong, Lingling <lingling.kong@intel.com>
> wrote:
> >
> > APX CFCMOV[1] feature implements conditionally faulting which means
> > that all memory faults are suppressed when the condition code
> > evaluates to false and load or store a memory operand. Now we could load
> > or store a memory operand may trap or fault for conditional move.
> >
> > In middle-end, now we don't support a conditional move if we knew that a
> > load from A or B could trap or fault.
> 
> What's the cost of suppressing a fault?  ISTR that for example fault suppression
> for vector masked load/store is quite expensive, so when this is for example
Yes, for AVX512 masked load/store the cost is high when the memory is invalid.
> done in a loop where there's always a fault that's suppressed you can see
> 1000-fold slowdown.  I would suspect this is similar for cfcmov?  So how is this
> reflected in the decision to if-convert?
But for APXF, we were told that accessing invalid memory is as cheap as accessing valid memory.
(Why else would these instructions be designed?)
Richard Biener June 14, 2024, 8:10 a.m. UTC | #3
On Fri, Jun 14, 2024 at 8:58 AM Liu, Hongtao <hongtao.liu@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Richard Biener <richard.guenther@gmail.com>
> > Sent: Friday, June 14, 2024 2:52 PM
> > To: Kong, Lingling <lingling.kong@intel.com>
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>; Uros
> > Bizjak <ubizjak@gmail.com>
> > Subject: Re: [PATCH 0/3] [APX CFCMOV] Support APX CFCMOV
> >
> > On Fri, Jun 14, 2024 at 5:12 AM Kong, Lingling <lingling.kong@intel.com>
> > wrote:
> > >
> > > APX CFCMOV[1] feature implements conditionally faulting which means
> > > that all memory faults are suppressed when the condition code
> > > evaluates to false and load or store a memory operand. Now we could load
> > > or store a memory operand may trap or fault for conditional move.
> > >
> > > In middle-end, now we don't support a conditional move if we knew that a
> > > load from A or B could trap or fault.
> >
> > What's the cost of suppressing a fault?  ISTR that for example fault suppression
> > for vector masked load/store is quite expensive, so when this is for example
> Yes, avx512 masked load/store, the cost is expensive when memory is invalid.
> > done in a loop where there's always a fault that's suppressed you can see
> > 1000-fold slowdown.  I would suspect this is similar for cfcmov?  So how is this
> > reflected in the decision to if-convert?
> But for APXF, we were told the cost of invalid memory is as cheap as valid ones.
> (Why else would this instructions be designed?)

Well - I wondered about this for the AVX512 case, so this isn't a good
reason to expect it to be any better for APXF ;)  But if you have
confirmation this is to be expected (I would expect the silicon design
with APX is finished at this point, even if actual hardware is still 2-3
years out), then fine - consider me positively surprised ;)

Richard.

Alexander Monakov June 14, 2024, 5:10 p.m. UTC | #4
On Fri, 14 Jun 2024, Kong, Lingling wrote:

> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
> memory operand may trap or fault for conditional move.
> 
> In middle-end, now we don't support a conditional move if we knew that a load
> from A or B could trap or fault.

Predicated loads&stores on Itanium don't trap either. They are modeled via
COND_EXEC on RTL. The late if-conversion pass (the instance that runs after
reload) is capable of introducing them.

> To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
> in if-conversion pass to allow convert to cmov.

Considering the above, is the new hook really necessary? Can you model the new
instructions via (cond_exec () (set ...)) instead of (set (if_then_else ...)) ?

Alexander
Jeff Law June 14, 2024, 5:22 p.m. UTC | #5
On 6/14/24 11:10 AM, Alexander Monakov wrote:
> 
> On Fri, 14 Jun 2024, Kong, Lingling wrote:
> 
>> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
>> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
>> memory operand may trap or fault for conditional move.
>>
>> In middle-end, now we don't support a conditional move if we knew that a load
>> from A or B could trap or fault.
> 
> Predicated loads&stores on Itanium don't trap either. They are modeled via
> COND_EXEC on RTL. The late if-conversion pass (the instance that runs after
> reload) is capable of introducing them.
> 
>> To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
>> in if-conversion pass to allow convert to cmov.
> 
> Considering the above, is the new hook really necessary? Can you model the new
> instructions via (cond_exec () (set ...)) instead of (set (if_then_else ...)) ?
Note that turning on cond_exec will turn off some of the cmove support.

But the general suggestion of trying to avoid a hook for this is a good
one.  In fact, my first reaction to this thread was "do we really need a
hook for this".

jeff
Hongtao Liu June 17, 2024, 3:04 a.m. UTC | #6
On Sat, Jun 15, 2024 at 1:22 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 6/14/24 11:10 AM, Alexander Monakov wrote:
> >
> > On Fri, 14 Jun 2024, Kong, Lingling wrote:
> >
> >> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
> >> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
> >> memory operand may trap or fault for conditional move.
> >>
> >> In middle-end, now we don't support a conditional move if we knew that a load
> >> from A or B could trap or fault.
> >
> > Predicated loads&stores on Itanium don't trap either. They are modeled via
> > COND_EXEC on RTL. The late if-conversion pass (the instance that runs after
> > reload) is capable of introducing them.
> >
> >> To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
> >> in if-conversion pass to allow convert to cmov.
> >
> > Considering the above, is the new hook really necessary? Can you model the new
> > instructions via (cond_exec () (set ...)) instead of (set (if_then_else ...)) ?
> Note that turning on cond_exec will turn off some of the cmove support.
Yes, cfcmov looks more like a cmov than a cond_exec.
>
> But the general suggesting of trying to avoid a hook for this is a good
> one.  In fact, my first reaction to this thread was "do we really need a
> hook for this".
Maybe a new optab, i.e. cfmovmodecc, which differs from movcc in that
it is conditionally faulting?
>
> jeff