Message ID | DM4PR11MB5487233633CD3F1E91C72A84ECC22@DM4PR11MB5487.namprd11.prod.outlook.com |
---|---|
Series | Support APX CFCMOV |
On Fri, Jun 14, 2024 at 5:12 AM Kong, Lingling <lingling.kong@intel.com> wrote:
>
> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
> memory operand may trap or fault for conditional move.
>
> In middle-end, now we don't support a conditional move if we knew that a load from A or B could trap or fault.

What's the cost of suppressing a fault? ISTR that for example fault suppression
for vector masked load/store is quite expensive, so when this is for example
done in a loop where there's always a fault that's suppressed you can see
1000-fold slowdown. I would suspect this is similar for cfcmov? So how is this
reflected in the decision to if-convert?

> To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
> in if-conversion pass to allow convert to cmov.
>
> All the changes passed bootstrap & regtest x86-64-pc-linux-gnu.
> We also tested spec with SDE and passed the runtime test.
>
> Ok for trunk?
>
> [1].https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html
>
> Lingling Kong (3):
>   [APX CFCMOV] Add a new target hook: TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
>   [APX CFCMOV] Support APX CFCMOV in if_convert pass
>   [APX CFCMOV] Support APX CFCMOV in backend
>
>  gcc/config/i386/i386-expand.cc               |  63 +++++
>  gcc/config/i386/i386-opts.h                  |   4 +-
>  gcc/config/i386/i386.cc                      |  33 ++-
>  gcc/config/i386/i386.h                       |   1 +
>  gcc/config/i386/i386.md                      |  53 +++-
>  gcc/config/i386/i386.opt                     |   3 +
>  gcc/config/i386/predicates.md                |   7 +
>  gcc/doc/tm.texi                              |   6 +
>  gcc/doc/tm.texi.in                           |   2 +
>  gcc/ifcvt.cc                                 | 247 ++++++++++++++++++-
>  gcc/target.def                               |  11 +
>  gcc/targhooks.cc                             |   8 +
>  gcc/targhooks.h                              |   1 +
>  gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c |  73 ++++++
>  gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c |  40 +++
>  15 files changed, 539 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c
>
> --
> 2.31.1
>

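For readers following the cover letter, a minimal C sketch of the source shapes at issue may help. The function names below are made up for illustration; they are not taken from the patch or from its tests (apx-cfcmov-1.c and apx-cfcmov-2.c are not shown in this thread). In both functions the memory access sits on only one arm of the branch, so performing it unconditionally could fault where the original program would not. That is the situation in which, per the cover letter, the middle-end does not form a conditional move today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the load from *p happens only when cond is true.
   Hoisting the load above the branch would be wrong if p may be invalid
   whenever cond is false.  */
int
load_if (int cond, const int *p, int fallback)
{
  if (cond)
    return *p;
  return fallback;
}

/* Hypothetical example: the store to *p happens only when cond is true.  */
void
store_if (int cond, int *p, int value)
{
  if (cond)
    *p = value;
}

/* Self-check: the false arm is exercised with a null pointer, which is
   safe only because the access is never performed on that path.  */
void
check_conditional_access (void)
{
  int x = 42;
  assert (load_if (1, &x, 7) == 42);
  assert (load_if (0, NULL, 7) == 7);
  store_if (0, NULL, 1);
  store_if (1, &x, 5);
  assert (x == 5);
}
```

Whether the series converts exactly these shapes is an assumption; the thread only states that the hook lets if-conversion accept memory operands that may trap.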
> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Friday, June 14, 2024 2:52 PM
> To: Kong, Lingling <lingling.kong@intel.com>
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>; Uros Bizjak <ubizjak@gmail.com>
> Subject: Re: [PATCH 0/3] [APX CFCMOV] Support APX CFCMOV
>
> On Fri, Jun 14, 2024 at 5:12 AM Kong, Lingling <lingling.kong@intel.com> wrote:
> >
> > APX CFCMOV[1] feature implements conditionally faulting which means
> > that all memory faults are suppressed when the condition code
> > evaluates to false and load or store a memory operand. Now we could load
> > or store a memory operand may trap or fault for conditional move.
> >
> > In middle-end, now we don't support a conditional move if we knew that a
> > load from A or B could trap or fault.
>
> What's the cost of suppressing a fault? ISTR that for example fault suppression
> for vector masked load/store is quite expensive, so when this is for example

Yes, for AVX512 masked load/store, the cost is high when the memory is invalid.

> done in a loop where there's always a fault that's suppressed you can see
> 1000-fold slowdown. I would suspect this is similar for cfcmov? So how is this
> reflected in the decision to if-convert?

But for APXF, we were told that an access to invalid memory is as cheap as one to valid memory.
(Why else would these instructions be designed?)

> > To enable CFCMOV, we add a target HOOK
> > TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
> > in if-conversion pass to allow convert to cmov.
> >
> > All the changes passed bootstrap & regtest x86-64-pc-linux-gnu.
> > We also tested spec with SDE and passed the runtime test.
> >
> > Ok for trunk?

On Fri, Jun 14, 2024 at 8:58 AM Liu, Hongtao <hongtao.liu@intel.com> wrote:
>
> > -----Original Message-----
> > From: Richard Biener <richard.guenther@gmail.com>
> > Sent: Friday, June 14, 2024 2:52 PM
> > To: Kong, Lingling <lingling.kong@intel.com>
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>; Uros Bizjak <ubizjak@gmail.com>
> > Subject: Re: [PATCH 0/3] [APX CFCMOV] Support APX CFCMOV
> >
> > On Fri, Jun 14, 2024 at 5:12 AM Kong, Lingling <lingling.kong@intel.com> wrote:
> > >
> > > APX CFCMOV[1] feature implements conditionally faulting which means
> > > that all memory faults are suppressed when the condition code
> > > evaluates to false and load or store a memory operand. Now we could load
> > > or store a memory operand may trap or fault for conditional move.
> > >
> > > In middle-end, now we don't support a conditional move if we knew that a
> > > load from A or B could trap or fault.
> >
> > What's the cost of suppressing a fault? ISTR that for example fault suppression
> > for vector masked load/store is quite expensive, so when this is for example
> Yes, for AVX512 masked load/store, the cost is high when the memory is invalid.
> > done in a loop where there's always a fault that's suppressed you can see
> > 1000-fold slowdown. I would suspect this is similar for cfcmov? So how is this
> > reflected in the decision to if-convert?
> But for APXF, we were told that an access to invalid memory is as cheap as one to valid memory.
> (Why else would these instructions be designed?)

Well - I wondered about this for the AVX512 case, so this isn't a good reason
to expect it to be any better for APXF ;)

But if you have confirmation this is to be expected (I would expect the silicon
design with APX is finished at this point, even if actual hardware is still 2-3
years out), then fine - consider me positively surprised ;)

Richard.

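The cost question above concerns loops in which the guarded access is skipped on many or all iterations. A small C sketch of such a loop follows; `sum_valid` and its self-check are hypothetical names, and the thread does not say whether the patch would if-convert this particular loop. If each guarded load did become a fault-suppressing conditional load, every iteration with a null entry would execute a suppressed fault, which is the case where the per-fault cost would matter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loop: adds *ptrs[i] only for non-null entries.  The load
   is guarded by the null check, so in the branchy form an invalid
   pointer is never dereferenced.  */
long
sum_valid (const int *const *ptrs, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    if (ptrs[i] != NULL)
      sum += *ptrs[i];
  return sum;
}

/* Self-check covering a mixed array and an all-null array.  */
void
check_sum_valid (void)
{
  const int a = 3, b = 4;
  const int *mixed[] = { &a, NULL, &b, NULL };
  const int *none[] = { NULL, NULL };
  assert (sum_valid (mixed, 4) == 7);
  assert (sum_valid (none, 2) == 0);
}
```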
On Fri, 14 Jun 2024, Kong, Lingling wrote:

> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
> memory operand may trap or fault for conditional move.
>
> In middle-end, now we don't support a conditional move if we knew that a load
> from A or B could trap or fault.

Predicated loads&stores on Itanium don't trap either. They are modeled via
COND_EXEC on RTL. The late if-conversion pass (the instance that runs after
reload) is capable of introducing them.

> To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
> in if-conversion pass to allow convert to cmov.

Considering the above, is the new hook really necessary? Can you model the new
instructions via (cond_exec () (set ...)) instead of (set (if_then_else ...)) ?

Alexander

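The two RTL shapes contrasted above differ in what happens when the condition is false. The C functions below are only a rough model of that difference, with invented names; they are not GCC code and say nothing about how the x86 backend would actually describe CFCMOV. In the `if_then_else` form the destination is always written with one of two values, while in the `cond_exec` form the whole set, including the memory read, is skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Model of (set dst (if_then_else cond a b)): dst is always written,
   choosing between two values that are both already available.  */
int
model_if_then_else (int cond, int a, int b)
{
  return cond ? a : b;
}

/* Model of (cond_exec cond (set dst (mem p))): when cond is false the
   set does not execute at all, so *dst keeps its old value and p is
   never dereferenced.  */
void
model_cond_exec_load (int cond, int *dst, const int *p)
{
  if (cond)
    *dst = *p;
}

/* Self-check for both models.  */
void
check_models (void)
{
  int dst = 9, v = 5;
  assert (model_if_then_else (1, 3, 4) == 3);
  assert (model_if_then_else (0, 3, 4) == 4);
  model_cond_exec_load (0, &dst, NULL);
  assert (dst == 9);
  model_cond_exec_load (1, &dst, &v);
  assert (dst == 5);
}
```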
On 6/14/24 11:10 AM, Alexander Monakov wrote:
>
> On Fri, 14 Jun 2024, Kong, Lingling wrote:
>
>> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
>> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
>> memory operand may trap or fault for conditional move.
>>
>> In middle-end, now we don't support a conditional move if we knew that a load
>> from A or B could trap or fault.
>
> Predicated loads&stores on Itanium don't trap either. They are modeled via
> COND_EXEC on RTL. The late if-conversion pass (the instance that runs after
> reload) is capable of introducing them.
>
>> To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
>> in if-conversion pass to allow convert to cmov.
>
> Considering the above, is the new hook really necessary? Can you model the new
> instructions via (cond_exec () (set ...)) instead of (set (if_then_else ...)) ?

Note that turning on cond_exec will turn off some of the cmove support.

But the general suggestion of trying to avoid a hook for this is a good
one. In fact, my first reaction to this thread was "do we really need a
hook for this".

jeff

On Sat, Jun 15, 2024 at 1:22 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 6/14/24 11:10 AM, Alexander Monakov wrote:
> >
> > On Fri, 14 Jun 2024, Kong, Lingling wrote:
> >
> >> APX CFCMOV[1] feature implements conditionally faulting which means that all memory faults are suppressed
> >> when the condition code evaluates to false and load or store a memory operand. Now we could load or store a
> >> memory operand may trap or fault for conditional move.
> >>
> >> In middle-end, now we don't support a conditional move if we knew that a load
> >> from A or B could trap or fault.
> >
> > Predicated loads&stores on Itanium don't trap either. They are modeled via
> > COND_EXEC on RTL. The late if-conversion pass (the instance that runs after
> > reload) is capable of introducing them.
> >
> >> To enable CFCMOV, we add a target HOOK TARGET_HAVE_CONDITIONAL_MOVE_MEM_NOTRAP
> >> in if-conversion pass to allow convert to cmov.
> >
> > Considering the above, is the new hook really necessary? Can you model the new
> > instructions via (cond_exec () (set ...)) instead of (set (if_then_else ...)) ?
> Note that turning on cond_exec will turn off some of the cmove support.

Yes, cfcmov looks more like a cmov than a cond_exec.

>
> But the general suggestion of trying to avoid a hook for this is a good
> one. In fact, my first reaction to this thread was "do we really need a
> hook for this".

Maybe a new optab, i.e. cfmovmodecc, which differs from movcc in being conditionally faulting?

>
> jeff