Message ID | 20230831082024.314097-7-hongyu.wang@intel.com |
---|---|
State | New |
Headers | show |
Series | Support Intel APX EGPR | expand |
On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > From: Kong Lingling <lingling.kong@intel.com> > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > usage by default from mapping the common reg/mem constraint to non-EGPR > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > for inline asm. > > gcc/ChangeLog: > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > ix86_md_asm_adjust. > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > target option, map reg/mem constraints to non-EGPR constraints. > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > --- > gcc/config/i386/i386.cc | 44 +++++++ > gcc/config/i386/i386.opt | 5 + > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > 3 files changed, 156 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > index d26d9ab0d9d..9460ebbfda4 100644 > --- a/gcc/config/i386/i386.cc > +++ b/gcc/config/i386/i386.cc > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > along with GCC; see the file COPYING3. If not see > <http://www.gnu.org/licenses/>. */ > > +#define INCLUDE_STRING > #define IN_TARGET_CODE 1 > > #include "config.h" > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > bool saw_asm_flag = false; > > start_sequence (); > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > + constraints, will eventually map all the usable constraints in the future. */ I think there should be some constraint which explicitly has all the 32 GPRs, like there is one for just all 16 GPRs (h), so that regardless of -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. Also, what about the "g" constraint? Shouldn't there be another for "g" without r16..r31? What about the various other memory constraints ("<", "o", ...)? > + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) > + { > + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 > + and replace only r, exclude Br and Yr. */ > + for (unsigned i = 0; i < constraints.length (); i++) > + { > + std::string *s = new std::string (constraints[i]); Doesn't this leak memory (all the time)? I must say I don't really understand why you need to use std::string here, but certainly it shouldn't leak. > + size_t pos = s->find ('r'); > + while (pos != std::string::npos) > + { > + if (pos > 0 > + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) > + pos = s->find ('r', pos + 1); > + else > + { > + s->replace (pos, 1, "h"); > + constraints[i] = (const char*) s->c_str (); Formatting (space before *). The usual way for constraints is ggc_strdup on some string in a buffer. Also, one could have several copies or r (or m, memory (doesn't that appear just in clobbers? And that doesn't look like something that should be replaced), Bm, e.g. in various alternatives. So, you need to change them all, not just the first hit. "r,r,r,m" and the like. Normally, one would simply walk the constraint string, parsing the special letters (+, =, & etc.) and single letter constraints and 2 letter constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources). Either do it in 2 passes, first one counts how long constraint string one will need after the adjustments (and whether to adjust something at all), then if needed XALLOCAVEC it and adjust in there, or say use a auto_vec<char, 32> for it. > + break; > + } > + } > + } > + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with > + "Bt/Bt/BT". */ > + for (unsigned i = 0; i < constraints.length (); i++) > + { > + std::string *s = new std::string (constraints[i]); > + size_t pos = s->find ("m"); > + size_t pos2 = s->find ("memory"); > + if (pos != std::string::npos) > + { > + if (pos > 0 && (s->at (pos - 1) == 'B')) > + s->replace (pos - 1, 2, "BT"); > + else if (pos2 != std::string::npos) > + s->replace (pos, 6, "Bt"); > + else > + s->replace (pos, 1, "Bt"); Formatting, the s->replace calls are indented too much. Jakub
On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > From: Kong Lingling <lingling.kong@intel.com> > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > usage by default from mapping the common reg/mem constraint to non-EGPR > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > for inline asm. > > > > gcc/ChangeLog: > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > ix86_md_asm_adjust. > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > > target option, map reg/mem constraints to non-EGPR constraints. > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > --- > > gcc/config/i386/i386.cc | 44 +++++++ > > gcc/config/i386/i386.opt | 5 + > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > > 3 files changed, 156 insertions(+) > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > index d26d9ab0d9d..9460ebbfda4 100644 > > --- a/gcc/config/i386/i386.cc > > +++ b/gcc/config/i386/i386.cc > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > > along with GCC; see the file COPYING3. If not see > > <http://www.gnu.org/licenses/>. */ > > > > +#define INCLUDE_STRING > > #define IN_TARGET_CODE 1 > > > > #include "config.h" > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > > bool saw_asm_flag = false; > > > > start_sequence (); > > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > > + constraints, will eventually map all the usable constraints in the future. */ > > I think there should be some constraint which explicitly has all the 32 > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > Also, what about the "g" constraint? Shouldn't there be another for "g" > without r16..r31? What about the various other memory > constraints ("<", "o", ...)? I think we should leave all existing constraints as they are, so "r" covers only GPR16, "m" and "o" to only use GPR16. We can then introduce "h" to instructions that have the ability to handle EGPR. This would be somehow similar to the SSE -> AVX512F transition, where we still have "x" for SSE16 and "v" was introduced as a separate register class for EVEX SSE registers. This way, asm will be compatible, when "r", "m", "o" and "g" are used. The new memory constraint "Bt", should allow new registers, and should be added to the constraint string as a separate constraint, and conditionally enabled by relevant "isa" (AKA "enabled") attribute. Uros. > > + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) > > + { > > + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 > > + and replace only r, exclude Br and Yr. */ > > + for (unsigned i = 0; i < constraints.length (); i++) > > + { > > + std::string *s = new std::string (constraints[i]); > > Doesn't this leak memory (all the time)? > I must say I don't really understand why you need to use std::string here, > but certainly it shouldn't leak. > > > + size_t pos = s->find ('r'); > > + while (pos != std::string::npos) > > + { > > + if (pos > 0 > > + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) > > + pos = s->find ('r', pos + 1); > > + else > > + { > > + s->replace (pos, 1, "h"); > > + constraints[i] = (const char*) s->c_str (); > > Formatting (space before *). The usual way for constraints is ggc_strdup on > some string in a buffer. Also, one could have several copies or r (or m, memory (doesn't > that appear just in clobbers? And that doesn't look like something that > should be replaced), Bm, e.g. in various alternatives. So, you > need to change them all, not just the first hit. "r,r,r,m" and the like. > Normally, one would simply walk the constraint string, parsing the special > letters (+, =, & etc.) and single letter constraints and 2 letter > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources). > Either do it in 2 passes, first one counts how long constraint string one > will need after the adjustments (and whether to adjust something at all), > then if needed XALLOCAVEC it and adjust in there, or say use a > auto_vec<char, 32> for > it. > > > + break; > > + } > > + } > > + } > > + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with > > + "Bt/Bt/BT". */ > > + for (unsigned i = 0; i < constraints.length (); i++) > > + { > > + std::string *s = new std::string (constraints[i]); > > + size_t pos = s->find ("m"); > > + size_t pos2 = s->find ("memory"); > > + if (pos != std::string::npos) > > + { > > + if (pos > 0 && (s->at (pos - 1) == 'B')) > > + s->replace (pos - 1, 2, "BT"); > > + else if (pos2 != std::string::npos) > > + s->replace (pos, 6, "Bt"); > > + else > > + s->replace (pos, 1, "Bt"); > > Formatting, the s->replace calls are indented too much. > > Jakub >
Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 17:18写道: > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > From: Kong Lingling <lingling.kong@intel.com> > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > usage by default from mapping the common reg/mem constraint to non-EGPR > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > for inline asm. > > > > gcc/ChangeLog: > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > ix86_md_asm_adjust. > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > > target option, map reg/mem constraints to non-EGPR constraints. > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > --- > > gcc/config/i386/i386.cc | 44 +++++++ > > gcc/config/i386/i386.opt | 5 + > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > > 3 files changed, 156 insertions(+) > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > index d26d9ab0d9d..9460ebbfda4 100644 > > --- a/gcc/config/i386/i386.cc > > +++ b/gcc/config/i386/i386.cc > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > > along with GCC; see the file COPYING3. If not see > > <http://www.gnu.org/licenses/>. */ > > > > +#define INCLUDE_STRING > > #define IN_TARGET_CODE 1 > > > > #include "config.h" > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > > bool saw_asm_flag = false; > > > > start_sequence (); > > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > > + constraints, will eventually map all the usable constraints in the future. */ > > I think there should be some constraint which explicitly has all the 32 > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > Yes, we will add new register constraints. For memory constraints it requires some special handling in ix86_memory_address_use_extended_reg_class_p > Also, what about the "g" constraint? Shouldn't there be another for "g" > without r16..r31? What about the various other memory > constraints ("<", "o", ...)? We will support fully mapping of all common constraints, with refining of current mapping in the V2 patch. > > > + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) > > + { > > + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 > > + and replace only r, exclude Br and Yr. */ > > + for (unsigned i = 0; i < constraints.length (); i++) > > + { > > + std::string *s = new std::string (constraints[i]); > > Doesn't this leak memory (all the time)? > I must say I don't really understand why you need to use std::string here, > but certainly it shouldn't leak. std::string just makes the code shorter than using str functions. Current code will be completely refactored with supporting more mapping of constraints. > > > + size_t pos = s->find ('r'); > > + while (pos != std::string::npos) > > + { > > + if (pos > 0 > > + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) > > + pos = s->find ('r', pos + 1); > > + else > > + { > > + s->replace (pos, 1, "h"); > > + constraints[i] = (const char*) s->c_str (); > > Formatting (space before *). The usual way for constraints is ggc_strdup on > some string in a buffer. Also, one could have several copies or r (or m, memory (doesn't > that appear just in clobbers? And that doesn't look like something that > should be replaced), Bm, e.g. in various alternatives. So, you > need to change them all, not just the first hit. "r,r,r,m" and the like. > Normally, one would simply walk the constraint string, parsing the special > letters (+, =, & etc.) and single letter constraints and 2 letter > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources). > Either do it in 2 passes, first one counts how long constraint string one > will need after the adjustments (and whether to adjust something at all), > then if needed XALLOCAVEC it and adjust in there, or say use a > auto_vec<char, 32> for > it. Thanks for your guidance. Previously we thought constraints[i] was a splitted simple constraint, but clearly it is not. We will refer to an existing example of this and rewrite current one. > > > + break; > > + } > > + } > > + } > > + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with > > + "Bt/Bt/BT". */ > > + for (unsigned i = 0; i < constraints.length (); i++) > > + { > > + std::string *s = new std::string (constraints[i]); > > + size_t pos = s->find ("m"); > > + size_t pos2 = s->find ("memory"); > > + if (pos != std::string::npos) > > + { > > + if (pos > 0 && (s->at (pos - 1) == 'B')) > > + s->replace (pos - 1, 2, "BT"); > > + else if (pos2 != std::string::npos) > > + s->replace (pos, 6, "Bt"); > > + else > > + s->replace (pos, 1, "Bt"); > > Formatting, the s->replace calls are indented too much. > > Jakub >
Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道: > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > > From: Kong Lingling <lingling.kong@intel.com> > > > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > > usage by default from mapping the common reg/mem constraint to non-EGPR > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > > for inline asm. > > > > > > gcc/ChangeLog: > > > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > > ix86_md_asm_adjust. > > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > > > target option, map reg/mem constraints to non-EGPR constraints. > > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > > --- > > > gcc/config/i386/i386.cc | 44 +++++++ > > > gcc/config/i386/i386.opt | 5 + > > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > > > 3 files changed, 156 insertions(+) > > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > > index d26d9ab0d9d..9460ebbfda4 100644 > > > --- a/gcc/config/i386/i386.cc > > > +++ b/gcc/config/i386/i386.cc > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > > > along with GCC; see the file COPYING3. If not see > > > <http://www.gnu.org/licenses/>. */ > > > > > > +#define INCLUDE_STRING > > > #define IN_TARGET_CODE 1 > > > > > > #include "config.h" > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > > > bool saw_asm_flag = false; > > > > > > start_sequence (); > > > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > > > + constraints, will eventually map all the usable constraints in the future. */ > > > > I think there should be some constraint which explicitly has all the 32 > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > without r16..r31? What about the various other memory > > constraints ("<", "o", ...)? > > I think we should leave all existing constraints as they are, so "r" > covers only GPR16, "m" and "o" to only use GPR16. We can then > introduce "h" to instructions that have the ability to handle EGPR. > This would be somehow similar to the SSE -> AVX512F transition, where > we still have "x" for SSE16 and "v" was introduced as a separate > register class for EVEX SSE registers. This way, asm will be > compatible, when "r", "m", "o" and "g" are used. The new memory > constraint "Bt", should allow new registers, and should be added to > the constraint string as a separate constraint, and conditionally > enabled by relevant "isa" (AKA "enabled") attribute. The extended constraint can work for registers, but for memory it is more complicated. If we want to use new mem constraints that allow gpr32, then BASE/INDEX reg class still requires per-insn verification, so it means changes on all patterns with vm, and those SSE patterns on opcode map0/1. Also, several legacy insns that are promoted to EVEX encoding space need to be changed. The overall implementation could be 10 times larger than current, which would be quite hard for maintenance. > > Uros. > > > > + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) > > > + { > > > + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 > > > + and replace only r, exclude Br and Yr. */ > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > + { > > > + std::string *s = new std::string (constraints[i]); > > > > Doesn't this leak memory (all the time)? > > I must say I don't really understand why you need to use std::string here, > > but certainly it shouldn't leak. > > > > > + size_t pos = s->find ('r'); > > > + while (pos != std::string::npos) > > > + { > > > + if (pos > 0 > > > + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) > > > + pos = s->find ('r', pos + 1); > > > + else > > > + { > > > + s->replace (pos, 1, "h"); > > > + constraints[i] = (const char*) s->c_str (); > > > > Formatting (space before *). The usual way for constraints is ggc_strdup on > > some string in a buffer. Also, one could have several copies or r (or m, memory (doesn't > > that appear just in clobbers? And that doesn't look like something that > > should be replaced), Bm, e.g. in various alternatives. So, you > > need to change them all, not just the first hit. "r,r,r,m" and the like. > > Normally, one would simply walk the constraint string, parsing the special > > letters (+, =, & etc.) and single letter constraints and 2 letter > > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources). > > Either do it in 2 passes, first one counts how long constraint string one > > will need after the adjustments (and whether to adjust something at all), > > then if needed XALLOCAVEC it and adjust in there, or say use a > > auto_vec<char, 32> for > > it. > > > > > + break; > > > + } > > > + } > > > + } > > > + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with > > > + "Bt/Bt/BT". */ > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > + { > > > + std::string *s = new std::string (constraints[i]); > > > + size_t pos = s->find ("m"); > > > + size_t pos2 = s->find ("memory"); > > > + if (pos != std::string::npos) > > > + { > > > + if (pos > 0 && (s->at (pos - 1) == 'B')) > > > + s->replace (pos - 1, 2, "BT"); > > > + else if (pos2 != std::string::npos) > > > + s->replace (pos, 6, "Bt"); > > > + else > > > + s->replace (pos, 1, "Bt"); > > > > Formatting, the s->replace calls are indented too much. > > > > Jakub > >
On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote: > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道: > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > > > From: Kong Lingling <lingling.kong@intel.com> > > > > > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > > > usage by default from mapping the common reg/mem constraint to non-EGPR > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > > > for inline asm. > > > > > > > > gcc/ChangeLog: > > > > > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > > > ix86_md_asm_adjust. > > > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > > > > target option, map reg/mem constraints to non-EGPR constraints. > > > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > > > --- > > > > gcc/config/i386/i386.cc | 44 +++++++ > > > > gcc/config/i386/i386.opt | 5 + > > > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > > > > 3 files changed, 156 insertions(+) > > > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > > > index d26d9ab0d9d..9460ebbfda4 100644 > > > > --- a/gcc/config/i386/i386.cc > > > > +++ b/gcc/config/i386/i386.cc > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > > > > along with GCC; see the file COPYING3. If not see > > > > <http://www.gnu.org/licenses/>. */ > > > > > > > > +#define INCLUDE_STRING > > > > #define IN_TARGET_CODE 1 > > > > > > > > #include "config.h" > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > > > > bool saw_asm_flag = false; > > > > > > > > start_sequence (); > > > > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > > > > + constraints, will eventually map all the usable constraints in the future. */ > > > > > > I think there should be some constraint which explicitly has all the 32 > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > > without r16..r31? What about the various other memory > > > constraints ("<", "o", ...)? > > > > I think we should leave all existing constraints as they are, so "r" > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > introduce "h" to instructions that have the ability to handle EGPR. > > This would be somehow similar to the SSE -> AVX512F transition, where > > we still have "x" for SSE16 and "v" was introduced as a separate > > register class for EVEX SSE registers. This way, asm will be > > compatible, when "r", "m", "o" and "g" are used. The new memory > > constraint "Bt", should allow new registers, and should be added to > > the constraint string as a separate constraint, and conditionally > > enabled by relevant "isa" (AKA "enabled") attribute. > > The extended constraint can work for registers, but for memory it is more > complicated. Yes, unfortunately. The compiler assumes that an unchangeable register class is used for BASE/INDEX registers. I have hit this limitation when trying to implement memory support for instructions involving 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX registers, also inside memory operand. (You can see the "hack" in e.g. *extzvqi_mem_rex64" and corresponding peephole2 with the original *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX register class is the major limitation in the compiler, so perhaps the strategy on how to override this limitation should be discussed with the register allocator author first. Perhaps adding an insn attribute to insn RTX pattern to specify different BASE/INDEX register sets can be a better solution than passing insn RTX to the register allocator. The above idea still does not solve the asm problem on how to select correct BASE/INDEX register set for memory operands. Uros. > > If we want to use new mem constraints that allow gpr32, then BASE/INDEX > reg class still requires per-insn verification, so it means changes > on all patterns with vm, and those SSE patterns on opcode map0/1. Also, > several legacy insns that are promoted to EVEX encoding space need to be > changed. The overall implementation could be 10 times larger than current, > which would be quite hard for maintenance. > > > > > Uros. > > > > > > + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) > > > > + { > > > > + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 > > > > + and replace only r, exclude Br and Yr. */ > > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > > + { > > > > + std::string *s = new std::string (constraints[i]); > > > > > > Doesn't this leak memory (all the time)? > > > I must say I don't really understand why you need to use std::string here, > > > but certainly it shouldn't leak. > > > > > > > + size_t pos = s->find ('r'); > > > > + while (pos != std::string::npos) > > > > + { > > > > + if (pos > 0 > > > > + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) > > > > + pos = s->find ('r', pos + 1); > > > > + else > > > > + { > > > > + s->replace (pos, 1, "h"); > > > > + constraints[i] = (const char*) s->c_str (); > > > > > > Formatting (space before *). The usual way for constraints is ggc_strdup on > > > some string in a buffer. Also, one could have several copies or r (or m, memory (doesn't > > > that appear just in clobbers? And that doesn't look like something that > > > should be replaced), Bm, e.g. in various alternatives. So, you > > > need to change them all, not just the first hit. "r,r,r,m" and the like. > > > Normally, one would simply walk the constraint string, parsing the special > > > letters (+, =, & etc.) and single letter constraints and 2 letter > > > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources). > > > Either do it in 2 passes, first one counts how long constraint string one > > > will need after the adjustments (and whether to adjust something at all), > > > then if needed XALLOCAVEC it and adjust in there, or say use a > > > auto_vec<char, 32> for > > > it. > > > > > > > + break; > > > > + } > > > > + } > > > > + } > > > > + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with > > > > + "Bt/Bt/BT". */ > > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > > + { > > > > + std::string *s = new std::string (constraints[i]); > > > > + size_t pos = s->find ("m"); > > > > + size_t pos2 = s->find ("memory"); > > > > + if (pos != std::string::npos) > > > > + { > > > > + if (pos > 0 && (s->at (pos - 1) == 'B')) > > > > + s->replace (pos - 1, 2, "BT"); > > > > + else if (pos2 != std::string::npos) > > > > + s->replace (pos, 6, "Bt"); > > > > + else > > > > + s->replace (pos, 1, "Bt"); > > > > > > Formatting, the s->replace calls are indented too much. > > > > > > Jakub > > >
On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote: > > > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道: > > > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > > > > From: Kong Lingling <lingling.kong@intel.com> > > > > > > > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > > > > usage by default from mapping the common reg/mem constraint to non-EGPR > > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > > > > for inline asm. > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > > > > ix86_md_asm_adjust. > > > > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > > > > > target option, map reg/mem constraints to non-EGPR constraints. > > > > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > > > > --- > > > > > gcc/config/i386/i386.cc | 44 +++++++ > > > > > gcc/config/i386/i386.opt | 5 + > > > > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > > > > > 3 files changed, 156 insertions(+) > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > > > > index d26d9ab0d9d..9460ebbfda4 100644 > > > > > --- a/gcc/config/i386/i386.cc > > > > > +++ b/gcc/config/i386/i386.cc > > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > > > > > along with GCC; see the file COPYING3. If not see > > > > > <http://www.gnu.org/licenses/>. */ > > > > > > > > > > +#define INCLUDE_STRING > > > > > #define IN_TARGET_CODE 1 > > > > > > > > > > #include "config.h" > > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > > > > > bool saw_asm_flag = false; > > > > > > > > > > start_sequence (); > > > > > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > > > > > + constraints, will eventually map all the usable constraints in the future. */ > > > > > > > > I think there should be some constraint which explicitly has all the 32 > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > > > > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > > > without r16..r31? What about the various other memory > > > > constraints ("<", "o", ...)? > > > > > > I think we should leave all existing constraints as they are, so "r" > > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > > introduce "h" to instructions that have the ability to handle EGPR. > > > This would be somehow similar to the SSE -> AVX512F transition, where > > > we still have "x" for SSE16 and "v" was introduced as a separate > > > register class for EVEX SSE registers. This way, asm will be > > > compatible, when "r", "m", "o" and "g" are used. The new memory > > > constraint "Bt", should allow new registers, and should be added to > > > the constraint string as a separate constraint, and conditionally > > > enabled by relevant "isa" (AKA "enabled") attribute. > > > > The extended constraint can work for registers, but for memory it is more > > complicated. > > Yes, unfortunately. The compiler assumes that an unchangeable register > class is used for BASE/INDEX registers. I have hit this limitation > when trying to implement memory support for instructions involving > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX > registers, also inside memory operand. (You can see the "hack" in e.g. > *extzvqi_mem_rex64" and corresponding peephole2 with the original > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX > register class is the major limitation in the compiler, so perhaps the > strategy on how to override this limitation should be discussed with > the register allocator author first. Perhaps adding an insn attribute > to insn RTX pattern to specify different BASE/INDEX register sets can > be a better solution than passing insn RTX to the register allocator. > > The above idea still does not solve the asm problem on how to select > correct BASE/INDEX register set for memory operands. The current approach disables gpr32 for memory operand in asm_operand by default. but can be turned on by options ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction supports gpr32). Only ~ 5% of total instructions don't support gpr32, reversed approach only gonna get more complicated. > > Uros. > > > > If we want to use new mem constraints that allow gpr32, then BASE/INDEX > > reg class still requires per-insn verification, so it means changes > > on all patterns with vm, and those SSE patterns on opcode map0/1. Also, > > several legacy insns that are promoted to EVEX encoding space need to be > > changed. The overall implementation could be 10 times larger than current, > > which would be quite hard for maintenance. > > > > > > > > Uros. > > > > > > > > + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) > > > > > + { > > > > > + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 > > > > > + and replace only r, exclude Br and Yr. */ > > > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > > > + { > > > > > + std::string *s = new std::string (constraints[i]); > > > > > > > > Doesn't this leak memory (all the time)? > > > > I must say I don't really understand why you need to use std::string here, > > > > but certainly it shouldn't leak. > > > > > > > > > + size_t pos = s->find ('r'); > > > > > + while (pos != std::string::npos) > > > > > + { > > > > > + if (pos > 0 > > > > > + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) > > > > > + pos = s->find ('r', pos + 1); > > > > > + else > > > > > + { > > > > > + s->replace (pos, 1, "h"); > > > > > + constraints[i] = (const char*) s->c_str (); > > > > > > > > Formatting (space before *). The usual way for constraints is ggc_strdup on > > > > some string in a buffer. Also, one could have several copies or r (or m, memory (doesn't > > > > that appear just in clobbers? And that doesn't look like something that > > > > should be replaced), Bm, e.g. in various alternatives. So, you > > > > need to change them all, not just the first hit. "r,r,r,m" and the like. > > > > Normally, one would simply walk the constraint string, parsing the special > > > > letters (+, =, & etc.) and single letter constraints and 2 letter > > > > constraints using CONSTRAINT_LEN macro (tons of examples in GCC sources). > > > > Either do it in 2 passes, first one counts how long constraint string one > > > > will need after the adjustments (and whether to adjust something at all), > > > > then if needed XALLOCAVEC it and adjust in there, or say use a > > > > auto_vec<char, 32> for > > > > it. > > > > > > > > > + break; > > > > > + } > > > > > + } > > > > > + } > > > > > + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with > > > > > + "Bt/Bt/BT". */ > > > > > + for (unsigned i = 0; i < constraints.length (); i++) > > > > > + { > > > > > + std::string *s = new std::string (constraints[i]); > > > > > + size_t pos = s->find ("m"); > > > > > + size_t pos2 = s->find ("memory"); > > > > > + if (pos != std::string::npos) > > > > > + { > > > > > + if (pos > 0 && (s->at (pos - 1) == 'B')) > > > > > + s->replace (pos - 1, 2, "BT"); > > > > > + else if (pos2 != std::string::npos) > > > > > + s->replace (pos, 6, "Bt"); > > > > > + else > > > > > + s->replace (pos, 1, "Bt"); > > > > > > > > Formatting, the s->replace calls are indented too much. > > > > > > > > Jakub > > > >
Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: >> > From: Kong Lingling <lingling.kong@intel.com> >> > >> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR >> > usage by default from mapping the common reg/mem constraint to non-EGPR >> > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage >> > for inline asm. >> > >> > gcc/ChangeLog: >> > >> > * config/i386/i386.cc (INCLUDE_STRING): Add include for >> > ix86_md_asm_adjust. >> > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the >> > target option, map reg/mem constraints to non-EGPR constraints. >> > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. >> > >> > gcc/testsuite/ChangeLog: >> > >> > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. >> > --- >> > gcc/config/i386/i386.cc | 44 +++++++ >> > gcc/config/i386/i386.opt | 5 + >> > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ >> > 3 files changed, 156 insertions(+) >> > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c >> > >> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc >> > index d26d9ab0d9d..9460ebbfda4 100644 >> > --- a/gcc/config/i386/i386.cc >> > +++ b/gcc/config/i386/i386.cc >> > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License >> > along with GCC; see the file COPYING3. If not see >> > <http://www.gnu.org/licenses/>. */ >> > >> > +#define INCLUDE_STRING >> > #define IN_TARGET_CODE 1 >> > >> > #include "config.h" >> > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, >> > bool saw_asm_flag = false; >> > >> > start_sequence (); >> > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR >> > + constraints, will eventually map all the usable constraints in the future. */ >> >> I think there should be some constraint which explicitly has all the 32 >> GPRs, like there is one for just all 16 GPRs (h), so that regardless of >> -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. >> >> Also, what about the "g" constraint? Shouldn't there be another for "g" >> without r16..r31? What about the various other memory >> constraints ("<", "o", ...)? > > I think we should leave all existing constraints as they are, so "r" > covers only GPR16, "m" and "o" to only use GPR16. We can then > introduce "h" to instructions that have the ability to handle EGPR. Yeah. I'm jumping in without having read the full thread, sorry, but the current mechanism for handling this is TARGET_MEM_CONSTRAINT (added for s390). That is, TARGET_MEM_CONSTRAINT can be defined to some new constraint that is more general than the traditional "m" constraint. This constraint is then the one that is associated with memory_operand etc. "m" can then be defined explicitly to the old definition, so that existing asms continue to work. So if the port wants generic internal memory addresses to use the EGPR set (sounds reasonable), then TARGET_MEM_CONSTRAINT would be a new constraint that maps to those addresses. Thanks, Richard
On Fri, Sep 1, 2023 at 12:36 PM Hongtao Liu <crazylht@gmail.com> wrote: > > On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote: > > > > > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道: > > > > > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > > > > > From: Kong Lingling <lingling.kong@intel.com> > > > > > > > > > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > > > > > usage by default from mapping the common reg/mem constraint to non-EGPR > > > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > > > > > for inline asm. > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > > > > > ix86_md_asm_adjust. > > > > > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > > > > > > target option, map reg/mem constraints to non-EGPR constraints. > > > > > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > > > > > --- > > > > > > gcc/config/i386/i386.cc | 44 +++++++ > > > > > > gcc/config/i386/i386.opt | 5 + > > > > > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > > > > > > 3 files changed, 156 insertions(+) > > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > > > > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > > > > > index d26d9ab0d9d..9460ebbfda4 100644 > > > > > > --- a/gcc/config/i386/i386.cc > > > > > > +++ b/gcc/config/i386/i386.cc > > > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > > > > > > along with GCC; see the file COPYING3. If not see > > > > > > <http://www.gnu.org/licenses/>. */ > > > > > > > > > > > > +#define INCLUDE_STRING > > > > > > #define IN_TARGET_CODE 1 > > > > > > > > > > > > #include "config.h" > > > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > > > > > > bool saw_asm_flag = false; > > > > > > > > > > > > start_sequence (); > > > > > > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > > > > > > + constraints, will eventually map all the usable constraints in the future. */ > > > > > > > > > > I think there should be some constraint which explicitly has all the 32 > > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > > > > > > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > > > > without r16..r31? What about the various other memory > > > > > constraints ("<", "o", ...)? > > > > > > > > I think we should leave all existing constraints as they are, so "r" > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > > > introduce "h" to instructions that have the ability to handle EGPR. > > > > This would be somehow similar to the SSE -> AVX512F transition, where > > > > we still have "x" for SSE16 and "v" was introduced as a separate > > > > register class for EVEX SSE registers. This way, asm will be > > > > compatible, when "r", "m", "o" and "g" are used. The new memory > > > > constraint "Bt", should allow new registers, and should be added to > > > > the constraint string as a separate constraint, and conditionally > > > > enabled by relevant "isa" (AKA "enabled") attribute. > > > > > > The extended constraint can work for registers, but for memory it is more > > > complicated. > > > > Yes, unfortunately. The compiler assumes that an unchangeable register > > class is used for BASE/INDEX registers. I have hit this limitation > > when trying to implement memory support for instructions involving > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX > > registers, also inside memory operand. (You can see the "hack" in e.g. > > *extzvqi_mem_rex64" and corresponding peephole2 with the original > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX > > register class is the major limitation in the compiler, so perhaps the > > strategy on how to override this limitation should be discussed with > > the register allocator author first. Perhaps adding an insn attribute > > to insn RTX pattern to specify different BASE/INDEX register sets can > > be a better solution than passing insn RTX to the register allocator. > > > > The above idea still does not solve the asm problem on how to select > > correct BASE/INDEX register set for memory operands. > The current approach disables gpr32 for memory operand in asm_operand > by default. but can be turned on by options > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction > supports gpr32). > Only ~ 5% of total instructions don't support gpr32, reversed approach > only gonna get more complicated. I'm not referring to the reversed approach, just want to point out that the same approach as you proposed w.r.t. to memory operand can be achieved using some named insn attribute that would affect BASE/INDEX register class selection. The attribute could default to gpr32 with APX, unless the insn specific attribute has e.g. nogpr32 value. See for example how "enabled" and "preferred_for_*" attributes are used. Perhaps this new attribute can also be applied to separate alternatives. Uros.
On Fri, Sep 1, 2023 at 7:27 PM Uros Bizjak <ubizjak@gmail.com> wrote: > > On Fri, Sep 1, 2023 at 12:36 PM Hongtao Liu <crazylht@gmail.com> wrote: > > > > On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote: > > > > > > > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年8月31日周四 18:01写道: > > > > > > > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > > > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > > > > > > From: Kong Lingling <lingling.kong@intel.com> > > > > > > > > > > > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > > > > > > usage by default from mapping the common reg/mem constraint to non-EGPR > > > > > > > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > > > > > > > for inline asm. > > > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > > > * config/i386/i386.cc (INCLUDE_STRING): Add include for > > > > > > > ix86_md_asm_adjust. > > > > > > > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > > > > > > > target option, map reg/mem constraints to non-EGPR constraints. > > > > > > > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > > > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > > > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > > > > > > > --- > > > > > > > gcc/config/i386/i386.cc | 44 +++++++ > > > > > > > gcc/config/i386/i386.opt | 5 + > > > > > > > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > > > > > > > 3 files changed, 156 insertions(+) > > > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > > > > > > > > > > > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > > > > > > index d26d9ab0d9d..9460ebbfda4 100644 > > > > > > > --- a/gcc/config/i386/i386.cc > > > > > > > +++ b/gcc/config/i386/i386.cc > > > > > > > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > > > > > > > along with GCC; see the file COPYING3. If not see > > > > > > > <http://www.gnu.org/licenses/>. */ > > > > > > > > > > > > > > +#define INCLUDE_STRING > > > > > > > #define IN_TARGET_CODE 1 > > > > > > > > > > > > > > #include "config.h" > > > > > > > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > > > > > > > bool saw_asm_flag = false; > > > > > > > > > > > > > > start_sequence (); > > > > > > > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > > > > > > > + constraints, will eventually map all the usable constraints in the future. */ > > > > > > > > > > > > I think there should be some constraint which explicitly has all the 32 > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > > > > > > > > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > > > > > without r16..r31? What about the various other memory > > > > > > constraints ("<", "o", ...)? > > > > > > > > > > I think we should leave all existing constraints as they are, so "r" > > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > > > > introduce "h" to instructions that have the ability to handle EGPR. > > > > > This would be somehow similar to the SSE -> AVX512F transition, where > > > > > we still have "x" for SSE16 and "v" was introduced as a separate > > > > > register class for EVEX SSE registers. This way, asm will be > > > > > compatible, when "r", "m", "o" and "g" are used. The new memory > > > > > constraint "Bt", should allow new registers, and should be added to > > > > > the constraint string as a separate constraint, and conditionally > > > > > enabled by relevant "isa" (AKA "enabled") attribute. > > > > > > > > The extended constraint can work for registers, but for memory it is more > > > > complicated. > > > > > > Yes, unfortunately. The compiler assumes that an unchangeable register > > > class is used for BASE/INDEX registers. I have hit this limitation > > > when trying to implement memory support for instructions involving > > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX > > > registers, also inside memory operand. (You can see the "hack" in e.g. > > > *extzvqi_mem_rex64" and corresponding peephole2 with the original > > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX > > > register class is the major limitation in the compiler, so perhaps the > > > strategy on how to override this limitation should be discussed with > > > the register allocator author first. Perhaps adding an insn attribute > > > to insn RTX pattern to specify different BASE/INDEX register sets can > > > be a better solution than passing insn RTX to the register allocator. > > > > > > The above idea still does not solve the asm problem on how to select > > > correct BASE/INDEX register set for memory operands. > > The current approach disables gpr32 for memory operand in asm_operand > > by default. but can be turned on by options > > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction > > supports gpr32). > > Only ~ 5% of total instructions don't support gpr32, reversed approach > > only gonna get more complicated. > > I'm not referring to the reversed approach, just want to point out > that the same approach as you proposed w.r.t. to memory operand can be > achieved using some named insn attribute that would affect BASE/INDEX > register class selection. The attribute could default to gpr32 with > APX, unless the insn specific attribute has e.g. nogpr32 value. See > for example how "enabled" and "preferred_for_*" attributes are used. > Perhaps this new attribute can also be applied to separate > alternatives. Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like (define_attr "gpr32" "0, 1" (cond [(eq_attr "isa" "fma4") (const_string "0")] (const_string "1"))) But still, we need to adjust memory constraints in the pattern. Ideally, gcc includes encoding information for every instruction, (.i.e. map0/map1), so that we can determine the attribute value of gpr32 directly from this information. > > Uros.
On Fri, Sep 1, 2023 at 7:03 PM Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> > >> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > >> > From: Kong Lingling <lingling.kong@intel.com> > >> > > >> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > >> > usage by default from mapping the common reg/mem constraint to non-EGPR > >> > constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage > >> > for inline asm. > >> > > >> > gcc/ChangeLog: > >> > > >> > * config/i386/i386.cc (INCLUDE_STRING): Add include for > >> > ix86_md_asm_adjust. > >> > (ix86_md_asm_adjust): When APX EGPR enabled without specifying the > >> > target option, map reg/mem constraints to non-EGPR constraints. > >> > * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. > >> > > >> > gcc/testsuite/ChangeLog: > >> > > >> > * gcc.target/i386/apx-inline-gpr-norex2.c: New test. > >> > --- > >> > gcc/config/i386/i386.cc | 44 +++++++ > >> > gcc/config/i386/i386.opt | 5 + > >> > .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ > >> > 3 files changed, 156 insertions(+) > >> > create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c > >> > > >> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > >> > index d26d9ab0d9d..9460ebbfda4 100644 > >> > --- a/gcc/config/i386/i386.cc > >> > +++ b/gcc/config/i386/i386.cc > >> > @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License > >> > along with GCC; see the file COPYING3. If not see > >> > <http://www.gnu.org/licenses/>. */ > >> > > >> > +#define INCLUDE_STRING > >> > #define IN_TARGET_CODE 1 > >> > > >> > #include "config.h" > >> > @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, > >> > bool saw_asm_flag = false; > >> > > >> > start_sequence (); > >> > + /* TODO: Here we just mapped the general r/m constraints to non-EGPR > >> > + constraints, will eventually map all the usable constraints in the future. */ > >> > >> I think there should be some constraint which explicitly has all the 32 > >> GPRs, like there is one for just all 16 GPRs (h), so that regardless of > >> -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > >> > >> Also, what about the "g" constraint? Shouldn't there be another for "g" > >> without r16..r31? What about the various other memory > >> constraints ("<", "o", ...)? > > > > I think we should leave all existing constraints as they are, so "r" > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > introduce "h" to instructions that have the ability to handle EGPR. > > Yeah. I'm jumping in without having read the full thread, sorry, > but the current mechanism for handling this is TARGET_MEM_CONSTRAINT > (added for s390). That is, TARGET_MEM_CONSTRAINT can be defined to some Thanks for the comments. > new constraint that is more general than the traditional "m" constraint. > This constraint is then the one that is associated with memory_operand > etc. "m" can then be defined explicitly to the old definition, > so that existing asms continue to work. > > So if the port wants generic internal memory addresses to use the > EGPR set (sounds reasonable), then TARGET_MEM_CONSTRAINT would be > a new constraint that maps to those addresses. But still we need to enhance current reload infrastructure to support selective base_reg_class/index_reg_class, refer to [1]. The good thing about using TARGET_MEM_CONSTRAINT is that we don't have to remapping memory constraint for inline asm, but the bad thing about it is that we need to modify the backend pattern a lot, because only 5% of the instructions don't support gpr32, and 95% of them need to be changed to the new memory constraint. It feels like the cons outweigh the pros. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629040.html > > Thanks, > Richard
On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu <crazylht@gmail.com> wrote: > > > > > > > I think there should be some constraint which explicitly has all the 32 > > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > > > > > > > > > > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > > > > > > without r16..r31? What about the various other memory > > > > > > > constraints ("<", "o", ...)? > > > > > > > > > > > > I think we should leave all existing constraints as they are, so "r" > > > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > > > > > introduce "h" to instructions that have the ability to handle EGPR. > > > > > > This would be somehow similar to the SSE -> AVX512F transition, where > > > > > > we still have "x" for SSE16 and "v" was introduced as a separate > > > > > > register class for EVEX SSE registers. This way, asm will be > > > > > > compatible, when "r", "m", "o" and "g" are used. The new memory > > > > > > constraint "Bt", should allow new registers, and should be added to > > > > > > the constraint string as a separate constraint, and conditionally > > > > > > enabled by relevant "isa" (AKA "enabled") attribute. > > > > > > > > > > The extended constraint can work for registers, but for memory it is more > > > > > complicated. > > > > > > > > Yes, unfortunately. The compiler assumes that an unchangeable register > > > > class is used for BASE/INDEX registers. I have hit this limitation > > > > when trying to implement memory support for instructions involving > > > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX > > > > registers, also inside memory operand. (You can see the "hack" in e.g. > > > > *extzvqi_mem_rex64" and corresponding peephole2 with the original > > > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX > > > > register class is the major limitation in the compiler, so perhaps the > > > > strategy on how to override this limitation should be discussed with > > > > the register allocator author first. Perhaps adding an insn attribute > > > > to insn RTX pattern to specify different BASE/INDEX register sets can > > > > be a better solution than passing insn RTX to the register allocator. > > > > > > > > The above idea still does not solve the asm problem on how to select > > > > correct BASE/INDEX register set for memory operands. > > > The current approach disables gpr32 for memory operand in asm_operand > > > by default. but can be turned on by options > > > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction > > > supports gpr32). > > > Only ~ 5% of total instructions don't support gpr32, reversed approach > > > only gonna get more complicated. > > > > I'm not referring to the reversed approach, just want to point out > > that the same approach as you proposed w.r.t. to memory operand can be > > achieved using some named insn attribute that would affect BASE/INDEX > > register class selection. The attribute could default to gpr32 with > > APX, unless the insn specific attribute has e.g. nogpr32 value. See > > for example how "enabled" and "preferred_for_*" attributes are used. > > Perhaps this new attribute can also be applied to separate > > alternatives. > Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like > (define_attr "gpr32" "0, 1" > (cond [(eq_attr "isa" "fma4") > (const_string "0")] > (const_string "1"))) Just a nit, can the member be named "map0" and "map1"? The code will then look like: if (get_attr_gpr32 (insn) == GPR32_MAP0) ... instead of: if (get_attr_gpr32 (insn) == GPR32_0) ... > But still, we need to adjust memory constraints in the pattern. I guess the gpr32 property is the same for all alternatives of the insn pattern. In this case, "m" "g" and "a" constraints could remain as they are, the final register class will be adjusted (by some target hook?) based on the value of gpr32 attribute. > Ideally, gcc includes encoding information for every instruction, > (.i.e. map0/map1), so that we can determine the attribute value of > gpr32 directly from this information. I think the right tool for this is attribute infrastructure of insn patterns. We can set the default, set precise value of the insns, or calculate attribute from some other attribute in a quite flexible way. Other than that, adjusting BASE/INDEX register class of the RA pass is the infrastructure change, but perhaps similar to the one you proposed. Uros.
On Mon, Sep 4, 2023 at 4:57 PM Uros Bizjak <ubizjak@gmail.com> wrote: > > On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu <crazylht@gmail.com> wrote: > > > > > > > > > I think there should be some constraint which explicitly has all the 32 > > > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that regardless of > > > > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline asm wants. > > > > > > > > > > > > > > > > Also, what about the "g" constraint? Shouldn't there be another for "g" > > > > > > > > without r16..r31? What about the various other memory > > > > > > > > constraints ("<", "o", ...)? > > > > > > > > > > > > > > I think we should leave all existing constraints as they are, so "r" > > > > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then > > > > > > > introduce "h" to instructions that have the ability to handle EGPR. > > > > > > > This would be somehow similar to the SSE -> AVX512F transition, where > > > > > > > we still have "x" for SSE16 and "v" was introduced as a separate > > > > > > > register class for EVEX SSE registers. This way, asm will be > > > > > > > compatible, when "r", "m", "o" and "g" are used. The new memory > > > > > > > constraint "Bt", should allow new registers, and should be added to > > > > > > > the constraint string as a separate constraint, and conditionally > > > > > > > enabled by relevant "isa" (AKA "enabled") attribute. > > > > > > > > > > > > The extended constraint can work for registers, but for memory it is more > > > > > > complicated. > > > > > > > > > > Yes, unfortunately. The compiler assumes that an unchangeable register > > > > > class is used for BASE/INDEX registers. I have hit this limitation > > > > > when trying to implement memory support for instructions involving > > > > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX > > > > > registers, also inside memory operand. (You can see the "hack" in e.g. > > > > > *extzvqi_mem_rex64" and corresponding peephole2 with the original > > > > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX > > > > > register class is the major limitation in the compiler, so perhaps the > > > > > strategy on how to override this limitation should be discussed with > > > > > the register allocator author first. Perhaps adding an insn attribute > > > > > to insn RTX pattern to specify different BASE/INDEX register sets can > > > > > be a better solution than passing insn RTX to the register allocator. > > > > > > > > > > The above idea still does not solve the asm problem on how to select > > > > > correct BASE/INDEX register set for memory operands. > > > > The current approach disables gpr32 for memory operand in asm_operand > > > > by default. but can be turned on by options > > > > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction > > > > supports gpr32). > > > > Only ~ 5% of total instructions don't support gpr32, reversed approach > > > > only gonna get more complicated. > > > > > > I'm not referring to the reversed approach, just want to point out > > > that the same approach as you proposed w.r.t. to memory operand can be > > > achieved using some named insn attribute that would affect BASE/INDEX > > > register class selection. The attribute could default to gpr32 with > > > APX, unless the insn specific attribute has e.g. nogpr32 value. See > > > for example how "enabled" and "preferred_for_*" attributes are used. > > > Perhaps this new attribute can also be applied to separate > > > alternatives. > > Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like > > (define_attr "gpr32" "0, 1" > > (cond [(eq_attr "isa" "fma4") > > (const_string "0")] > > (const_string "1"))) > > Just a nit, can the member be named "map0" and "map1"? The code will > then look like: > > if (get_attr_gpr32 (insn) == GPR32_MAP0) ... > > instead of: > > if (get_attr_gpr32 (insn) == GPR32_0) ... > > > But still, we need to adjust memory constraints in the pattern. > > I guess the gpr32 property is the same for all alternatives of the > insn pattern. In this case, "m" "g" and "a" constraints could remain > as they are, the final register class will be adjusted (by some target > hook?) based on the value of gpr32 attribute. I'm worried that not all rtl optimizers after post_reload will respect base/index_reg_class regarding the insn they belong to. if they just check if it's a legitimate memory/address (the current legitimate_address doesn't have a corresponding insn to pass down), m/g/a will still generate invalid instruction. So a defensive programming is to explicitly modifying the constraint. > > > Ideally, gcc includes encoding information for every instruction, > > (.i.e. map0/map1), so that we can determine the attribute value of > > gpr32 directly from this information. > > I think the right tool for this is attribute infrastructure of insn > patterns. We can set the default, set precise value of the insns, or > calculate attribute from some other attribute in a quite flexible way. > Other than that, adjusting BASE/INDEX register class of the RA pass is > the infrastructure change, but perhaps similar to the one you > proposed. Yes. > > Uros.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index d26d9ab0d9d..9460ebbfda4 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. */ +#define INCLUDE_STRING #define IN_TARGET_CODE 1 #include "config.h" @@ -23077,6 +23078,49 @@ ix86_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/, bool saw_asm_flag = false; start_sequence (); + /* TODO: Here we just mapped the general r/m constraints to non-EGPR + constraints, will eventually map all the usable constraints in the future. */ + if (TARGET_APX_EGPR && !ix86_apx_inline_asm_use_gpr32) + { + /* Map "r" constraint in inline asm to "h" that disallows r16-r31 + and replace only r, exclude Br and Yr. */ + for (unsigned i = 0; i < constraints.length (); i++) + { + std::string *s = new std::string (constraints[i]); + size_t pos = s->find ('r'); + while (pos != std::string::npos) + { + if (pos > 0 + && (s->at (pos - 1) == 'Y' || s->at (pos - 1) == 'B')) + pos = s->find ('r', pos + 1); + else + { + s->replace (pos, 1, "h"); + constraints[i] = (const char*) s->c_str (); + break; + } + } + } + /* Also map "m/memory/Bm" constraint that may use GPR32, replace them with + "Bt/Bt/BT". */ + for (unsigned i = 0; i < constraints.length (); i++) + { + std::string *s = new std::string (constraints[i]); + size_t pos = s->find ("m"); + size_t pos2 = s->find ("memory"); + if (pos != std::string::npos) + { + if (pos > 0 && (s->at (pos - 1) == 'B')) + s->replace (pos - 1, 2, "BT"); + else if (pos2 != std::string::npos) + s->replace (pos, 6, "Bt"); + else + s->replace (pos, 1, "Bt"); + constraints[i] = (const char*) s->c_str (); + } + } + } + for (unsigned i = 0, n = outputs.length (); i < n; ++i) { const char *con = constraints[i]; diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index 1ee4d90186e..5c8d3a207e3 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1335,3 +1335,8 @@ Enum(apx_features) String(ndd) Value(apx_ndd) Set(4) EnumValue Enum(apx_features) String(all) Value(apx_all) Set(1) + +mapx-inline-asm-use-gpr32 +Target Var(ix86_apx_inline_asm_use_gpr32) Init(0) +Enable GPR32 in inline asm when APX_EGPR enabled, do not +hook reg or mem constraint in inline asm to GPR16. diff --git a/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c new file mode 100644 index 00000000000..21534450045 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c @@ -0,0 +1,107 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mapxf -m64 -march=skylake-avx512 -DDTYPE32" } */ + +typedef unsigned int u32; +typedef unsigned long long u64; + +#ifdef DTYPE32 +typedef u32 DTYPE; +#define byteswap byteswapu32 +#endif + +#define R(x,n) ( (x >> n) | (x << (32 - n))) + +#define S0(x) (R(x, 2) ^ R(x,13) ^ R(x,22)) +#define S1(x) (R(x, 6) ^ R(x,11) ^ R(x,25)) + +#define TT(a,b,c,d,e,f,g,h,x,K) \ +{ \ + tmp1 = h + S1(e) + (g ^ (e & (f ^ g))) + K + x; \ + tmp2 = S0(a) + ((a & b) | (c & (a | b))); \ + h = tmp1 + tmp2; \ + d += tmp1; \ +} + +static inline u32 byteswapu32(u32 x) +{ + x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16; + x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8; + return x; +} + +void foo (DTYPE in[16], DTYPE out[8], const DTYPE C[16]) +{ + DTYPE tmp1 = 0, tmp2 = 0, a, b, c, d, e, f, g, h; + DTYPE w0, w1, w2, w3, w4, w5, w6, w7, + w8, w9, w10, w11, w12, w13, w14, w15; + w0 = byteswap(in[0]); + w1 = byteswap(in[1]); + w2 = byteswap(in[2]); + w3 = byteswap(in[3]); + w4 = byteswap(in[4]); + w5 = byteswap(in[5]); + w6 = byteswap(in[6]); + w7 = byteswap(in[7]); + w8 = byteswap(in[8]); + w9 = byteswap(in[9]); + w10 = byteswap(in[10]); + w11 = byteswap(in[11]); + w12 = byteswap(in[12]); + w13 = byteswap(in[13]); + w14 = byteswap(in[14]); + w15 = byteswap(in[15]); + a = out[0]; + b = out[1]; + c = out[2]; + d = out[3]; + e = out[4]; + f = out[5]; + g = out[6]; + h = out[7]; + + TT(a, b, c, d, e, f, g, h, w0, C[0]); + TT(h, a, b, c, d, e, f, g, w1, C[1]); + TT(g, h, a, b, c, d, e, f, w2, C[2]); + TT(f, g, h, a, b, c, d, e, w3, C[3]); + TT(e, f, g, h, a, b, c, d, w4, C[4]); + TT(d, e, f, g, h, a, b, c, w5, C[5]); + TT(c, d, e, f, g, h, a, b, w6, C[6]); + TT(b, c, d, e, f, g, h, a, w7, C[7]); + TT(a, b, c, d, e, f, g, h, w8, C[8]); + TT(h, a, b, c, d, e, f, g, w9, C[9]); + TT(g, h, a, b, c, d, e, f, w10, C[10]); + TT(f, g, h, a, b, c, d, e, w11, C[11]); + TT(e, f, g, h, a, b, c, d, w12, C[12]); + TT(d, e, f, g, h, a, b, c, w13, C[13]); + TT(c, d, e, f, g, h, a, b, w14, C[14]); + TT(b, c, d, e, f, g, h, a, w15, C[15]); + + out[0] += a; + out[1] += b; + out[2] += c; + out[3] += d; + out[4] += e; + out[5] += f; + out[6] += g; + out[7] += h; + + __asm__ __volatile__ ("test_asm_xmm %0, %%rax" : : "Yr" (out[7]) : "rax"); + __asm__ __volatile__ ("test_asm_Brr %0, %%rax" : : "Brr" (w14) : "rbx"); + __asm__ __volatile__ ("test_asm_rBr %0, %%rax" : : "rBr" (w13) : "rbx"); + __asm__ __volatile__ ("test_asm_r %0, %%rax" : : "r" (w15) : "rbx"); + __asm__ __volatile__ ("test_asm_m %0, %%rax" : : "m" (out[0]) : "rbx"); + __asm__ __volatile__ ("test_asm_mem %0, %%rax" : : "memory" (out[1]) : "rbx"); +} + +/* { dg-final { scan-assembler-not "knot" } } */ +/* { dg-final { scan-assembler-not "kxor" } } */ +/* { dg-final { scan-assembler-not "kor" } } */ +/* { dg-final { scan-assembler-not "kandn" } } */ +/* { dg-final { scan-assembler-times "test_asm_xmm %xmm5, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_Brr %r15d, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_rBr %r14d, %rax" 1 } } */ +/* { dg-final { scan-assembler-times "test_asm_r %r13d, %rax" 1 } } */ +/* { dg-final { scan-assembler-not "test_asm_rBr %r31d, %rax" } } */ +/* { dg-final { scan-assembler-not "test_asm_r %r30d, %rax" } } */ +/* { dg-final { scan-assembler-not "test_asm_m \\(%r29d\\), %rax" } } */ +/* { dg-final { scan-assembler-not "test_asm_mem \\(%r28d\\), %rax" } } */
From: Kong Lingling <lingling.kong@intel.com> In inline asm, we do not know if the insn can use EGPR, so disable EGPR usage by default from mapping the common reg/mem constraint to non-EGPR constraints. Use a flag mapx-inline-asm-use-gpr32 to enable EGPR usage for inline asm. gcc/ChangeLog: * config/i386/i386.cc (INCLUDE_STRING): Add include for ix86_md_asm_adjust. (ix86_md_asm_adjust): When APX EGPR enabled without specifying the target option, map reg/mem constraints to non-EGPR constraints. * config/i386/i386.opt: Add option mapx-inline-asm-use-gpr32. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-inline-gpr-norex2.c: New test. --- gcc/config/i386/i386.cc | 44 +++++++ gcc/config/i386/i386.opt | 5 + .../gcc.target/i386/apx-inline-gpr-norex2.c | 107 ++++++++++++++++++ 3 files changed, 156 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/apx-inline-gpr-norex2.c