Message ID | 20220630055907.50030-1-haochen.jiang@intel.com |
---|---|
State | New |
Headers | show |
Series | i386: Extend cvtps2pd to memory | expand |
On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com> wrote: > > Hi all, > > This patch aims to fix the cvtps2pd insn, which should also work on > memory operand but currently does not. After this fix, when loop == 2, > it will eliminate movq instruction. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > BRs, > Haochen > > gcc/ChangeLog: > > PR target/43618 > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2. > > gcc/testsuite/ChangeLog: > > PR target/43618 > * gcc.target/i386/pr43618-1.c: New test. This patch could be as simple as: diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 8cd0f617bf3..c331445cb2d 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -9195,7 +9195,7 @@ (define_insn "extendv2sfv2df2" [(set (match_operand:V2DF 0 "register_operand" "=v") (float_extend:V2DF - (match_operand:V2SF 1 "register_operand" "v")))] + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] "TARGET_MMX_WITH_SSE" "%vcvtps2pd\t{%1, %0|%0, %1}" [(set_attr "type" "ssecvt") Uros.
> -----Original Message----- > From: Uros Bizjak <ubizjak@gmail.com> > Sent: Thursday, June 30, 2022 2:20 PM > To: Jiang, Haochen <haochen.jiang@intel.com> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com> > wrote: > > > > Hi all, > > > > This patch aims to fix the cvtps2pd insn, which should also work on > > memory operand but currently does not. After this fix, when loop == 2, > > it will eliminate movq instruction. > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > > > BRs, > > Haochen > > > > gcc/ChangeLog: > > > > PR target/43618 > > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > > (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2. > > > > gcc/testsuite/ChangeLog: > > > > PR target/43618 > > * gcc.target/i386/pr43618-1.c: New test. > > This patch could be as simple as: > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index > 8cd0f617bf3..c331445cb2d 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -9195,7 +9195,7 @@ > (define_insn "extendv2sfv2df2" > [(set (match_operand:V2DF 0 "register_operand" "=v") > (float_extend:V2DF > - (match_operand:V2SF 1 "register_operand" "v")))] > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] > "TARGET_MMX_WITH_SSE" > "%vcvtps2pd\t{%1, %0|%0, %1}" > [(set_attr "type" "ssecvt") We also tested on this version, it is ok. The reason why the patch looks like this is because in the previous insn sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand actually does not match the actual instruction. Memory operand is V2SF, not V4SF. Therefore, we changed the constraint in that insn. Then it caused another issue. For memory operand, it seems that we cannot generate those mask instructions. So I change the pattern to how extendv2hfv2df2 works. Haochen > Uros.
On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com> wrote: > > > -----Original Message----- > > From: Uros Bizjak <ubizjak@gmail.com> > > Sent: Thursday, June 30, 2022 2:20 PM > > To: Jiang, Haochen <haochen.jiang@intel.com> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com> > > wrote: > > > > > > Hi all, > > > > > > This patch aims to fix the cvtps2pd insn, which should also work on > > > memory operand but currently does not. After this fix, when loop == 2, > > > it will eliminate movq instruction. > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > > > > > BRs, > > > Haochen > > > > > > gcc/ChangeLog: > > > > > > PR target/43618 > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > > > (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2. > > > > > > gcc/testsuite/ChangeLog: > > > > > > PR target/43618 > > > * gcc.target/i386/pr43618-1.c: New test. > > > > This patch could be as simple as: > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index > > 8cd0f617bf3..c331445cb2d 100644 > > --- a/gcc/config/i386/sse.md > > +++ b/gcc/config/i386/sse.md > > @@ -9195,7 +9195,7 @@ > > (define_insn "extendv2sfv2df2" > > [(set (match_operand:V2DF 0 "register_operand" "=v") > > (float_extend:V2DF > > - (match_operand:V2SF 1 "register_operand" "v")))] > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] > > "TARGET_MMX_WITH_SSE" > > "%vcvtps2pd\t{%1, %0|%0, %1}" > > [(set_attr "type" "ssecvt") > > We also tested on this version, it is ok. > > The reason why the patch looks like this is because in the previous insn > sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand > actually does not match the actual instruction. Memory operand is V2SF, > not V4SF. > > Therefore, we changed the constraint in that insn. Then it caused another issue. > For memory operand, it seems that we cannot generate those mask instructions. > So I change the pattern to how extendv2hfv2df2 works. If you want to change the memory access in sse2_cvtps2pd<mask_name>, then please see how e.g. <insn>v2hiv2di is handled in sse.md. In addition to two instructions, you will need one define_insn_and_split with a pre-reload splitter. Uros,
On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com> wrote: > > > > > -----Original Message----- > > > From: Uros Bizjak <ubizjak@gmail.com> > > > Sent: Thursday, June 30, 2022 2:20 PM > > > To: Jiang, Haochen <haochen.jiang@intel.com> > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com> > > > wrote: > > > > > > > > Hi all, > > > > > > > > This patch aims to fix the cvtps2pd insn, which should also work on > > > > memory operand but currently does not. After this fix, when loop == 2, > > > > it will eliminate movq instruction. > > > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > > > > > > > BRs, > > > > Haochen > > > > > > > > gcc/ChangeLog: > > > > > > > > PR target/43618 > > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > > > > (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2. Rename FROM ... Please also mention change to sse2_cvtps2pd<mask_name>. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > PR target/43618 > > > > * gcc.target/i386/pr43618-1.c: New test. > > > > > > This patch could be as simple as: > > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index > > > 8cd0f617bf3..c331445cb2d 100644 > > > --- a/gcc/config/i386/sse.md > > > +++ b/gcc/config/i386/sse.md > > > @@ -9195,7 +9195,7 @@ > > > (define_insn "extendv2sfv2df2" > > > [(set (match_operand:V2DF 0 "register_operand" "=v") > > > (float_extend:V2DF > > > - (match_operand:V2SF 1 "register_operand" "v")))] > > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] > > > "TARGET_MMX_WITH_SSE" > > > "%vcvtps2pd\t{%1, %0|%0, %1}" > > > [(set_attr "type" "ssecvt") > > > > We also tested on this version, it is ok. > > > > The reason why the patch looks like this is because in the previous insn > > sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand > > actually does not match the actual instruction. Memory operand is V2SF, > > not V4SF. > > > > Therefore, we changed the constraint in that insn. Then it caused another issue. > > For memory operand, it seems that we cannot generate those mask instructions. > > So I change the pattern to how extendv2hfv2df2 works. > > If you want to change the memory access in sse2_cvtps2pd<mask_name>, > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In > addition to two instructions, you will need one define_insn_and_split > with a pre-reload splitter. Oh, nowadays combine does vec_select from a paradoxical subreg on its own. +(define_expand "extendv2sfv2df2" + [(set (match_operand:V2DF 0 "register_operand") + (float_extend:V2DF + (match_operand:V2SF 1 "nonimmediate_operand")))] + "TARGET_MMX_WITH_SSE" +{ + if (!MEM_P (operands[1])) + { You will need force reg here: rtx op1 = force_reg (V2SFmode, operands[1]); + operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode); + emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1])); + DONE; + } +}) -(define_insn "extendv2sfv2df2" +(define_insn "sse2_cvtps2pd_load<mask_name>" Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note the star at the beginning, You don't have to make the name public. OK with the above changes. Thanks, Uros,
On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com> wrote: > > > > > > > -----Original Message----- > > > > From: Uros Bizjak <ubizjak@gmail.com> > > > > Sent: Thursday, June 30, 2022 2:20 PM > > > > To: Jiang, Haochen <haochen.jiang@intel.com> > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang <haochen.jiang@intel.com> > > > > wrote: > > > > > > > > > > Hi all, > > > > > > > > > > This patch aims to fix the cvtps2pd insn, which should also work on > > > > > memory operand but currently does not. After this fix, when loop == 2, > > > > > it will eliminate movq instruction. > > > > > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > > > > > > > > > BRs, > > > > > Haochen > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > PR target/43618 > > > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > > > > > (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2. > > Rename FROM ... > > Please also mention change to sse2_cvtps2pd<mask_name>. > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > PR target/43618 > > > > > * gcc.target/i386/pr43618-1.c: New test. > > > > > > > > This patch could be as simple as: > > > > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index > > > > 8cd0f617bf3..c331445cb2d 100644 > > > > --- a/gcc/config/i386/sse.md > > > > +++ b/gcc/config/i386/sse.md > > > > @@ -9195,7 +9195,7 @@ > > > > (define_insn "extendv2sfv2df2" > > > > [(set (match_operand:V2DF 0 "register_operand" "=v") > > > > (float_extend:V2DF > > > > - (match_operand:V2SF 1 "register_operand" "v")))] > > > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] > > > > "TARGET_MMX_WITH_SSE" > > > > "%vcvtps2pd\t{%1, %0|%0, %1}" > > > > [(set_attr "type" "ssecvt") > > > > > > We also tested on this version, it is ok. > > > > > > The reason why the patch looks like this is because in the previous insn > > > sse2_cvtps2pd<mask_name>, the constraint vm and vector_operand > > > actually does not match the actual instruction. Memory operand is V2SF, > > > not V4SF. > > > > > > Therefore, we changed the constraint in that insn. Then it caused another issue. > > > For memory operand, it seems that we cannot generate those mask instructions. > > > So I change the pattern to how extendv2hfv2df2 works. > > > > If you want to change the memory access in sse2_cvtps2pd<mask_name>, > > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In > > addition to two instructions, you will need one define_insn_and_split > > with a pre-reload splitter. > > Oh, nowadays combine does vec_select from a paradoxical subreg on its own. > > +(define_expand "extendv2sfv2df2" > + [(set (match_operand:V2DF 0 "register_operand") > + (float_extend:V2DF > + (match_operand:V2SF 1 "nonimmediate_operand")))] > + "TARGET_MMX_WITH_SSE" > +{ > + if (!MEM_P (operands[1])) > + { > > You will need force reg here: > > rtx op1 = force_reg (V2SFmode, operands[1]); > + operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode); > + emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1])); > + DONE; > + } > +}) > > > -(define_insn "extendv2sfv2df2" > +(define_insn "sse2_cvtps2pd_load<mask_name>" > > Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note the > star at the beginning, You don't have to make the name public. > > OK with the above changes. Forgot to mention: - (match_operand:V2SF 1 "register_operand" "v")))] - "TARGET_MMX_WITH_SSE" - "%vcvtps2pd\t{%1, %0|%0, %1}" + (match_operand:V2SF 1 "memory_operand" "m")))] + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>" + "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper and2>, %q1}" [(set_attr "type" "ssecvt") The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so we can use TARGET_SSE2 here. Which opens the question if the expander could also be TARGET_SSE2 only. There are no MMX registers involved in any of the patterns anymore. Uros. > > Thanks, > Uros,
> -----Original Message----- > From: Uros Bizjak <ubizjak@gmail.com> > Sent: Thursday, June 30, 2022 4:53 PM > To: Jiang, Haochen <haochen.jiang@intel.com> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > > > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen <haochen.jiang@intel.com> > wrote: > > > > > > > > > -----Original Message----- > > > > > From: Uros Bizjak <ubizjak@gmail.com> > > > > > Sent: Thursday, June 30, 2022 2:20 PM > > > > > To: Jiang, Haochen <haochen.jiang@intel.com> > > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao > > > > > <hongtao.liu@intel.com> > > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > > > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang > > > > > <haochen.jiang@intel.com> > > > > > wrote: > > > > > > > > > > > > Hi all, > > > > > > > > > > > > This patch aims to fix the cvtps2pd insn, which should also > > > > > > work on memory operand but currently does not. After this fix, > > > > > > when loop == 2, it will eliminate movq instruction. > > > > > > > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > > > > > > > > > > > BRs, > > > > > > Haochen > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > PR target/43618 > > > > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > > > > > > (sse2_cvtps2pd_load<mask_name>): Rename extendvsdfv2df2. > > > > Rename FROM ... > > > > Please also mention change to sse2_cvtps2pd<mask_name>. > > > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > PR target/43618 > > > > > > * gcc.target/i386/pr43618-1.c: New test. > > > > > > > > > > This patch could be as simple as: > > > > > > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > > > > > index 8cd0f617bf3..c331445cb2d 100644 > > > > > --- a/gcc/config/i386/sse.md > > > > > +++ b/gcc/config/i386/sse.md > > > > > @@ -9195,7 +9195,7 @@ > > > > > (define_insn "extendv2sfv2df2" > > > > > [(set (match_operand:V2DF 0 "register_operand" "=v") > > > > > (float_extend:V2DF > > > > > - (match_operand:V2SF 1 "register_operand" "v")))] > > > > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] > > > > > "TARGET_MMX_WITH_SSE" > > > > > "%vcvtps2pd\t{%1, %0|%0, %1}" > > > > > [(set_attr "type" "ssecvt") > > > > > > > > We also tested on this version, it is ok. > > > > > > > > The reason why the patch looks like this is because in the > > > > previous insn sse2_cvtps2pd<mask_name>, the constraint vm and > > > > vector_operand actually does not match the actual instruction. > > > > Memory operand is V2SF, not V4SF. > > > > > > > > Therefore, we changed the constraint in that insn. Then it caused another > issue. > > > > For memory operand, it seems that we cannot generate those mask > instructions. > > > > So I change the pattern to how extendv2hfv2df2 works. > > > > > > If you want to change the memory access in sse2_cvtps2pd<mask_name>, > > > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In > > > addition to two instructions, you will need one > > > define_insn_and_split with a pre-reload splitter. > > > > Oh, nowadays combine does vec_select from a paradoxical subreg on its own. > > > > +(define_expand "extendv2sfv2df2" > > + [(set (match_operand:V2DF 0 "register_operand") > > + (float_extend:V2DF > > + (match_operand:V2SF 1 "nonimmediate_operand")))] > > + "TARGET_MMX_WITH_SSE" > > +{ > > + if (!MEM_P (operands[1])) > > + { > > > > You will need force reg here: > > > > rtx op1 = force_reg (V2SFmode, operands[1]); > > + operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode); > > + emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1])); > > + DONE; > > + } > > +}) > > > > > > -(define_insn "extendv2sfv2df2" > > +(define_insn "sse2_cvtps2pd_load<mask_name>" > > > > Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note the > > star at the beginning, You don't have to make the name public. > > > > OK with the above changes. > > Forgot to mention: > > > - (match_operand:V2SF 1 "register_operand" "v")))] > - "TARGET_MMX_WITH_SSE" > - "%vcvtps2pd\t{%1, %0|%0, %1}" > + (match_operand:V2SF 1 "memory_operand" "m")))] > + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>" > + "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper > and2>, %q1}" > [(set_attr "type" "ssecvt") > > The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so we > can use TARGET_SSE2 here. > > Which opens the question if the expander could also be TARGET_SSE2 only. > There are no MMX registers involved in any of the patterns anymore. Yes. > > Uros. > > > > Thanks, > > Uros,
Hi all, I revised my patch according to all your reviews. Regtested on x86_64-pc-linux-gnu. BRs, Haochen > -----Original Message----- > From: Liu, Hongtao <hongtao.liu@intel.com> > Sent: Thursday, June 30, 2022 4:57 PM > To: Uros Bizjak <ubizjak@gmail.com>; Jiang, Haochen > <haochen.jiang@intel.com> > Cc: gcc-patches@gcc.gnu.org > Subject: RE: [PATCH] i386: Extend cvtps2pd to memory > > > > > -----Original Message----- > > From: Uros Bizjak <ubizjak@gmail.com> > > Sent: Thursday, June 30, 2022 4:53 PM > > To: Jiang, Haochen <haochen.jiang@intel.com> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen > <haochen.jiang@intel.com> > > wrote: > > > > > > > > > > > -----Original Message----- > > > > > > From: Uros Bizjak <ubizjak@gmail.com> > > > > > > Sent: Thursday, June 30, 2022 2:20 PM > > > > > > To: Jiang, Haochen <haochen.jiang@intel.com> > > > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao > > > > > > <hongtao.liu@intel.com> > > > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > > > > > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang > > > > > > <haochen.jiang@intel.com> > > > > > > wrote: > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > This patch aims to fix the cvtps2pd insn, which should also > > > > > > > work on memory operand but currently does not. After this fix, > > > > > > > when loop == 2, it will eliminate movq instruction. > > > > > > > > > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > > > > > > > > > > > > > BRs, > > > > > > > Haochen > > > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > > > PR target/43618 > > > > > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > > > > > > > (sse2_cvtps2pd_load<mask_name>): Rename > extendvsdfv2df2. > > > > > > Rename FROM ... > > > > > > Please also mention change to sse2_cvtps2pd<mask_name>. > > > > > > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > > > PR target/43618 > > > > > > > * gcc.target/i386/pr43618-1.c: New test. > > > > > > > > > > > > This patch could be as simple as: > > > > > > > > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > > > > > > index 8cd0f617bf3..c331445cb2d 100644 > > > > > > --- a/gcc/config/i386/sse.md > > > > > > +++ b/gcc/config/i386/sse.md > > > > > > @@ -9195,7 +9195,7 @@ > > > > > > (define_insn "extendv2sfv2df2" > > > > > > [(set (match_operand:V2DF 0 "register_operand" "=v") > > > > > > (float_extend:V2DF > > > > > > - (match_operand:V2SF 1 "register_operand" "v")))] > > > > > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] > > > > > > "TARGET_MMX_WITH_SSE" > > > > > > "%vcvtps2pd\t{%1, %0|%0, %1}" > > > > > > [(set_attr "type" "ssecvt") > > > > > > > > > > We also tested on this version, it is ok. > > > > > > > > > > The reason why the patch looks like this is because in the > > > > > previous insn sse2_cvtps2pd<mask_name>, the constraint vm and > > > > > vector_operand actually does not match the actual instruction. > > > > > Memory operand is V2SF, not V4SF. > > > > > > > > > > Therefore, we changed the constraint in that insn. Then it caused > another > > issue. > > > > > For memory operand, it seems that we cannot generate those mask > > instructions. > > > > > So I change the pattern to how extendv2hfv2df2 works. > > > > > > > > If you want to change the memory access in > sse2_cvtps2pd<mask_name>, > > > > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In > > > > addition to two instructions, you will need one > > > > define_insn_and_split with a pre-reload splitter. > > > > > > Oh, nowadays combine does vec_select from a paradoxical subreg on its > own. > > > > > > +(define_expand "extendv2sfv2df2" > > > + [(set (match_operand:V2DF 0 "register_operand") > > > + (float_extend:V2DF > > > + (match_operand:V2SF 1 "nonimmediate_operand")))] > > > + "TARGET_MMX_WITH_SSE" > > > +{ > > > + if (!MEM_P (operands[1])) > > > + { > > > > > > You will need force reg here: > > > > > > rtx op1 = force_reg (V2SFmode, operands[1]); > > > + operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode); > > > + emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1])); > > > + DONE; > > > + } > > > +}) > > > > > > > > > -(define_insn "extendv2sfv2df2" > > > +(define_insn "sse2_cvtps2pd_load<mask_name>" > > > > > > Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note > the > > > star at the beginning, You don't have to make the name public. > > > > > > OK with the above changes. > > > > Forgot to mention: > > > > > > - (match_operand:V2SF 1 "register_operand" "v")))] > > - "TARGET_MMX_WITH_SSE" > > - "%vcvtps2pd\t{%1, %0|%0, %1}" > > + (match_operand:V2SF 1 "memory_operand" "m")))] > > + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>" > > + "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper > > and2>, %q1}" > > [(set_attr "type" "ssecvt") > > > > The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so > we > > can use TARGET_SSE2 here. > > > > Which opens the question if the expander could also be TARGET_SSE2 only. > > There are no MMX registers involved in any of the patterns anymore. > Yes. > > > > Uros. > > > > > > Thanks, > > > Uros,
On Mon, Jul 4, 2022 at 7:10 AM Jiang, Haochen <haochen.jiang@intel.com> wrote: > > Hi all, > > I revised my patch according to all your reviews. > > Regtested on x86_64-pc-linux-gnu. OK. Thanks, Uros. > > BRs, > Haochen > > > -----Original Message----- > > From: Liu, Hongtao <hongtao.liu@intel.com> > > Sent: Thursday, June 30, 2022 4:57 PM > > To: Uros Bizjak <ubizjak@gmail.com>; Jiang, Haochen > > <haochen.jiang@intel.com> > > Cc: gcc-patches@gcc.gnu.org > > Subject: RE: [PATCH] i386: Extend cvtps2pd to memory > > > > > > > > > -----Original Message----- > > > From: Uros Bizjak <ubizjak@gmail.com> > > > Sent: Thursday, June 30, 2022 4:53 PM > > > To: Jiang, Haochen <haochen.jiang@intel.com> > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com> > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > > > On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen > > <haochen.jiang@intel.com> > > > wrote: > > > > > > > > > > > > > -----Original Message----- > > > > > > > From: Uros Bizjak <ubizjak@gmail.com> > > > > > > > Sent: Thursday, June 30, 2022 2:20 PM > > > > > > > To: Jiang, Haochen <haochen.jiang@intel.com> > > > > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao > > > > > > > <hongtao.liu@intel.com> > > > > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory > > > > > > > > > > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang > > > > > > > <haochen.jiang@intel.com> > > > > > > > wrote: > > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > This patch aims to fix the cvtps2pd insn, which should also > > > > > > > > work on memory operand but currently does not. After this fix, > > > > > > > > when loop == 2, it will eliminate movq instruction. > > > > > > > > > > > > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk? > > > > > > > > > > > > > > > > BRs, > > > > > > > > Haochen > > > > > > > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > > > > > > > PR target/43618 > > > > > > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand. > > > > > > > > (sse2_cvtps2pd_load<mask_name>): Rename > > extendvsdfv2df2. > > > > > > > > Rename FROM ... > > > > > > > > Please also mention change to sse2_cvtps2pd<mask_name>. > > > > > > > > > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > > > > > > > PR target/43618 > > > > > > > > * gcc.target/i386/pr43618-1.c: New test. > > > > > > > > > > > > > > This patch could be as simple as: > > > > > > > > > > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > > > > > > > index 8cd0f617bf3..c331445cb2d 100644 > > > > > > > --- a/gcc/config/i386/sse.md > > > > > > > +++ b/gcc/config/i386/sse.md > > > > > > > @@ -9195,7 +9195,7 @@ > > > > > > > (define_insn "extendv2sfv2df2" > > > > > > > [(set (match_operand:V2DF 0 "register_operand" "=v") > > > > > > > (float_extend:V2DF > > > > > > > - (match_operand:V2SF 1 "register_operand" "v")))] > > > > > > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))] > > > > > > > "TARGET_MMX_WITH_SSE" > > > > > > > "%vcvtps2pd\t{%1, %0|%0, %1}" > > > > > > > [(set_attr "type" "ssecvt") > > > > > > > > > > > > We also tested on this version, it is ok. > > > > > > > > > > > > The reason why the patch looks like this is because in the > > > > > > previous insn sse2_cvtps2pd<mask_name>, the constraint vm and > > > > > > vector_operand actually does not match the actual instruction. > > > > > > Memory operand is V2SF, not V4SF. > > > > > > > > > > > > Therefore, we changed the constraint in that insn. Then it caused > > another > > > issue. > > > > > > For memory operand, it seems that we cannot generate those mask > > > instructions. > > > > > > So I change the pattern to how extendv2hfv2df2 works. > > > > > > > > > > If you want to change the memory access in > > sse2_cvtps2pd<mask_name>, > > > > > then please see how e.g. <insn>v2hiv2di is handled in sse.md. In > > > > > addition to two instructions, you will need one > > > > > define_insn_and_split with a pre-reload splitter. > > > > > > > > Oh, nowadays combine does vec_select from a paradoxical subreg on its > > own. > > > > > > > > +(define_expand "extendv2sfv2df2" > > > > + [(set (match_operand:V2DF 0 "register_operand") > > > > + (float_extend:V2DF > > > > + (match_operand:V2SF 1 "nonimmediate_operand")))] > > > > + "TARGET_MMX_WITH_SSE" > > > > +{ > > > > + if (!MEM_P (operands[1])) > > > > + { > > > > > > > > You will need force reg here: > > > > > > > > rtx op1 = force_reg (V2SFmode, operands[1]); > > > > + operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode); > > > > + emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1])); > > > > + DONE; > > > > + } > > > > +}) > > > > > > > > > > > > -(define_insn "extendv2sfv2df2" > > > > +(define_insn "sse2_cvtps2pd_load<mask_name>" > > > > > > > > Please name this insn "*sse2_cvtps2pd<mask_name>_1". Please note > > the > > > > star at the beginning, You don't have to make the name public. > > > > > > > > OK with the above changes. > > > > > > Forgot to mention: > > > > > > > > > - (match_operand:V2SF 1 "register_operand" "v")))] > > > - "TARGET_MMX_WITH_SSE" > > > - "%vcvtps2pd\t{%1, %0|%0, %1}" > > > + (match_operand:V2SF 1 "memory_operand" "m")))] > > > + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>" > > > + "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_oper > > > and2>, %q1}" > > > [(set_attr "type" "ssecvt") > > > > > > The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so > > we > > > can use TARGET_SSE2 here. > > > > > > Which opens the question if the expander could also be TARGET_SSE2 only. > > > There are no MMX registers involved in any of the patterns anymore. > > Yes. > > > > > > Uros. > > > > > > > > Thanks, > > > > Uros,
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 8b2602bfa79..f96bb3dc6c3 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -9175,11 +9175,25 @@ (set_attr "prefix" "evex") (set_attr "mode" "<sseinsnmode>")]) +(define_expand "extendv2sfv2df2" + [(set (match_operand:V2DF 0 "register_operand") + (float_extend:V2DF + (match_operand:V2SF 1 "nonimmediate_operand")))] + "TARGET_MMX_WITH_SSE" +{ + if (!MEM_P (operands[1])) + { + operands[1] = lowpart_subreg (V4SFmode, operands[1], V2SFmode); + emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1])); + DONE; + } +}) + (define_insn "sse2_cvtps2pd<mask_name>" [(set (match_operand:V2DF 0 "register_operand" "=v") (float_extend:V2DF (vec_select:V2SF - (match_operand:V4SF 1 "vector_operand" "vm") + (match_operand:V4SF 1 "register_operand" "v") (parallel [(const_int 0) (const_int 1)]))))] "TARGET_SSE2 && <mask_avx512vl_condition>" "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}" @@ -9191,12 +9205,12 @@ (set_attr "prefix" "maybe_vex") (set_attr "mode" "V2DF")]) -(define_insn "extendv2sfv2df2" +(define_insn "sse2_cvtps2pd_load<mask_name>" [(set (match_operand:V2DF 0 "register_operand" "=v") (float_extend:V2DF - (match_operand:V2SF 1 "register_operand" "v")))] - "TARGET_MMX_WITH_SSE" - "%vcvtps2pd\t{%1, %0|%0, %1}" + (match_operand:V2SF 1 "memory_operand" "m")))] + "TARGET_MMX_WITH_SSE && <mask_avx512vl_condition>" + "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}" [(set_attr "type" "ssecvt") (set_attr "amdfam10_decode" "direct") (set_attr "athlon_decode" "double") diff --git a/gcc/testsuite/gcc.target/i386/pr43618-1.c b/gcc/testsuite/gcc.target/i386/pr43618-1.c new file mode 100644 index 00000000000..3c84ea444aa --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr43618-1.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "movq" } } */ +/* { dg-final { scan-assembler "cvtps2pd" } } */ + +void +foo (float a[2], double b[2]) +{ + int i; + for (i = 0; i < 2; i++) + b[i] = a[i]; +} +