Message ID | 20100622175837.GA3734@intel.com |
---|---|
State | New |
On Tue, 2010-06-22 at 10:58 -0700, H.J. Lu wrote:
> Hi,
>
> AVX cast from 256bit to 128bit and extract lower 128bit from 256 bit
> are the same operation. This patch replaces AVX cast with lower 128bit
> extraction. It also uses define and split for lower 128bit extractions.
> Tested on Linux/x86-64. OK for trunk?
>
> Thanks.
>
>
> H.J.
> ---
> 2010-06-22  H.J. Lu  <hongjiu.lu@intel.com>
>
>       * config/i386/i386.c (bdesc_args): Replace CODE_FOR_avx_si_si256,
>       CODE_FOR_avx_ps_ps256 and CODE_FOR_avx_pd_pd256 with
>       CODE_FOR_vec_extract_lo_v8si, CODE_FOR_vec_extract_lo_v8sf
>       and CODE_FOR_vec_extract_lo_v4df.
>
>       * config/i386/sse.md (vec_extract_lo_<mode>:AVX256MODE4P):

If we want to specify pattern names with this approach, then please use
"vec_extract_lo_<AVX256MODE4P:mode>", as this is correct mode iterator
syntax.

>       Changed to define_insn_and_split.
>       (vec_extract_lo_<mode>:AVX256MODE8P): Likewise.
>       (vec_extract_lo_v16hi): Likewise.
>       (vec_extract_lo_v32qi): Likewise.
>       (avx_<avxmodesuffixp><avxmodesuffix>_<avxmodesuffixp>): Likewise.
>       (avx_<avxmodesuffixp>_<avxmodesuffixp><avxmodesuffix>): Removed.
>
> -(define_insn "vec_extract_lo_<mode>"
> +(define_insn_and_split "vec_extract_lo_<mode>"
>    [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
>         (vec_select:<avxhalfvecmode>
> -         (match_operand:AVX256MODE4P 1 "register_operand" "x,x")
> +         (match_operand:AVX256MODE4P 1 "nonimmediate_operand" "xm,x")
>           (parallel [(const_int 0) (const_int 1)])))]
>    "TARGET_AVX"
> -  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
> -  [(set_attr "type" "sselog")
> -   (set_attr "prefix_extra" "1")
> -   (set_attr "length_immediate" "1")
> -   (set_attr "memory" "none,store")
> -   (set_attr "prefix" "vex")
> -   (set_attr "mode" "V8SF")])
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  rtx op1 = operands[1];
> +  if (REG_P (op1))
> +    op1 = gen_rtx_REG (<avxhalfvecmode>mode, REGNO (op1));
> +  else
> +    op1 = gen_lowpart (<avxhalfvecmode>mode, op1);
> +  emit_move_insn (operands[0], op1);
> +  DONE;
> +})

Hm, can't gen_lowpart handle register conversion directly?

Uros.
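The patch description rests on the claim that the 256-bit to 128-bit cast and the lower-128-bit extraction are the same operation. A minimal sketch of that claim at the intrinsics level follows. It is an illustration only, not part of the patch: the function names `cast_matches_extract_lo` and `self_check` are hypothetical, and the scalar fallback is an assumption added so the file also builds without `-mavx`.

```c
#include <assert.h>
#include <string.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Hypothetical helper: returns 1 when the cast and the lane-0 extract
   yield the same four floats.  */
int cast_matches_extract_lo(void)
{
    float in[8] = {0.0f, 1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f};
    float via_cast[4];
    float via_extract[4];
#ifdef __AVX__
    /* Real AVX path, used only when the compiler was given -mavx.  */
    __m256 v = _mm256_loadu_ps(in);
    _mm_storeu_ps(via_cast, _mm256_castps256_ps128(v));
    _mm_storeu_ps(via_extract, _mm256_extractf128_ps(v, 0));
#else
    /* Scalar stand-in (an assumption of this sketch): the cast views the
       low half in place, the extract copies elements 0..3.  */
    memcpy(via_cast, in, sizeof via_cast);
    for (int i = 0; i < 4; i++)
        via_extract[i] = in[i];
#endif
    return memcmp(via_cast, via_extract, sizeof via_cast) == 0;
}

/* Hypothetical usage: aborts if the two forms ever disagree.  */
void self_check(void)
{
    assert(cast_matches_extract_lo());
}
```

Built with `-mavx`, the sketch exercises the two intrinsics directly; without it, only the scalar model runs, so it demonstrates the intended semantics rather than the generated instructions.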
On Tue, Jun 22, 2010 at 11:27 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, 2010-06-22 at 10:58 -0700, H.J. Lu wrote:
>> Hi,
>>
>> AVX cast from 256bit to 128bit and extract lower 128bit from 256 bit
>> are the same operation. This patch replaces AVX cast with lower 128bit
>> extraction. It also uses define and split for lower 128bit extractions.
>> Tested on Linux/x86-64. OK for trunk?
>>
>> Thanks.
>>
>>
>> H.J.
>> ---
>> 2010-06-22  H.J. Lu  <hongjiu.lu@intel.com>
>>
>>       * config/i386/i386.c (bdesc_args): Replace CODE_FOR_avx_si_si256,
>>       CODE_FOR_avx_ps_ps256 and CODE_FOR_avx_pd_pd256 with
>>       CODE_FOR_vec_extract_lo_v8si, CODE_FOR_vec_extract_lo_v8sf
>>       and CODE_FOR_vec_extract_lo_v4df.
>>
>>       * config/i386/sse.md (vec_extract_lo_<mode>:AVX256MODE4P):
>
> If we want to specify pattern names with this approach, then please use
> "vec_extract_lo_<AVX256MODE4P:mode>", as this is correct mode iterator
> syntax.

I will make the change.

>>       Changed to define_insn_and_split.
>>       (vec_extract_lo_<mode>:AVX256MODE8P): Likewise.
>>       (vec_extract_lo_v16hi): Likewise.
>>       (vec_extract_lo_v32qi): Likewise.
>>       (avx_<avxmodesuffixp><avxmodesuffix>_<avxmodesuffixp>): Likewise.
>>       (avx_<avxmodesuffixp>_<avxmodesuffixp><avxmodesuffix>): Removed.
>>
>
>> -(define_insn "vec_extract_lo_<mode>"
>> +(define_insn_and_split "vec_extract_lo_<mode>"
>>    [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
>>         (vec_select:<avxhalfvecmode>
>> -         (match_operand:AVX256MODE4P 1 "register_operand" "x,x")
>> +         (match_operand:AVX256MODE4P 1 "nonimmediate_operand" "xm,x")
>>           (parallel [(const_int 0) (const_int 1)])))]
>>    "TARGET_AVX"
>> -  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
>> -  [(set_attr "type" "sselog")
>> -   (set_attr "prefix_extra" "1")
>> -   (set_attr "length_immediate" "1")
>> -   (set_attr "memory" "none,store")
>> -   (set_attr "prefix" "vex")
>> -   (set_attr "mode" "V8SF")])
>> +  "#"
>> +  "&& reload_completed"
>> +  [(const_int 0)]
>> +{
>> +  rtx op1 = operands[1];
>> +  if (REG_P (op1))
>> +    op1 = gen_rtx_REG (<avxhalfvecmode>mode, REGNO (op1));
>> +  else
>> +    op1 = gen_lowpart (<avxhalfvecmode>mode, op1);
>> +  emit_move_insn (operands[0], op1);
>> +  DONE;
>> +})
>
> Hm, can't gen_lowpart handle register conversion directly?
>

That is how it is done in other places in sse.md.
gen_lowpart may generate SUBREG, which we don't
want on registers.
On Tue, 2010-06-22 at 11:33 -0700, H.J. Lu wrote:
> >> -(define_insn "vec_extract_lo_<mode>"
> >> +(define_insn_and_split "vec_extract_lo_<mode>"
> >>    [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
> >>         (vec_select:<avxhalfvecmode>
> >> -         (match_operand:AVX256MODE4P 1 "register_operand" "x,x")
> >> +         (match_operand:AVX256MODE4P 1 "nonimmediate_operand" "xm,x")
> >>           (parallel [(const_int 0) (const_int 1)])))]
> >>    "TARGET_AVX"
> >> -  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
> >> -  [(set_attr "type" "sselog")
> >> -   (set_attr "prefix_extra" "1")
> >> -   (set_attr "length_immediate" "1")
> >> -   (set_attr "memory" "none,store")
> >> -   (set_attr "prefix" "vex")
> >> -   (set_attr "mode" "V8SF")])
> >> +  "#"
> >> +  "&& reload_completed"
> >> +  [(const_int 0)]
> >> +{
> >> +  rtx op1 = operands[1];
> >> +  if (REG_P (op1))
> >> +    op1 = gen_rtx_REG (<avxhalfvecmode>mode, REGNO (op1));
> >> +  else
> >> +    op1 = gen_lowpart (<avxhalfvecmode>mode, op1);
> >> +  emit_move_insn (operands[0], op1);
> >> +  DONE;
> >> +})
> >
> > Hm, can't gen_lowpart handle register conversion directly?
> >
>
> That is how it is done in other places in sse.md.
> gen_lowpart may generate SUBREG, which we don't
> want on registers.

This is post-reload splitter, so IIRC, gen_lowpart will just change mode
of the hard register. But, I'm not totally sure...

Uros.
On Tue, Jun 22, 2010 at 11:47 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, 2010-06-22 at 11:33 -0700, H.J. Lu wrote:
>
>> >> -(define_insn "vec_extract_lo_<mode>"
>> >> +(define_insn_and_split "vec_extract_lo_<mode>"
>> >>    [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
>> >>         (vec_select:<avxhalfvecmode>
>> >> -         (match_operand:AVX256MODE4P 1 "register_operand" "x,x")
>> >> +         (match_operand:AVX256MODE4P 1 "nonimmediate_operand" "xm,x")
>> >>           (parallel [(const_int 0) (const_int 1)])))]
>> >>    "TARGET_AVX"
>> >> -  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
>> >> -  [(set_attr "type" "sselog")
>> >> -   (set_attr "prefix_extra" "1")
>> >> -   (set_attr "length_immediate" "1")
>> >> -   (set_attr "memory" "none,store")
>> >> -   (set_attr "prefix" "vex")
>> >> -   (set_attr "mode" "V8SF")])
>> >> +  "#"
>> >> +  "&& reload_completed"
>> >> +  [(const_int 0)]
>> >> +{
>> >> +  rtx op1 = operands[1];
>> >> +  if (REG_P (op1))
>> >> +    op1 = gen_rtx_REG (<avxhalfvecmode>mode, REGNO (op1));
>> >> +  else
>> >> +    op1 = gen_lowpart (<avxhalfvecmode>mode, op1);
>> >> +  emit_move_insn (operands[0], op1);
>> >> +  DONE;
>> >> +})
>> >
>> > Hm, can't gen_lowpart handle register conversion directly?
>> >
>>
>> That is how it is done in other places in sse.md.
>> gen_lowpart may generate SUBREG, which we don't
>> want on registers.
>
> This is post-reload splitter, so IIRC, gen_lowpart will just change mode
> of the hard register. But, I'm not totally sure...
>

It doesn't work.
I got

Starting program: /export/build/gnu/gcc/build-x86_64-linux/prev-gcc/cc1 -fpreprocessed x.i -quiet -dumpbase x.i -mavx -mtune=generic -march=x86-64 -auxbase x -O2 -version -o x.s
GNU C (GCC) version 4.6.0 20100622 (experimental) (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.4.4 20100503 (Red Hat 4.4.4-2), GMP version 4.3.1, MPFR version 2.4.2-p3, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.6.0 20100622 (experimental) (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.4.4 20100503 (Red Hat 4.4.4-2), GMP version 4.3.1, MPFR version 2.4.2-p3, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: e1890fe66351dde4fa9a333ace671390

Breakpoint 1, fancy_abort (
    file=0x3054618 "/export/gnu/import/git/gcc/gcc/emit-rtl.c", line=890,
    function=0x30555d9 "gen_reg_rtx")
    at /export/gnu/import/git/gcc/gcc/diagnostic.c:879
879       internal_error ("in %s, at %s:%d", function, trim_filename (file), line);
(gdb) bt
#0  fancy_abort (file=0x3054618 "/export/gnu/import/git/gcc/gcc/emit-rtl.c",
    line=890, function=0x30555d9 "gen_reg_rtx")
    at /export/gnu/import/git/gcc/gcc/diagnostic.c:879
#1  0x0000000000ab22b0 in gen_reg_rtx (mode=V4DFmode)
    at /export/gnu/import/git/gcc/gcc/emit-rtl.c:890
#2  0x0000000000afd9c5 in copy_to_reg (x=0x7ffff179df40)
    at /export/gnu/import/git/gcc/gcc/explow.c:601
#3  0x000000000135345f in gen_lowpart_general (mode=V2DFmode, x=0x7ffff179df40)
    at /export/gnu/import/git/gcc/gcc/rtlhooks.c:50
#4  0x00000000026d8ba3 in gen_split_3050 (curr_insn=0x7ffff1646318, operands=0x43dce80)
    at /export/gnu/import/git/gcc/gcc/config/i386/sse.md:4180
#5  0x000000000282a0d1 in split_5 (x0=0x7ffff164f048, insn=0x7ffff1646318)
    at /export/gnu/import/git/gcc/gcc/config/i386/sse.md:4179
#6  0x000000000283089e in split_insns (x0=0x7ffff164f048, insn=0x7ffff1646318)
    at /export/gnu/import/git/gcc/gcc/config/i386/sse.md:4610
#7  0x0000000000ab9e9d in try_split (pat=0x7ffff164f048, trial=0x7ffff1646318, last=1)
    at /export/gnu/import/git/gcc/gcc/emit-rtl.c:3431
#8  0x000000000122faf3 in split_insn (insn=0x7ffff1646318)
    at /export/gnu/import/git/gcc/gcc/recog.c:2747
#9  0x000000000122fef8 in split_all_insns ()
    at /export/gnu/import/git/gcc/gcc/recog.c:2836
#10 0x0000000001231821 in rest_of_handle_split_after_reload ()
    at /export/gnu/import/git/gcc/gcc/recog.c:3562
#11 0x00000000011348e3 in execute_one_pass (pass=0x43664a0)
    at /export/gnu/import/git/gcc/gcc/passes.c:1576
#12 0x0000000001134acc in execute_pass_list (pass=0x43664a0)
    at /export/gnu/import/git/gcc/gcc/passes.c:1631
#13 0x0000000001134aed in execute_pass_list (pass=0x4366060)
    at /export/gnu/import/git/gcc/gcc/passes.c:1632
#14 0x0000000001134aed in execute_pass_list (pass=0x4366000)
    at /export/gnu/import/git/gcc/gcc/passes.c:1632
#15 0x0000000001810013 in tree_rest_of_compilation (fndecl=0x7ffff15b3600)
    at /export/gnu/import/git/gcc/gcc/tree-optimize.c:420
#16 0x000000000236b006 in cgraph_expand_function (node=0x7ffff15b6158)
    at /export/gnu/import/git/gcc/gcc/cgraphunit.c:1632
#17 0x000000000236b2ca in cgraph_expand_all_functions ()
    at /export/gnu/import/git/gcc/gcc/cgraphunit.c:1711
#18 0x000000000236b8f2 in cgraph_optimize ()
    at /export/gnu/import/git/gcc/gcc/cgraphunit.c:1967
#19 0x00000000023692e3 in cgraph_finalize_compilation_unit ()
    at /export/gnu/import/git/gcc/gcc/cgraphunit.c:1171
#20 0x00000000004e6bea in c_write_global_declarations ()
    at /export/gnu/import/git/gcc/gcc/c-decl.c:9698
#21 0x00000000014fe1b5 in compile_file ()
    at /export/gnu/import/git/gcc/gcc/toplev.c:997
#22 0x00000000015003f7 in do_compile ()
    at /export/gnu/import/git/gcc/gcc/toplev.c:2342
#23 0x00000000015004c5 in toplev_main (argc=15, argv=0x7fffffffe318)
    at /export/gnu/import/git/gcc/gcc/toplev.c:2383
#24 0x00000000006ec6de in main (argc=15, argv=0x7fffffffe318)
    at /export/gnu/import/git/gcc/gcc/main.c:35
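The backtrace shows `gen_lowpart_general` reaching `copy_to_reg` and then `gen_reg_rtx` from a splitter that runs after reload, where the compiler aborts. The toy model below sketches one reading of that failure: a lowpart helper whose fallback copies the value into a fresh pseudo register cannot work once reload has completed, whereas re-labelling the same hard register in the narrower mode needs no new register. Everything here is hypothetical and prefixed `toy_`; none of it is GCC code, and the rule that pseudos cannot be created after reload is an assumption inferred from the trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for machine modes and registers (not GCC's types).  */
enum toy_mode { TOY_V2DF, TOY_V4DF };

struct toy_reg {
    enum toy_mode mode;
    unsigned regno;
};

/* Models the compiler state flag tested by the splitter condition.  */
bool toy_reload_completed = false;

/* Assumed rule: fresh pseudo registers may only be made before reload.
   Returns false where the real compiler would abort.  */
bool toy_new_pseudo(enum toy_mode mode, struct toy_reg *out)
{
    static unsigned next_pseudo = 100;
    if (toy_reload_completed)
        return false;
    out->mode = mode;
    out->regno = next_pseudo++;
    return true;
}

/* Path A: a lowpart helper that falls back to copying the source into
   a new pseudo of the source mode, as the trace suggests.  */
bool toy_lowpart_via_copy(struct toy_reg src, enum toy_mode mode,
                          struct toy_reg *out)
{
    struct toy_reg tmp;
    if (!toy_new_pseudo(src.mode, &tmp))
        return false;
    out->mode = mode;
    out->regno = tmp.regno;
    return true;
}

/* Path B: what the patch does for registers, modelled as keeping the
   register number and changing only the mode.  */
struct toy_reg toy_same_reg_new_mode(struct toy_reg src, enum toy_mode mode)
{
    struct toy_reg r = { mode, src.regno };
    return r;
}

/* Hypothetical usage: after reload, path A fails and path B succeeds.  */
void toy_demo(void)
{
    struct toy_reg ymm = { TOY_V4DF, 21 };
    struct toy_reg out;
    toy_reload_completed = true;
    assert(!toy_lowpart_via_copy(ymm, TOY_V2DF, &out));
    assert(toy_same_reg_new_mode(ymm, TOY_V2DF).regno == ymm.regno);
}
```

The model says nothing about why the real `gen_lowpart` took the copy path for this operand; it only illustrates why that path is unusable in a post-reload splitter.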
On Tue, Jun 22, 2010 at 9:12 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>> >> -(define_insn "vec_extract_lo_<mode>"
>>> >> +(define_insn_and_split "vec_extract_lo_<mode>"
>>> >>    [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
>>> >>         (vec_select:<avxhalfvecmode>
>>> >> -         (match_operand:AVX256MODE4P 1 "register_operand" "x,x")
>>> >> +         (match_operand:AVX256MODE4P 1 "nonimmediate_operand" "xm,x")
>>> >>           (parallel [(const_int 0) (const_int 1)])))]
>>> >>    "TARGET_AVX"
>>> >> -  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
>>> >> -  [(set_attr "type" "sselog")
>>> >> -   (set_attr "prefix_extra" "1")
>>> >> -   (set_attr "length_immediate" "1")
>>> >> -   (set_attr "memory" "none,store")
>>> >> -   (set_attr "prefix" "vex")
>>> >> -   (set_attr "mode" "V8SF")])
>>> >> +  "#"
>>> >> +  "&& reload_completed"
>>> >> +  [(const_int 0)]
>>> >> +{
>>> >> +  rtx op1 = operands[1];
>>> >> +  if (REG_P (op1))
>>> >> +    op1 = gen_rtx_REG (<avxhalfvecmode>mode, REGNO (op1));
>>> >> +  else
>>> >> +    op1 = gen_lowpart (<avxhalfvecmode>mode, op1);
>>> >> +  emit_move_insn (operands[0], op1);
>>> >> +  DONE;
>>> >> +})
>>> >
>>> > Hm, can't gen_lowpart handle register conversion directly?
>>> >
>>>
>>> That is how it is done in other places in sse.md.
>>> gen_lowpart may generate SUBREG, which we don't
>>> want on registers.
>>
>> This is post-reload splitter, so IIRC, gen_lowpart will just change mode
>> of the hard register. But, I'm not totally sure...
>>
>
> It doesn't work. I got

Hm, OK then.

Thanks,
Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 657e55a..268be3b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22427,9 +22427,9 @@ static const struct builtin_description bdesc_args[] =
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_si256_si, "__builtin_ia32_si256_si", IX86_BUILTIN_SI256_SI, UNKNOWN, (int) V8SI_FTYPE_V4SI },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_ps256_ps, "__builtin_ia32_ps256_ps", IX86_BUILTIN_PS256_PS, UNKNOWN, (int) V8SF_FTYPE_V4SF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_pd256_pd, "__builtin_ia32_pd256_pd", IX86_BUILTIN_PD256_PD, UNKNOWN, (int) V4DF_FTYPE_V2DF },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_si_si256, "__builtin_ia32_si_si256", IX86_BUILTIN_SI_SI256, UNKNOWN, (int) V4SI_FTYPE_V8SI },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_ps_ps256, "__builtin_ia32_ps_ps256", IX86_BUILTIN_PS_PS256, UNKNOWN, (int) V4SF_FTYPE_V8SF },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_pd_pd256, "__builtin_ia32_pd_pd256", IX86_BUILTIN_PD_PD256, UNKNOWN, (int) V2DF_FTYPE_V4DF },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v8si, "__builtin_ia32_si_si256", IX86_BUILTIN_SI_SI256, UNKNOWN, (int) V4SI_FTYPE_V8SI },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v8sf, "__builtin_ia32_ps_ps256", IX86_BUILTIN_PS_PS256, UNKNOWN, (int) V4SF_FTYPE_V8SF },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v4df, "__builtin_ia32_pd_pd256", IX86_BUILTIN_PD_PD256, UNKNOWN, (int) V2DF_FTYPE_V4DF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, "__builtin_ia32_vtestzpd", IX86_BUILTIN_VTESTZPD, EQ, (int) INT_FTYPE_V2DF_V2DF_PTEST },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, "__builtin_ia32_vtestcpd", IX86_BUILTIN_VTESTCPD, LTU, (int) INT_FTYPE_V2DF_V2DF_PTEST },
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 7625906..ed22675 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4178,19 +4178,24 @@
   DONE;
 })

-(define_insn "vec_extract_lo_<mode>"
+(define_insn_and_split "vec_extract_lo_<mode>"
   [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
        (vec_select:<avxhalfvecmode>
-         (match_operand:AVX256MODE4P 1 "register_operand" "x,x")
+         (match_operand:AVX256MODE4P 1 "nonimmediate_operand" "xm,x")
          (parallel [(const_int 0) (const_int 1)])))]
   "TARGET_AVX"
-  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
-  [(set_attr "type" "sselog")
-   (set_attr "prefix_extra" "1")
-   (set_attr "length_immediate" "1")
-   (set_attr "memory" "none,store")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "V8SF")])
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx op1 = operands[1];
+  if (REG_P (op1))
+    op1 = gen_rtx_REG (<avxhalfvecmode>mode, REGNO (op1));
+  else
+    op1 = gen_lowpart (<avxhalfvecmode>mode, op1);
+  emit_move_insn (operands[0], op1);
+  DONE;
+})

 (define_insn "vec_extract_hi_<mode>"
   [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
@@ -4206,20 +4211,25 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "V8SF")])

-(define_insn "vec_extract_lo_<mode>"
+(define_insn_and_split "vec_extract_lo_<mode>"
   [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
        (vec_select:<avxhalfvecmode>
-         (match_operand:AVX256MODE8P 1 "register_operand" "x,x")
+         (match_operand:AVX256MODE8P 1 "nonimmediate_operand" "xm,x")
          (parallel [(const_int 0) (const_int 1)
                     (const_int 2) (const_int 3)])))]
   "TARGET_AVX"
-  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
-  [(set_attr "type" "sselog")
-   (set_attr "prefix_extra" "1")
-   (set_attr "length_immediate" "1")
-   (set_attr "memory" "none,store")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "V8SF")])
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx op1 = operands[1];
+  if (REG_P (op1))
+    op1 = gen_rtx_REG (<avxhalfvecmode>mode, REGNO (op1));
+  else
+    op1 = gen_lowpart (<avxhalfvecmode>mode, op1);
+  emit_move_insn (operands[0], op1);
+  DONE;
+})

 (define_insn "vec_extract_hi_<mode>"
   [(set (match_operand:<avxhalfvecmode> 0 "nonimmediate_operand" "=x,m")
@@ -4236,22 +4246,27 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "V8SF")])

-(define_insn "vec_extract_lo_v16hi"
+(define_insn_and_split "vec_extract_lo_v16hi"
   [(set (match_operand:V8HI 0 "nonimmediate_operand" "=x,m")
        (vec_select:V8HI
-         (match_operand:V16HI 1 "register_operand" "x,x")
+         (match_operand:V16HI 1 "nonimmediate_operand" "xm,x")
          (parallel [(const_int 0) (const_int 1)
                     (const_int 2) (const_int 3)
                     (const_int 4) (const_int 5)
                     (const_int 6) (const_int 7)])))]
   "TARGET_AVX"
-  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
-  [(set_attr "type" "sselog")
-   (set_attr "prefix_extra" "1")
-   (set_attr "length_immediate" "1")
-   (set_attr "memory" "none,store")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "V8SF")])
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx op1 = operands[1];
+  if (REG_P (op1))
+    op1 = gen_rtx_REG (V8HImode, REGNO (op1));
+  else
+    op1 = gen_lowpart (V8HImode, op1);
+  emit_move_insn (operands[0], op1);
+  DONE;
+})

 (define_insn "vec_extract_hi_v16hi"
   [(set (match_operand:V8HI 0 "nonimmediate_operand" "=x,m")
@@ -4270,10 +4285,10 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "V8SF")])

-(define_insn "vec_extract_lo_v32qi"
+(define_insn_and_split "vec_extract_lo_v32qi"
   [(set (match_operand:V16QI 0 "nonimmediate_operand" "=x,m")
        (vec_select:V16QI
-         (match_operand:V32QI 1 "register_operand" "x,x")
+         (match_operand:V32QI 1 "nonimmediate_operand" "xm,x")
          (parallel [(const_int 0) (const_int 1)
                     (const_int 2) (const_int 3)
                     (const_int 4) (const_int 5)
@@ -4283,13 +4298,18 @@
                     (const_int 12) (const_int 13)
                     (const_int 14) (const_int 15)])))]
   "TARGET_AVX"
-  "vextractf128\t{$0x0, %1, %0|%0, %1, 0x0}"
-  [(set_attr "type" "sselog")
-   (set_attr "prefix_extra" "1")
-   (set_attr "length_immediate" "1")
-   (set_attr "memory" "none,store")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "V8SF")])
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx op1 = operands[1];
+  if (REG_P (op1))
+    op1 = gen_rtx_REG (V16QImode, REGNO (op1));
+  else
+    op1 = gen_lowpart (V16QImode, op1);
+  emit_move_insn (operands[0], op1);
+  DONE;
+})

 (define_insn "vec_extract_hi_v32qi"
   [(set (match_operand:V16QI 0 "nonimmediate_operand" "=x,m")
@@ -12252,77 +12272,24 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "<MODE>")])

-(define_insn "avx_<avxmodesuffixp><avxmodesuffix>_<avxmodesuffixp>"
-  [(set (match_operand:AVX256MODE2P 0 "register_operand" "=x,x")
+(define_insn_and_split "avx_<avxmodesuffixp><avxmodesuffix>_<avxmodesuffixp>"
+  [(set (match_operand:AVX256MODE2P 0 "nonimmediate_operand" "=x,m")
        (unspec:AVX256MODE2P
-         [(match_operand:<avxhalfvecmode> 1 "nonimmediate_operand" "0,xm")]
-         UNSPEC_CAST))]
-  "TARGET_AVX"
-{
-  switch (which_alternative)
-    {
-    case 0:
-      return "";
-    case 1:
-      switch (get_attr_mode (insn))
-        {
-        case MODE_V8SF:
-          return "vmovaps\t{%1, %x0|%x0, %1}";
-        case MODE_V4DF:
-          return "vmovapd\t{%1, %x0|%x0, %1}";
-        case MODE_OI:
-          return "vmovdqa\t{%1, %x0|%x0, %1}";
-        default:
-          break;
-        }
-    default:
-      break;
-    }
-  gcc_unreachable ();
-}
-  [(set_attr "type" "ssemov")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "<avxvecmode>")
-   (set (attr "length")
-    (if_then_else (eq_attr "alternative" "0")
-       (const_string "0")
-       (const_string "*")))])
-
-(define_insn "avx_<avxmodesuffixp>_<avxmodesuffixp><avxmodesuffix>"
-  [(set (match_operand:<avxhalfvecmode> 0 "register_operand" "=x,x")
-       (unspec:<avxhalfvecmode>
-         [(match_operand:AVX256MODE2P 1 "nonimmediate_operand" "0,xm")]
+         [(match_operand:<avxhalfvecmode> 1 "nonimmediate_operand" "xm,x")]
          UNSPEC_CAST))]
   "TARGET_AVX"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
 {
-  switch (which_alternative)
-    {
-    case 0:
-      return "";
-    case 1:
-      switch (get_attr_mode (insn))
-        {
-        case MODE_V8SF:
-          return "vmovaps\t{%x1, %0|%0, %x1}";
-        case MODE_V4DF:
-          return "vmovapd\t{%x1, %0|%0, %x1}";
-        case MODE_OI:
-          return "vmovdqa\t{%x1, %0|%0, %x1}";
-        default:
-          break;
-        }
-    default:
-      break;
-    }
-  gcc_unreachable ();
-}
-  [(set_attr "type" "ssemov")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "<avxvecmode>")
-   (set (attr "length")
-    (if_then_else (eq_attr "alternative" "0")
-       (const_string "0")
-       (const_string "*")))])
+  rtx op1 = operands[1];
+  if (REG_P (op1))
+    op1 = gen_rtx_REG (<MODE>mode, REGNO (op1));
+  else
+    op1 = gen_lowpart (<MODE>mode, op1);
+  emit_move_insn (operands[0], op1);
+  DONE;
+})

 (define_expand "vec_init<mode>"
   [(match_operand:AVX256MODE 0 "register_operand" "")