Message ID | 01e0bcef-10a0-561b-0227-4700ebe22692@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [PATCH-1,v2,rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453] | expand |
Hi! On Tue, Jun 07, 2022 at 04:07:58PM +0800, HAO CHEN GUI wrote: > This patch replaces shift and ior insns with one rotate and mask > insn for the split patterns which are for DI byte swap on Power6. The > test cases shows the optimization. Nice :-) > - emit_insn (gen_ashldi3 (op3, op3, GEN_INT (32))); > - emit_insn (gen_iordi3 (dest, dest, op3)); > + emit_insn (gen_rotldi3_insert_3 (dest, op3, GEN_INT (32), dest, > + GEN_INT (0xffffffff))); You could make some define_expand to make this easier to use. But not sure what to call it. The goal would be to make this easier to read and use, not to make it harder :-) Something with duplicate-si-to-di or such? Is that pattern somewhere else already, maybe vectors, maybe some other target even? > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile { target lp64 } } */ > +/* { dg-options "-mdejagnu-cpu=power6 -O2" } */ It doesn't require -m64, only -mpowerpc64. You can use has_arch_ppc64 to test for the latter. Okay for trunk, even without that improvement. Thanks! Segher
On Tue, Jun 07, 2022 at 04:07:58PM +0800, HAO CHEN GUI wrote: > Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. > Is this okay for trunk? Any recommendations? Thanks a lot. One further thing: please make sure you have tested on -m32 as well, especially for integer stuff like this, it is easy to accidentally get the conditions on a define_* and one of its uses out of synch. But it looks to be okay in this case :-) Segher
Hi, On 7/6/2022 下午 11:59, Segher Boessenkool wrote: >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c >> @@ -0,0 +1,14 @@ >> +/* { dg-do compile { target lp64 } } */ >> +/* { dg-options "-mdejagnu-cpu=power6 -O2" } */ > It doesn't require -m64, only -mpowerpc64. You can use has_arch_ppc64 > to test for the latter. Tested it with 'target has_arch_ppc64', it works on both -m32 and -m64. Thanks. Gui Haochen
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index bf85baa5370..83800df12aa 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -2828,8 +2828,8 @@ (define_split emit_insn (gen_bswapsi2 (dest_32, word2)); } - emit_insn (gen_ashldi3 (op3, op3, GEN_INT (32))); - emit_insn (gen_iordi3 (dest, dest, op3)); + emit_insn (gen_rotldi3_insert_3 (dest, op3, GEN_INT (32), dest, + GEN_INT (0xffffffff))); DONE; }) @@ -2914,10 +2914,10 @@ (define_split rtx op3_si = simplify_gen_subreg (SImode, op3, DImode, lo_off); emit_insn (gen_lshrdi3 (op2, src, GEN_INT (32))); - emit_insn (gen_bswapsi2 (dest_si, src_si)); - emit_insn (gen_bswapsi2 (op3_si, op2_si)); - emit_insn (gen_ashldi3 (dest, dest, GEN_INT (32))); - emit_insn (gen_iordi3 (dest, dest, op3)); + emit_insn (gen_bswapsi2 (op3_si, src_si)); + emit_insn (gen_bswapsi2 (dest_si, op2_si)); + emit_insn (gen_rotldi3_insert_3 (dest, op3, GEN_INT (32), dest, + GEN_INT (0xffffffff))); DONE; }) diff --git a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c new file mode 100644 index 00000000000..4271886561f --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c @@ -0,0 +1,14 @@ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-mdejagnu-cpu=power6 -O2" } */ + +unsigned long load_byte_reverse (unsigned long *in) +{ + return __builtin_bswap64 (*in); +} + +unsigned long byte_reverse (unsigned long in) +{ + return __builtin_bswap64 (in); +} + +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */