From: "Maciej W. Rozycki"
Date: Mon, 7 Jul 2014 12:40:57 +0100
Subject: [PATCH] Power/GCC: Implement little-endian SPE operations
Delivered-To: mailing list gcc-patches@gcc.gnu.org
X-Patchwork-Id: 367500

Hi,

 This change implements little-endian code generation for Signal
Processing Engine (SPE) operations.

 Where possible, changes are handled within the existing patterns, with
suitable conditionals added to support the little-endian mode.

 In some cases operand constraints differ between the two endiannesses
where a numerical entity is accessed in memory with a partial data
transfer. In these cases new patterns have been added and the existing
patterns renamed to reflect the two endiannesses handled.

 Finally, the paired-integer vector permute intrinsics do not correspond
to the same high-level operations in the two endiannesses and have
therefore been reimplemented with new expander patterns. The reason is
that number pairs in vectors are placed in memory in the same order
regardless of the endianness selected -- the first number occupies the
lower-addressed unit and the second number takes the higher-addressed
unit. When transferred into a register with a doubleword vector load
operation they appear in the register word-swapped between endiannesses.

 These intrinsics turned out not to be properly covered by the
testsuite: a mistake made in the process of implementing the new
expanders went through unnoticed, as only compilation-time checks are
made and no run-time ones. Therefore a new test case has been added
that covers the intrinsics and that scores no failures with or without
the changes made to GCC with this patch.

 The existing patterns that used to handle these intrinsics, and that
can also be pulled implicitly by the optimiser, have been renamed to
reflect the individual vector permutation operations they implement and
extended to handle the little endianness too.

 This change removes several hundred failures seen in powerpc-eabi GCC,
G++, libstdc++ and also GDB testing for the:

-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe -mlittle

multilib and does not change results for the following powerpc-eabi
multilibs:

-mcpu=603e
-mcpu=603e -msoft-float
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe -msoft-float
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe -msoft-float
-mcpu=7400 -maltivec -mabi=altivec

as well as the following powerpc-linux-gnu multilibs:

-mcpu=603e
-mcpu=603e -msoft-float
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe
-mcpu=7400 -maltivec -mabi=altivec
-mcpu=e5500 -m64

 OK to apply?

2014-07-07 Maciej W. Rozycki

	gcc/
	* config/rs6000/rs6000.c (output_vec_const_move): Handle
	little-endian code generation.
	* config/rs6000/spe.md (spe_evmergehi): Rename to...
	(vec_perm00_v2si): ... this. Handle little-endian code generation.
	(spe_evmergehilo): Rename to...
	(vec_perm01_v2si): ... this. Handle little-endian code generation.
	(spe_evmergelo): Rename to...
	(vec_perm11_v2si): ... this. Handle little-endian code generation.
	(spe_evmergelohi): Rename to...
	(vec_perm10_v2si): ... this. Handle little-endian code generation.
	(spe_evmergehi, spe_evmergehilo): New expanders.
	(spe_evmergelo, spe_evmergelohi): Likewise.
	(*frob__): Handle little-endian code generation.
	(*frob_tf_ti): Likewise.
	(*frob__di_2): Likewise.
	(*frob_tf_di_8_2): Likewise.
	(*frob_di_): Likewise.
	(*frob_ti_tf): Likewise.
	(*frob___2): Likewise.
	(*frob_ti__8_2): Likewise.
	(*frob_ti_tf_2): Likewise.
	(mov_si_e500_subreg0): Rename to...
	(mov_si_e500_subreg0_be): ... this. Restrict to the big
	endianness only.
	(*mov_si_e500_subreg0_le): New instruction pattern.
	(*mov_si_e500_subreg0_elf_low): Rename to...
	(*mov_si_e500_subreg0_elf_low_be): ... this. Restrict to the big
	endianness only.
	(*mov_si_e500_subreg0_elf_low_le): New instruction pattern.
	(*mov_si_e500_subreg0_2): Rename to...
	(*mov_si_e500_subreg0_2_be): ... this. Restrict to the big
	endianness only.
	(*mov_si_e500_subreg0_2_le): New instruction pattern.
	(*mov_si_e500_subreg4): Rename to...
	(*mov_si_e500_subreg4_be): ... this. Restrict to the big
	endianness only.
	(mov_si_e500_subreg4_le): New instruction pattern.
	(*mov_si_e500_subreg4_elf_low): Rename to...
	(*mov_si_e500_subreg4_elf_low_be): ... this. Restrict to the big
	endianness only.
	(*mov_si_e500_subreg4_elf_low_le): New instruction/splitter
	pattern.
	(*mov_si_e500_subreg4_2): Rename to...
	(*mov_si_e500_subreg4_2_be): ... this. Restrict to the big
	endianness only.
	(*mov_si_e500_subreg4_2_le): New instruction pattern.
	(*mov_sitf_e500_subreg8): Rename to...
	(*mov_sitf_e500_subreg8_be): ... this. Restrict to the big
	endianness only.
	(*mov_sitf_e500_subreg8_le): New instruction pattern.
	(*mov_sitf_e500_subreg8_2): Rename to...
	(*mov_sitf_e500_subreg8_2_be): ... this. Restrict to the big
	endianness only.
	(*mov_sitf_e500_subreg8_2_le): New instruction pattern.
	(*mov_sitf_e500_subreg12): Rename to...
	(*mov_sitf_e500_subreg12_be): ... this. Restrict to the big
	endianness only.
	(*mov_sitf_e500_subreg12_le): New instruction pattern.
	(*mov_sitf_e500_subreg12_2): Rename to...
	(*mov_sitf_e500_subreg12_2_be): ... this.
	Restrict to the big endianness only.
	(*mov_sitf_e500_subreg12_2_le): New instruction pattern.

	gcc/testsuite/
	* gcc.target/powerpc/spe-evmerge.c: New file.

  Maciej

gcc-ppc-spe-le.diff
Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/rs6000.c
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/rs6000.c	2014-06-11 16:35:08.917560846 +0100
+++ gcc-fsf-trunk-quilt/gcc/config/rs6000/rs6000.c	2014-06-11 16:35:25.917851800 +0100
@@ -5299,8 +5299,10 @@ output_vec_const_move (rtx *operands)
   operands[2] = CONST_VECTOR_ELT (vec, 1);
   if (cst == cst2)
     return "li %0,%1\n\tevmergelo %0,%0,%0";
-  else
+  else if (WORDS_BIG_ENDIAN)
     return "li %0,%1\n\tevmergelo %0,%0,%0\n\tli %0,%2";
+  else
+    return "li %0,%2\n\tevmergelo %0,%0,%0\n\tli %0,%1";
 }
 
 /* Initialize TARGET of vector PAIRED to VALS. */
Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/spe.md	2014-05-16 16:01:20.197526085 +0100
+++ gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md	2014-06-11 16:35:25.917851800 +0100
@@ -438,7 +438,7 @@
   [(set_attr "type" "vecload")
    (set_attr "length" "4")])
 
-(define_insn "spe_evmergehi"
+(define_insn "vec_perm00_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
         (vec_select:V2SI
          (vec_concat:V4SI
@@ -446,11 +446,16 @@
           (match_operand:V2SI 2 "gpc_reg_operand" "r"))
          (parallel [(const_int 0) (const_int 2)])))]
   "TARGET_SPE"
-  "evmergehi %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%2";
+  else
+    return "evmergelo %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr "length" "4")])
 
-(define_insn "spe_evmergehilo"
+(define_insn "vec_perm01_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
         (vec_select:V2SI
          (vec_concat:V4SI
@@ -458,11 +463,16 @@
           (match_operand:V2SI 2 "gpc_reg_operand" "r"))
          (parallel [(const_int 0) (const_int 3)])))]
   "TARGET_SPE"
-  "evmergehilo %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehilo %0,%1,%2";
+  else
+    return "evmergehilo %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr "length" "4")])
 
-(define_insn "spe_evmergelo"
+(define_insn "vec_perm11_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
         (vec_select:V2SI
          (vec_concat:V4SI
@@ -470,11 +480,16 @@
           (match_operand:V2SI 2 "gpc_reg_operand" "r"))
          (parallel [(const_int 1) (const_int 3)])))]
   "TARGET_SPE"
-  "evmergelo %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergelo %0,%1,%2";
+  else
+    return "evmergehi %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr "length" "4")])
 
-(define_insn "spe_evmergelohi"
+(define_insn "vec_perm10_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
         (vec_select:V2SI
          (vec_concat:V4SI
@@ -482,7 +497,12 @@
           (match_operand:V2SI 2 "gpc_reg_operand" "r"))
          (parallel [(const_int 1) (const_int 2)])))]
   "TARGET_SPE"
-  "evmergelohi %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergelohi %0,%1,%2";
+  else
+    return "evmergelohi %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr "length" "4")])
 
@@ -499,6 +519,58 @@
     FAIL;
 })
 
+(define_expand "spe_evmergehi"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm00_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm11_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "spe_evmergehilo"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm01_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm01_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "spe_evmergelo"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm11_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm00_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "spe_evmergelohi"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm10_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm10_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
 (define_insn "spe_evnand"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
         (not:V2SI (and:V2SI (match_operand:V2SI 1 "gpc_reg_operand" "r")
@@ -2220,15 +2292,31 @@
         (subreg:SPE64 (match_operand:DITI 1 "input_operand" "r,m") 0))]
   "(TARGET_E500_DOUBLE && mode == DFmode)
    || (TARGET_SPE && mode != DFmode)"
-  "@
-   evmergelo %0,%1,%L1
-   evldd%X1 %0,%y1")
+{
+  switch (which_alternative)
+    {
+    default:
+      gcc_unreachable ();
+    case 0:
+      if (WORDS_BIG_ENDIAN)
+        return "evmergelo %0,%1,%L1";
+      else
+        return "evmergelo %0,%L1,%1";
+    case 1:
+      return "evldd%X1 %0,%y1";
+    }
+})
 
 (define_insn "*frob_tf_ti"
   [(set (match_operand:TF 0 "gpc_reg_operand" "=r")
         (subreg:TF (match_operand:TI 1 "gpc_reg_operand" "r") 0))]
   "TARGET_E500_DOUBLE"
-  "evmergelo %0,%1,%L1\;evmergelo %L0,%Y1,%Z1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergelo %0,%1,%L1\;evmergelo %L0,%Y1,%Z1";
+  else
+    return "evmergelo %L0,%Z1,%Y1\;evmergelo %0,%L1,%1";
+}
   [(set_attr "length" "8")])
 
 (define_insn "*frob__di_2"
@@ -2236,31 +2324,63 @@
         (match_operand:DI 1 "input_operand" "r,m"))]
   "(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
    || (TARGET_SPE && mode != DFmode && mode != TFmode)"
-  "@
-   evmergelo %0,%1,%L1
-   evldd%X1 %0,%y1")
+{
+  switch (which_alternative)
+    {
+    default:
+      gcc_unreachable ();
+    case 0:
+      if (WORDS_BIG_ENDIAN)
+        return "evmergelo %0,%1,%L1";
+      else
+        return "evmergelo %0,%L1,%1";
+    case 1:
+      return "evldd%X1 %0,%y1";
+    }
+})
 
 (define_insn "*frob_tf_di_8_2"
   [(set (subreg:DI (match_operand:TF 0 "nonimmediate_operand" "+&r,r") 8)
         (match_operand:DI 1 "input_operand" "r,m"))]
   "TARGET_E500_DOUBLE"
-  "@
-   evmergelo %L0,%1,%L1
-   evldd%X1 %L0,%y1")
+{
+  switch (which_alternative)
+    {
+    default:
+      gcc_unreachable ();
+    case 0:
+      if (WORDS_BIG_ENDIAN)
+        return "evmergelo %L0,%1,%L1";
+      else
+        return "evmergelo %L0,%L1,%1";
+    case 1:
+      return "evldd%X1 %L0,%y1";
+    }
+})
 
 (define_insn "*frob_di_"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=&r")
         (subreg:DI (match_operand:SPE64TF 1 "input_operand" "r") 0))]
   "(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
    || (TARGET_SPE && mode != DFmode && mode != TFmode)"
-  "evmergehi %0,%1,%1\;mr %L0,%1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%1\;mr %L0,%1";
+  else
+    return "evmergehi %L0,%1,%1\;mr %0,%1";
+}
   [(set_attr "length" "8")])
 
 (define_insn "*frob_ti_tf"
   [(set (match_operand:TI 0 "nonimmediate_operand" "=&r")
         (subreg:TI (match_operand:TF 1 "input_operand" "r") 0))]
   "TARGET_E500_DOUBLE"
-  "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1";
+  else
+    return "evmergehi %Z0,%L1,%L1\;mr %Y0,%L1\;evmergehi %L0,%1,%1\;mr %0,%1";
+}
   [(set_attr "length" "16")])
 
 (define_insn "*frob___2"
@@ -2275,22 +2395,40 @@
     default:
       gcc_unreachable ();
     case 0:
-      return \"evmergehi %0,%1,%1\;mr %L0,%1\";
+      if (WORDS_BIG_ENDIAN)
+        return \"evmergehi %0,%1,%1\;mr %L0,%1\";
+      else
+        return \"evmergehi %L0,%1,%1\;mr %0,%1\";
     case 1:
       /* If the address is not offsettable we need to load the whole
          doubleword into a 64-bit register and then copy the high word
          to form the correct output layout. */
       if (!offsettable_nonstrict_memref_p (operands[1]))
-        return \"evldd%X1 %L0,%y1\;evmergehi %0,%L0,%L0\";
+        {
+          if (WORDS_BIG_ENDIAN)
+            return \"evldd%X1 %L0,%y1\;evmergehi %0,%L0,%L0\";
+          else
+            return \"evldd%X1 %0,%y1\;evmergehi %L0,%0,%0\";
+        }
       /* If the low-address word is used in the address, we must load
          it last. Otherwise, load it first. Note that we cannot have
          auto-increment in that case since the address register is
          known to be dead. */
       if (refers_to_regno_p (REGNO (operands[0]), REGNO (operands[0]) + 1,
                              operands[1], 0))
-        return \"lwz %L0,%L1\;lwz %0,%1\";
+        {
+          if (WORDS_BIG_ENDIAN)
+            return \"lwz %L0,%L1\;lwz %0,%1\";
+          else
+            return \"lwz %0,%1\;lwz %L0,%L1\";
+        }
       else
-        return \"lwz%U1%X1 %0,%1\;lwz %L0,%L1\";
+        {
+          if (WORDS_BIG_ENDIAN)
+            return \"lwz%U1%X1 %0,%1\;lwz %L0,%L1\";
+          else
+            return \"lwz%U1%X1 %L0,%L1\;lwz %0,%1\";
+        }
     }
 }"
   [(set_attr "length" "8,8")])
@@ -2308,15 +2446,33 @@
     default:
       gcc_unreachable ();
     case 0:
-      return \"evmergehi %Y0,%1,%1\;mr %Z0,%1\";
+      if (WORDS_BIG_ENDIAN)
+        return \"evmergehi %Y0,%1,%1\;mr %Z0,%1\";
+      else
+        return \"evmergehi %Z0,%1,%1\;mr %Y0,%1\";
     case 1:
       if (!offsettable_nonstrict_memref_p (operands[1]))
-        return \"evldd%X1 %Z0,%y1\;evmergehi %Y0,%Z0,%Z0\";
+        {
+          if (WORDS_BIG_ENDIAN)
+            return \"evldd%X1 %Z0,%y1\;evmergehi %Y0,%Z0,%Z0\";
+          else
+            return \"evldd%X1 %Y0,%y1\;evmergehi %Z0,%Y0,%Y0\";
+        }
       if (refers_to_regno_p (REGNO (operands[0]), REGNO (operands[0]) + 1,
                              operands[1], 0))
-        return \"lwz %Z0,%L1\;lwz %Y0,%1\";
+        {
+          if (WORDS_BIG_ENDIAN)
+            return \"lwz %Z0,%L1\;lwz %Y0,%1\";
+          else
+            return \"lwz %Y0,%1\;lwz %Z0,%L1\";
+        }
       else
-        return \"lwz%U1%X1 %Y0,%1\;lwz %Z0,%L1\";
+        {
+          if (WORDS_BIG_ENDIAN)
+            return \"lwz%U1%X1 %Y0,%1\;lwz %Z0,%L1\";
+          else
+            return \"lwz%U1%X1 %Z0,%L1\;lwz %Y0,%1\";
+        }
     }
 }"
   [(set_attr "length" "8,8")])
@@ -2325,110 +2481,226 @@
   [(set (subreg:TF (match_operand:TI 0 "gpc_reg_operand" "=&r") 0)
         (match_operand:TF 1 "input_operand" "r"))]
   "TARGET_E500_DOUBLE"
-  "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1";
+  else
+    return "evmergehi %Z0,%L1,%L1\;mr %Y0,%L1\;evmergehi %L0,%1,%1\;mr %0,%1";
+}
   [(set_attr "length" "16")])
 
-(define_insn "mov_si_e500_subreg0"
+(define_insn "mov_si_e500_subreg0_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,&r") 0)
         (match_operand:SI 1 "input_operand" "r,m"))]
-  "(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
-   || (TARGET_SPE && mode != DFmode && mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
   "@
    evmergelo %0,%1,%0
    evmergelohi %0,%0,%0\;lwz%U1%X1 %0,%1\;evmergelohi %0,%0,%0"
   [(set_attr "length" "4,12")])
 
-(define_insn_and_split "*mov_si_e500_subreg0_elf_low"
+(define_insn "*mov_si_e500_subreg0_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,r") 0)
+        (match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
+  "@
+   mr %0,%1
+   lwz%U1%X1 %0,%1")
+
+(define_insn_and_split "*mov_si_e500_subreg0_elf_low_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0)
         (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
                    (match_operand 2 "" "")))]
-  "((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
-    || (TARGET_SPE && mode != DFmode && mode != TFmode))
-   && TARGET_ELF && !TARGET_64BIT && can_create_pseudo_p ()"
+  "WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+        || (TARGET_SPE && mode != DFmode && mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT && can_create_pseudo_p ())"
   "#"
   "&& 1"
   [(pc)]
 {
   rtx tmp = gen_reg_rtx (SImode);
   emit_insn (gen_elf_low (tmp, operands[1], operands[2]));
-  emit_insn (gen_mov_si_e500_subreg0 (operands[0], tmp));
+  emit_insn (gen_mov_si_e500_subreg0_be (operands[0], tmp));
   DONE;
 }
   [(set_attr "length" "8")])
 
+(define_insn "*mov_si_e500_subreg0_elf_low_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0)
+        (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+                   (match_operand 2 "" "")))]
+  "!WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+        || (TARGET_SPE && mode != DFmode && mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT)"
+  "addic %0,%1,%K2")
+
 ;; ??? Could use evstwwe for memory stores in some cases, depending on
 ;; the offset.
-(define_insn "*mov_si_e500_subreg0_2"
+(define_insn "*mov_si_e500_subreg0_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
         (subreg:SI (match_operand:SPE64TF 1 "register_operand" "+r,&r") 0))]
-  "(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
-   || (TARGET_SPE && mode != DFmode && mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
   "@
    evmergehi %0,%0,%1
    evmergelohi %1,%1,%1\;stw%U0%X0 %1,%0"
   [(set_attr "length" "4,8")])
 
-(define_insn "*mov_si_e500_subreg4"
+(define_insn "*mov_si_e500_subreg0_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+        (subreg:SI (match_operand:SPE64TF 1 "register_operand" "+r,r") 0))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
+  "@
+   mr %0,%1
+   stw%U0%X0 %1,%0")
+
+(define_insn "*mov_si_e500_subreg4_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,r") 4)
         (match_operand:SI 1 "input_operand" "r,m"))]
-  "(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
-   || (TARGET_SPE && mode != DFmode && mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
   "@
    mr %0,%1
    lwz%U1%X1 %0,%1")
 
-(define_insn "*mov_si_e500_subreg4_elf_low"
+(define_insn "mov_si_e500_subreg4_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,&r") 4)
+        (match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
+  "@
+   evmergelo %0,%1,%0
+   evmergelohi %0,%0,%0\;lwz%U1%X1 %0,%1\;evmergelohi %0,%0,%0"
+  [(set_attr "length" "4,12")])
+
+(define_insn "*mov_si_e500_subreg4_elf_low_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 4)
         (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
                    (match_operand 2 "" "")))]
-  "((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
-    || (TARGET_SPE && mode != DFmode && mode != TFmode))
-   && TARGET_ELF && !TARGET_64BIT"
+  "WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+        || (TARGET_SPE && mode != DFmode && mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT)"
   "addic %0,%1,%K2")
 
-(define_insn "*mov_si_e500_subreg4_2"
+(define_insn_and_split "*mov_si_e500_subreg4_elf_low_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 4)
+        (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+                   (match_operand 2 "" "")))]
+  "!WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+        || (TARGET_SPE && mode != DFmode && mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT && can_create_pseudo_p ())"
+  "#"
+  "&& 1"
+  [(pc)]
+{
+  rtx tmp = gen_reg_rtx (SImode);
+  emit_insn (gen_elf_low (tmp, operands[1], operands[2]));
+  emit_insn (gen_mov_si_e500_subreg4_le (operands[0], tmp));
+  DONE;
+}
+  [(set_attr "length" "8")])
+
+(define_insn "*mov_si_e500_subreg4_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
         (subreg:SI (match_operand:SPE64TF 1 "register_operand" "r,r") 4))]
-  "(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
-   || (TARGET_SPE && mode != DFmode && mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
   "@
    mr %0,%1
    stw%U0%X0 %1,%0")
 
-(define_insn "*mov_sitf_e500_subreg8"
+(define_insn "*mov_si_e500_subreg4_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+        (subreg:SI (match_operand:SPE64TF 1 "register_operand" "+r,&r") 4))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+       || (TARGET_SPE && mode != DFmode && mode != TFmode))"
+  "@
+   evmergehi %0,%0,%1
+   evmergelohi %1,%1,%1\;stw%U0%X0 %1,%0"
+  [(set_attr "length" "4,8")])
+
+(define_insn "*mov_sitf_e500_subreg8_be"
   [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,&r") 8)
         (match_operand:SI 1 "input_operand" "r,m"))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    evmergelo %L0,%1,%L0
    evmergelohi %L0,%L0,%L0\;lwz%U1%X1 %L0,%1\;evmergelohi %L0,%L0,%L0"
   [(set_attr "length" "4,12")])
 
-(define_insn "*mov_sitf_e500_subreg8_2"
+(define_insn "*mov_sitf_e500_subreg8_le"
+  [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,r") 8)
+        (match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   mr %L0,%1
+   lwz%U1%X1 %L0,%1")
+
+(define_insn "*mov_sitf_e500_subreg8_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
         (subreg:SI (match_operand:TF 1 "register_operand" "+r,&r") 8))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    evmergehi %0,%0,%L1
    evmergelohi %L1,%L1,%L1\;stw%U0%X0 %L1,%0"
   [(set_attr "length" "4,8")])
 
-(define_insn "*mov_sitf_e500_subreg12"
+(define_insn "*mov_sitf_e500_subreg8_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+        (subreg:SI (match_operand:TF 1 "register_operand" "r,r") 8))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   mr %0,%L1
+   stw%U0%X0 %L1,%0")
+
+(define_insn "*mov_sitf_e500_subreg12_be"
   [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,r") 12)
         (match_operand:SI 1 "input_operand" "r,m"))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    mr %L0,%1
    lwz%U1%X1 %L0,%1")
 
-(define_insn "*mov_sitf_e500_subreg12_2"
+(define_insn "*mov_sitf_e500_subreg12_le"
+  [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,&r") 12)
+        (match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   evmergelo %L0,%1,%L0
+   evmergelohi %L0,%L0,%L0\;lwz%U1%X1 %L0,%1\;evmergelohi %L0,%L0,%L0"
+  [(set_attr "length" "4,12")])
+
+(define_insn "*mov_sitf_e500_subreg12_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
         (subreg:SI (match_operand:TF 1 "register_operand" "r,r") 12))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    mr %0,%L1
    stw%U0%X0 %L1,%0")
 
+(define_insn "*mov_sitf_e500_subreg12_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+        (subreg:SI (match_operand:TF 1 "register_operand" "+r,&r") 12))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   evmergehi %0,%0,%L1
+   evmergelohi %L1,%L1,%L1\;stw%U0%X0 %L1,%0"
+  [(set_attr "length" "4,8")])
+
 ;; FIXME: Allow r=CONST0.
 (define_insn "*movdf_e500_double"
   [(set (match_operand:DF 0 "rs6000_nonimmediate_operand" "=r,r,m")
Index: gcc-fsf-trunk-quilt/gcc/testsuite/gcc.target/powerpc/spe-evmerge.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc-fsf-trunk-quilt/gcc/testsuite/gcc.target/powerpc/spe-evmerge.c	2014-06-11 16:35:25.917851800 +0100
@@ -0,0 +1,71 @@
+/* Verify SPE vector permute builtins. */
+/* { dg-do run { target { powerpc*-*-* && powerpc_spe } } } */
+/* Remove `-ansi' from options so that <spe.h> compiles. */
+/* { dg-options "" } */
+
+#include <spe.h>
+#include <stdlib.h>
+
+#define vector __attribute__ ((vector_size (8)))
+
+#define WORDS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
+
+int
+main (void)
+{
+  vector int a = { 0x11111111, 0x22222222 };
+  vector int b = { 0x33333333, 0x44444444 };
+  vector int c;
+
+  /* c[hi] = a[hi], c[lo] = b[hi] */
+  c = __ev_mergehi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x22222222))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[lo] */
+  c = __ev_mergelo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x11111111))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[hi] */
+  c = __ev_mergelohi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x11111111))
+    abort ();
+  /* c[hi] = a[hi], c[lo] = b[lo] */
+  c = __ev_mergehilo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x22222222))
+    abort ();
+
+  /* c[hi] = a[hi], c[lo] = b[hi] */
+  c = __builtin_spe_evmergehi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x22222222))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[lo] */
+  c = __builtin_spe_evmergelo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x11111111))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[hi] */
+  c = __builtin_spe_evmergelohi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x11111111))
+    abort ();
+  /* c[hi] = a[hi], c[lo] = b[lo] */
+  c = __builtin_spe_evmergehilo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x22222222))
+    abort ();
+
+  return 0;
+}