From patchwork Sat Nov 2 23:44:20 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bill Schmidt X-Patchwork-Id: 288035 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id AE47D2C00D0 for ; Sun, 3 Nov 2013 10:44:24 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:subject:from:to:cc:date:content-type :content-transfer-encoding:mime-version; q=dns; s=default; b=BlM 20SnPjGrmELPYtpgUJ1Bdk/evajx5nHTJo6oGBPxo3d/612C23/TDxbwXMT46K90 KR+hmu9fNE4UyM5bQi7XNWfc2cxkP4T5cMUhvovDoaH+rBPpqiyTu4fn2yzNziRd lCOt2yqucMCHPCm11EgPwGD9VMDWHo6Qve3sCD3M= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:subject:from:to:cc:date:content-type :content-transfer-encoding:mime-version; s=default; bh=jHWy/qAbP 95zyiaRzSjj3dWLBw8=; b=XXxWJjYmFX+UG20fc01mW9Ga5VhI3iisi5dpV5MwE +zJw2C2AGlnw4XGksT4PbKXURIeGfEAQMl/TxyCyeXuIEXdwzZIDXoLGHJNjNBei L4vqf+5Cdkkhz72zAC7e33kgJ+y554UR/ktOIEED5lJSuCFCHMvfKUMKjHPpwgPS s8= Received: (qmail 13336 invoked by alias); 2 Nov 2013 23:44:15 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 13326 invoked by uid 89); 2 Nov 2013 23:44:14 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.5 required=5.0 tests=AWL, BAYES_00, RDNS_NONE, URIBL_BLOCKED autolearn=no version=3.3.2 X-HELO: e28smtp06.in.ibm.com Received: from Unknown (HELO e28smtp06.in.ibm.com) (122.248.162.6) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Sat, 02 Nov 2013 23:44:13 +0000 Received: from /spool/local by e28smtp06.in.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sun, 3 Nov 2013 05:13:56 +0530 Received: from d28dlp02.in.ibm.com (9.184.220.127) by e28smtp06.in.ibm.com (192.168.1.136) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Sun, 3 Nov 2013 05:13:54 +0530 Received: from d28relay01.in.ibm.com (d28relay01.in.ibm.com [9.184.220.58]) by d28dlp02.in.ibm.com (Postfix) with ESMTP id C39543940017 for ; Sun, 3 Nov 2013 05:13:30 +0530 (IST) Received: from d28av04.in.ibm.com (d28av04.in.ibm.com [9.184.220.66]) by d28relay01.in.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id rA2NhncP37617702 for ; Sun, 3 Nov 2013 05:13:51 +0530 Received: from d28av04.in.ibm.com (localhost [127.0.0.1]) by d28av04.in.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id rA2Nhp5d024281 for ; Sun, 3 Nov 2013 05:13:51 +0530 Received: from [9.65.200.189] (sig-9-65-200-189.mts.ibm.com [9.65.200.189]) by d28av04.in.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id rA2NhnmT024220; Sun, 3 Nov 2013 05:13:50 +0530 Message-ID: <1383435860.6275.282.camel@gnopaine> Subject: [PATCH, rs6000] Re-permute source register for postreload splits of VSX LE stores From: Bill Schmidt To: gcc-patches@gcc.gnu.org Cc: dje.gcc@gmail.com Date: Sat, 02 Nov 2013 18:44:20 -0500 Mime-Version: 1.0 X-TM-AS-MML: No X-Content-Scanned: Fidelis XPS MAILER x-cbid: 13110223-9574-0000-0000-00000A66C41D X-IsSubscribed: yes Hi, When I created the patch to split VSX loads and stores to add permutes so the register image was correctly little endian, I forgot to implement a known requirement. When a VSX store is split after reload, we must reuse the source register for the permute, which leaves it in a swapped state after the store. If the register remains live, this is invalid. After the store, we need to permute the register back to its original value. For each of the little endian VSX store patterns, I've replaced the define_insn_and_split with a define_insn and two define_splits, one for prior to reload and one for after reload. The post-reload split has the extra behavior. I don't know of a way to set the insn's length attribute conditionally, so I'm optimistically setting this to 8, though it will be 12 in the post-reload case. Is there any concern with that? Is there a way to make it fully accurate? Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no regressions. This fixes two failing test cases. Is this ok for trunk? Thanks! Bill 2013-11-02 Bill Schmidt * config/rs6000/vsx.md (*vsx_le_perm_store_ for VSX_D): Replace the define_insn_and_split with a define_insn and two define_splits, with the split after reload re-permuting the source register to its original value. (*vsx_le_perm_store_ for VSX_W): Likewise. (*vsx_le_perm_store_v8hi): Likewise. (*vsx_le_perm_store_v16qi): Likewise. Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 204192) +++ gcc/config/rs6000/vsx.md (working copy) @@ -333,12 +333,18 @@ [(set_attr "type" "vecload") (set_attr "length" "8")]) -(define_insn_and_split "*vsx_le_perm_store_" +(define_insn "*vsx_le_perm_store_" [(set (match_operand:VSX_D 0 "memory_operand" "=Z") (match_operand:VSX_D 1 "vsx_register_operand" "+wa"))] "!BYTES_BIG_ENDIAN && TARGET_VSX" "#" - "!BYTES_BIG_ENDIAN && TARGET_VSX" + [(set_attr "type" "vecstore") + (set_attr "length" "8")]) + +(define_split + [(set (match_operand:VSX_D 0 "memory_operand" "") + (match_operand:VSX_D 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed" [(set (match_dup 2) (vec_select: (match_dup 1) @@ -347,21 +353,43 @@ (vec_select: (match_dup 2) (parallel [(const_int 1) (const_int 0)])))] - " { operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) : operands[1]; -} - " - [(set_attr "type" "vecstore") - (set_attr "length" "8")]) +}) -(define_insn_and_split "*vsx_le_perm_store_" +;; The post-reload split requires that we re-permute the source +;; register in case it is still live. +(define_split + [(set (match_operand:VSX_D 0 "memory_operand" "") + (match_operand:VSX_D 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed" + [(set (match_dup 1) + (vec_select: + (match_dup 1) + (parallel [(const_int 1) (const_int 0)]))) + (set (match_dup 0) + (vec_select: + (match_dup 1) + (parallel [(const_int 1) (const_int 0)]))) + (set (match_dup 1) + (vec_select: + (match_dup 1) + (parallel [(const_int 1) (const_int 0)])))] + "") + +(define_insn "*vsx_le_perm_store_" [(set (match_operand:VSX_W 0 "memory_operand" "=Z") (match_operand:VSX_W 1 "vsx_register_operand" "+wa"))] "!BYTES_BIG_ENDIAN && TARGET_VSX" "#" - "!BYTES_BIG_ENDIAN && TARGET_VSX" + [(set_attr "type" "vecstore") + (set_attr "length" "8")]) + +(define_split + [(set (match_operand:VSX_W 0 "memory_operand" "") + (match_operand:VSX_W 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed" [(set (match_dup 2) (vec_select: (match_dup 1) @@ -372,21 +400,46 @@ (match_dup 2) (parallel [(const_int 2) (const_int 3) (const_int 0) (const_int 1)])))] - " { operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) : operands[1]; -} - " - [(set_attr "type" "vecstore") - (set_attr "length" "8")]) +}) -(define_insn_and_split "*vsx_le_perm_store_v8hi" +;; The post-reload split requires that we re-permute the source +;; register in case it is still live. +(define_split + [(set (match_operand:VSX_W 0 "memory_operand" "") + (match_operand:VSX_W 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed" + [(set (match_dup 1) + (vec_select: + (match_dup 1) + (parallel [(const_int 2) (const_int 3) + (const_int 0) (const_int 1)]))) + (set (match_dup 0) + (vec_select: + (match_dup 1) + (parallel [(const_int 2) (const_int 3) + (const_int 0) (const_int 1)]))) + (set (match_dup 1) + (vec_select: + (match_dup 1) + (parallel [(const_int 2) (const_int 3) + (const_int 0) (const_int 1)])))] + "") + +(define_insn "*vsx_le_perm_store_v8hi" [(set (match_operand:V8HI 0 "memory_operand" "=Z") (match_operand:V8HI 1 "vsx_register_operand" "+wa"))] "!BYTES_BIG_ENDIAN && TARGET_VSX" "#" - "!BYTES_BIG_ENDIAN && TARGET_VSX" + [(set_attr "type" "vecstore") + (set_attr "length" "8")]) + +(define_split + [(set (match_operand:V8HI 0 "memory_operand" "") + (match_operand:V8HI 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed" [(set (match_dup 2) (vec_select:V8HI (match_dup 1) @@ -401,21 +454,52 @@ (const_int 6) (const_int 7) (const_int 0) (const_int 1) (const_int 2) (const_int 3)])))] - " { operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) : operands[1]; -} - " - [(set_attr "type" "vecstore") - (set_attr "length" "8")]) +}) -(define_insn_and_split "*vsx_le_perm_store_v16qi" +;; The post-reload split requires that we re-permute the source +;; register in case it is still live. +(define_split + [(set (match_operand:V8HI 0 "memory_operand" "") + (match_operand:V8HI 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed" + [(set (match_dup 1) + (vec_select:V8HI + (match_dup 1) + (parallel [(const_int 4) (const_int 5) + (const_int 6) (const_int 7) + (const_int 0) (const_int 1) + (const_int 2) (const_int 3)]))) + (set (match_dup 0) + (vec_select:V8HI + (match_dup 1) + (parallel [(const_int 4) (const_int 5) + (const_int 6) (const_int 7) + (const_int 0) (const_int 1) + (const_int 2) (const_int 3)]))) + (set (match_dup 1) + (vec_select:V8HI + (match_dup 1) + (parallel [(const_int 4) (const_int 5) + (const_int 6) (const_int 7) + (const_int 0) (const_int 1) + (const_int 2) (const_int 3)])))] + "") + +(define_insn "*vsx_le_perm_store_v16qi" [(set (match_operand:V16QI 0 "memory_operand" "=Z") (match_operand:V16QI 1 "vsx_register_operand" "+wa"))] "!BYTES_BIG_ENDIAN && TARGET_VSX" "#" - "!BYTES_BIG_ENDIAN && TARGET_VSX" + [(set_attr "type" "vecstore") + (set_attr "length" "8")]) + +(define_split + [(set (match_operand:V16QI 0 "memory_operand" "") + (match_operand:V16QI 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed" [(set (match_dup 2) (vec_select:V16QI (match_dup 1) @@ -438,16 +522,53 @@ (const_int 2) (const_int 3) (const_int 4) (const_int 5) (const_int 6) (const_int 7)])))] - " { operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) : operands[1]; -} - " - [(set_attr "type" "vecstore") - (set_attr "length" "8")]) +}) +;; The post-reload split requires that we re-permute the source +;; register in case it is still live. +(define_split + [(set (match_operand:V16QI 0 "memory_operand" "") + (match_operand:V16QI 1 "vsx_register_operand" ""))] + "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed" + [(set (match_dup 1) + (vec_select:V16QI + (match_dup 1) + (parallel [(const_int 8) (const_int 9) + (const_int 10) (const_int 11) + (const_int 12) (const_int 13) + (const_int 14) (const_int 15) + (const_int 0) (const_int 1) + (const_int 2) (const_int 3) + (const_int 4) (const_int 5) + (const_int 6) (const_int 7)]))) + (set (match_dup 0) + (vec_select:V16QI + (match_dup 1) + (parallel [(const_int 8) (const_int 9) + (const_int 10) (const_int 11) + (const_int 12) (const_int 13) + (const_int 14) (const_int 15) + (const_int 0) (const_int 1) + (const_int 2) (const_int 3) + (const_int 4) (const_int 5) + (const_int 6) (const_int 7)]))) + (set (match_dup 1) + (vec_select:V16QI + (match_dup 1) + (parallel [(const_int 8) (const_int 9) + (const_int 10) (const_int 11) + (const_int 12) (const_int 13) + (const_int 14) (const_int 15) + (const_int 0) (const_int 1) + (const_int 2) (const_int 3) + (const_int 4) (const_int 5) + (const_int 6) (const_int 7)])))] + "") + (define_insn "*vsx_mov" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=Z,,,?Z,?wa,?wa,wQ,?&r,??Y,??r,??r,,?wa,*r,v,wZ, v") (match_operand:VSX_M 1 "input_operand" ",Z,,wa,Z,wa,r,wQ,r,Y,r,j,j,j,W,v,wZ"))]