From patchwork Thu Oct 28 17:54:41 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Uros Bizjak X-Patchwork-Id: 69479 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id D856BB6F11 for ; Fri, 29 Oct 2010 04:54:54 +1100 (EST) Received: (qmail 375 invoked by alias); 28 Oct 2010 17:54:51 -0000 Received: (qmail 364 invoked by uid 22791); 28 Oct 2010 17:54:49 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, TW_ZJ, T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: sourceware.org Received: from mail-pw0-f47.google.com (HELO mail-pw0-f47.google.com) (209.85.160.47) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 28 Oct 2010 17:54:43 +0000 Received: by pwi3 with SMTP id 3so10679pwi.20 for ; Thu, 28 Oct 2010 10:54:41 -0700 (PDT) MIME-Version: 1.0 Received: by 10.142.203.10 with SMTP id a10mr471254wfg.215.1288288481393; Thu, 28 Oct 2010 10:54:41 -0700 (PDT) Received: by 10.142.54.21 with HTTP; Thu, 28 Oct 2010 10:54:41 -0700 (PDT) Date: Thu, 28 Oct 2010 19:54:41 +0200 Message-ID: Subject: [PATCH, i386]: Fix PR target/46153 From: Uros Bizjak To: gcc-patches@gcc.gnu.org Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hello! Attached patch fixed PR target/46153. The problem was, that expanders didn't use fixed-up destination RTX, returned from ix86_fixup_binary_operands. Since the expanded insns now satisfy ix86_binary_operator_ok, we can use this function as insn predicate to prevent combine from creating invalid instructions (FWIW, these were later fixed by reload). Please also note, that ix86_fixup_binary_operands include a couple of optimizations (i.e. when both input operands are loaded from the same memory address). 2010-10-28 Uros Bizjak PR target/46153 * config/i386/sse.md (sse_movhlps_exp): Use destination returned from ix86_fixup_binary_operands to expand insn. (sse_movlhps_exp): Ditto. (sse_loadhps_exp): Ditto. (sse_loadlps_exp): Ditto. (sse2_loadhpd_exp): Ditto. (sse2_loadlpd_exp): Ditto. (*avx_movhlps): Use ix86_binary_operator_ok in insn predicate. (sse_movhlps): Ditto. (*avx_movlhps): Ditto. (sse_movlhps): Ditto. (*avx_loadhps): Ditto. (sse_loadhps): Ditto. (*avx_loadhpd): Ditto. (sse_loadhpd): Ditto. (*avx_storelps): Prevent both operands in memory. (sse_storelps): Ditto. testsuite/ChangeLog: 2010-10-28 Uros Bizjak PR target/46153 * gcc.target/i386/pr46153.c: New test. Patch was bootstrapped on x86_64-pc-linux-gnu {,-m32} and was committed to mainline SVN. It will be comitted to 4.5 after a few days. Uros. Index: config/i386/sse.md =================================================================== --- config/i386/sse.md (revision 166026) +++ config/i386/sse.md (working copy) @@ -3244,8 +3244,18 @@ (const_int 2) (const_int 3)])))] "TARGET_SSE" - "ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands);") +{ + rtx dst = ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands); + + emit_insn (gen_sse_movhlps (dst, operands[1], operands[2])); + /* Fix up the destination if needed. */ + if (dst != operands[0]) + emit_move_insn (operands[0], dst); + + DONE; +}) + (define_insn "*avx_movhlps" [(set (match_operand:V4SF 0 "nonimmediate_operand" "=x,x,m") (vec_select:V4SF @@ -3256,7 +3266,7 @@ (const_int 7) (const_int 2) (const_int 3)])))] - "TARGET_AVX && !(MEM_P (operands[1]) && MEM_P (operands[2]))" + "TARGET_AVX && ix86_binary_operator_ok (UNKNOWN, V4SFmode, operands)" "@ vmovhlps\t{%2, %1, %0|%0, %1, %2} vmovlps\t{%H2, %1, %0|%0, %1, %H2} @@ -3275,7 +3285,7 @@ (const_int 7) (const_int 2) (const_int 3)])))] - "TARGET_SSE && !(MEM_P (operands[1]) && MEM_P (operands[2]))" + "TARGET_SSE && ix86_binary_operator_ok (UNKNOWN, V4SFmode, operands)" "@ movhlps\t{%2, %0|%0, %2} movlps\t{%H2, %0|%0, %H2} @@ -3294,8 +3304,18 @@ (const_int 4) (const_int 5)])))] "TARGET_SSE" - "ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands);") +{ + rtx dst = ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands); + + emit_insn (gen_sse_movlhps (dst, operands[1], operands[2])); + /* Fix up the destination if needed. */ + if (dst != operands[0]) + emit_move_insn (operands[0], dst); + + DONE; +}) + (define_insn "*avx_movlhps" [(set (match_operand:V4SF 0 "nonimmediate_operand" "=x,x,o") (vec_select:V4SF @@ -3701,8 +3721,18 @@ (parallel [(const_int 0) (const_int 1)])) (match_operand:V2SF 2 "nonimmediate_operand" "")))] "TARGET_SSE" - "ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands);") +{ + rtx dst = ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands); + + emit_insn (gen_sse_loadhps (dst, operands[1], operands[2])); + /* Fix up the destination if needed. */ + if (dst != operands[0]) + emit_move_insn (operands[0], dst); + + DONE; +}) + (define_insn "*avx_loadhps" [(set (match_operand:V4SF 0 "nonimmediate_operand" "=x,x,o") (vec_concat:V4SF @@ -3710,7 +3740,7 @@ (match_operand:V4SF 1 "nonimmediate_operand" "x,x,0") (parallel [(const_int 0) (const_int 1)])) (match_operand:V2SF 2 "nonimmediate_operand" "m,x,x")))] - "TARGET_AVX" + "TARGET_AVX && ix86_binary_operator_ok (UNKNOWN, V4SFmode, operands)" "@ vmovhps\t{%2, %1, %0|%0, %1, %2} vmovlhps\t{%2, %1, %0|%0, %1, %2} @@ -3726,7 +3756,7 @@ (match_operand:V4SF 1 "nonimmediate_operand" "0,0,0") (parallel [(const_int 0) (const_int 1)])) (match_operand:V2SF 2 "nonimmediate_operand" "m,x,x")))] - "TARGET_SSE" + "TARGET_SSE && ix86_binary_operator_ok (UNKNOWN, V4SFmode, operands)" "@ movhps\t{%2, %0|%0, %2} movlhps\t{%2, %0|%0, %2} @@ -3739,7 +3769,7 @@ (vec_select:V2SF (match_operand:V4SF 1 "nonimmediate_operand" "x,x,m") (parallel [(const_int 0) (const_int 1)])))] - "TARGET_AVX" + "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ vmovlps\t{%1, %0|%0, %1} vmovaps\t{%1, %0|%0, %1} @@ -3753,7 +3783,7 @@ (vec_select:V2SF (match_operand:V4SF 1 "nonimmediate_operand" "x,x,m") (parallel [(const_int 0) (const_int 1)])))] - "TARGET_SSE" + "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ movlps\t{%1, %0|%0, %1} movaps\t{%1, %0|%0, %1} @@ -3769,8 +3799,18 @@ (match_operand:V4SF 1 "nonimmediate_operand" "") (parallel [(const_int 2) (const_int 3)]))))] "TARGET_SSE" - "ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands);") +{ + rtx dst = ix86_fixup_binary_operands (UNKNOWN, V4SFmode, operands); + + emit_insn (gen_sse_loadlps (dst, operands[1], operands[2])); + /* Fix up the destination if needed. */ + if (dst != operands[0]) + emit_move_insn (operands[0], dst); + + DONE; +}) + (define_insn "*avx_loadlps" [(set (match_operand:V4SF 0 "nonimmediate_operand" "=x,x,m") (vec_concat:V4SF @@ -3778,7 +3818,7 @@ (vec_select:V2SF (match_operand:V4SF 1 "nonimmediate_operand" "x,x,0") (parallel [(const_int 2) (const_int 3)]))))] - "TARGET_AVX" + "TARGET_AVX && ix86_binary_operator_ok (UNKNOWN, V4SFmode, operands)" "@ shufps\t{$0xe4, %1, %2, %0|%0, %2, %1, 0xe4} vmovlps\t{%2, %1, %0|%0, %1, %2} @@ -3795,7 +3835,7 @@ (vec_select:V2SF (match_operand:V4SF 1 "nonimmediate_operand" "x,0,0") (parallel [(const_int 2) (const_int 3)]))))] - "TARGET_SSE" + "TARGET_SSE && ix86_binary_operator_ok (UNKNOWN, V4SFmode, operands)" "@ shufps\t{$0xe4, %1, %0|%0, %1, 0xe4} movlps\t{%2, %0|%0, %2} @@ -4898,8 +4938,18 @@ (parallel [(const_int 0)])) (match_operand:DF 2 "nonimmediate_operand" "")))] "TARGET_SSE2" - "ix86_fixup_binary_operands (UNKNOWN, V2DFmode, operands);") +{ + rtx dst = ix86_fixup_binary_operands (UNKNOWN, V2DFmode, operands); + + emit_insn (gen_sse2_loadhpd (dst, operands[1], operands[2])); + /* Fix up the destination if needed. */ + if (dst != operands[0]) + emit_move_insn (operands[0], dst); + + DONE; +}) + ;; Avoid combining registers from different units in a single alternative, ;; see comment above inline_secondary_memory_needed function in i386.c (define_insn "*avx_loadhpd" @@ -4909,7 +4959,7 @@ (match_operand:V2DF 1 "nonimmediate_operand" " x,x,0,0,0") (parallel [(const_int 0)])) (match_operand:DF 2 "nonimmediate_operand" " m,x,x,*f,r")))] - "TARGET_AVX && !(MEM_P (operands[1]) && MEM_P (operands[2]))" + "TARGET_AVX && ix86_binary_operator_ok (UNKNOWN, V2DFmode, operands)" "@ vmovhpd\t{%2, %1, %0|%0, %1, %2} vunpcklpd\t{%2, %1, %0|%0, %1, %2} @@ -4927,7 +4977,7 @@ (match_operand:V2DF 1 "nonimmediate_operand" " 0,0,x,0,0,0") (parallel [(const_int 0)])) (match_operand:DF 2 "nonimmediate_operand" " m,x,0,x,*f,r")))] - "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" + "TARGET_SSE2 && ix86_binary_operator_ok (UNKNOWN, V2DFmode, operands)" "@ movhpd\t{%2, %0|%0, %2} unpcklpd\t{%2, %0|%0, %2} @@ -4957,8 +5007,18 @@ (match_operand:V2DF 1 "nonimmediate_operand" "") (parallel [(const_int 1)]))))] "TARGET_SSE2" - "ix86_fixup_binary_operands (UNKNOWN, V2DFmode, operands);") +{ + rtx dst = ix86_fixup_binary_operands (UNKNOWN, V2DFmode, operands); + + emit_insn (gen_sse2_loadlpd (dst, operands[1], operands[2])); + /* Fix up the destination if needed. */ + if (dst != operands[0]) + emit_move_insn (operands[0], dst); + + DONE; +}) + ;; Avoid combining registers from different units in a single alternative, ;; see comment above inline_secondary_memory_needed function in i386.c (define_insn "*avx_loadlpd" Index: testsuite/gcc.target/i386/pr46153.c =================================================================== --- testsuite/gcc.target/i386/pr46153.c (revision 0) +++ testsuite/gcc.target/i386/pr46153.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-msse -ffloat-store" } */ + +typedef float v4sf __attribute__ ((__vector_size__ (16))); + +v4sf foo (v4sf a) +{ + return __builtin_ia32_movlhps (a, a); +}