From patchwork Mon Apr 18 19:52:17 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 91837
Date: Mon, 18 Apr 2011 21:52:17 +0200
Subject: [PATCH, i386]: Macroize movmsk/maskmov insns
From: Uros Bizjak
To: gcc-patches@gcc.gnu.org

Hello!

Attached patch macroizes movmsk and maskmov instructions. As an added
bonus, it also implements maskmovdqu tests, so "%%%" comment can be
removed.

2011-04-18  Uros Bizjak

        * config/i386/i386.h (SSE_VEC_FLOAT_MODE_P): Remove.
        (AVX_FLOAT_MODE_P): Ditto.
        (AVX128_VEC_FLOAT_MODE_P): Ditto.
        (AVX256_VEC_FLOAT_MODE_P): Ditto.
        (AVX_VEC_FLOAT_MODE_P): Ditto.
        * config/i386/i386.md (UNSPEC_MASKLOAD): Remove.
        (UNSPEC_MASKSTORE): Ditto.
        * config/i386/sse.md (_movmsk): Merge from _movmsk and
        avx_movmsk256.  Use VF mode iterator.
        (*sse2_maskmovdqu): Merge with *sse2_maskmovdqu_rex64.
        Use P mode iterator.
        (avx_maskload): New expander.
        (avx_maskstore): Ditto.
        (*avx_maskmov): New insn.

testsuite/ChangeLog:

2011-04-18  Uros Bizjak

        * gcc.target/i386/sse2-maskmovdqu.c: New test.
        * gcc.target/i386/avx-vmaskmovdqu.c: Ditto.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32} AVX target. Patch was committed to mainline.

Uros.

Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h (revision 172652)
+++ config/i386/i386.h (working copy)
@@ -1328,22 +1328,6 @@ enum reg_class
 #define SSE_FLOAT_MODE_P(MODE) \
   ((TARGET_SSE && (MODE) == SFmode) || (TARGET_SSE2 && (MODE) == DFmode))
 
-#define SSE_VEC_FLOAT_MODE_P(MODE) \
-  ((TARGET_SSE && (MODE) == V4SFmode) || (TARGET_SSE2 && (MODE) == V2DFmode))
-
-#define AVX_FLOAT_MODE_P(MODE) \
-  (TARGET_AVX && ((MODE) == SFmode || (MODE) == DFmode))
-
-#define AVX128_VEC_FLOAT_MODE_P(MODE) \
-  (TARGET_AVX && ((MODE) == V4SFmode || (MODE) == V2DFmode))
-
-#define AVX256_VEC_FLOAT_MODE_P(MODE) \
-  (TARGET_AVX && ((MODE) == V8SFmode || (MODE) == V4DFmode))
-
-#define AVX_VEC_FLOAT_MODE_P(MODE) \
-  (TARGET_AVX && ((MODE) == V4SFmode || (MODE) == V2DFmode \
-                  || (MODE) == V8SFmode || (MODE) == V4DFmode))
-
 #define FMA4_VEC_FLOAT_MODE_P(MODE) \
   (TARGET_FMA4 && ((MODE) == V4SFmode || (MODE) == V2DFmode \
                   || (MODE) == V8SFmode || (MODE) == V4DFmode))
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 172652)
+++ config/i386/i386.md (working copy)
@@ -224,8 +224,6 @@
   UNSPEC_VPERMIL
   UNSPEC_VPERMIL2
   UNSPEC_VPERMIL2F128
-  UNSPEC_MASKLOAD
-  UNSPEC_MASKSTORE
   UNSPEC_CAST
   UNSPEC_VTESTP
   UNSPEC_VCVTPH2PS
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md (revision 172652)
+++ config/i386/sse.md (working copy)
@@ -6893,23 +6893,12 @@
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "TI")])
 
-(define_insn "avx_movmsk256"
+(define_insn "_movmsk"
   [(set (match_operand:SI 0 "register_operand" "=r")
         (unspec:SI
-          [(match_operand:AVX256MODEF2P 1 "register_operand" "x")]
+          [(match_operand:VF 1 "register_operand" "x")]
           UNSPEC_MOVMSK))]
-  "AVX256_VEC_FLOAT_MODE_P (mode)"
-  "vmovmsk\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssecvt")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "")])
-
-(define_insn "_movmsk"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (unspec:SI
-          [(match_operand:SSEMODEF2P 1 "register_operand" "x")]
-          UNSPEC_MOVMSK))]
-  "SSE_VEC_FLOAT_MODE_P (mode)"
+  ""
   "%vmovmsk\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssemov")
    (set_attr "prefix" "maybe_vex")
@@ -6935,35 +6924,18 @@
   "TARGET_SSE2")
 
 (define_insn "*sse2_maskmovdqu"
-  [(set (mem:V16QI (match_operand:SI 0 "register_operand" "D"))
+  [(set (mem:V16QI (match_operand:P 0 "register_operand" "D"))
         (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "x")
                        (match_operand:V16QI 2 "register_operand" "x")
                        (mem:V16QI (match_dup 0))]
                       UNSPEC_MASKMOV))]
-  "TARGET_SSE2 && !TARGET_64BIT"
-  ;; @@@ check ordering of operands in intel/nonintel syntax
-  "%vmaskmovdqu\t{%2, %1|%1, %2}"
-  [(set_attr "type" "ssemov")
-   (set_attr "prefix_data16" "1")
-   ;; The implicit %rdi operand confuses default length_vex computation.
-   (set_attr "length_vex" "3")
-   (set_attr "prefix" "maybe_vex")
-   (set_attr "mode" "TI")])
-
-(define_insn "*sse2_maskmovdqu_rex64"
-  [(set (mem:V16QI (match_operand:DI 0 "register_operand" "D"))
-        (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "x")
-                       (match_operand:V16QI 2 "register_operand" "x")
-                       (mem:V16QI (match_dup 0))]
-                      UNSPEC_MASKMOV))]
-  "TARGET_SSE2 && TARGET_64BIT"
-  ;; @@@ check ordering of operands in intel/nonintel syntax
+  "TARGET_SSE2"
   "%vmaskmovdqu\t{%2, %1|%1, %2}"
   [(set_attr "type" "ssemov")
    (set_attr "prefix_data16" "1")
    ;; The implicit %rdi operand confuses default length_vex computation.
    (set (attr "length_vex")
-     (symbol_ref ("REGNO (operands[2]) >= FIRST_REX_SSE_REG ? 3 + 1 : 2 + 1")))
+     (symbol_ref ("3 + REX_SSE_REGNO_P (REGNO (operands[2]))")))
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
@@ -10349,28 +10321,33 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "V8SF")])
 
-(define_insn "avx_maskload"
-  [(set (match_operand:AVXMODEF2P 0 "register_operand" "=x")
-        (unspec:AVXMODEF2P
-          [(match_operand:AVXMODEF2P 1 "memory_operand" "m")
-           (match_operand: 2 "register_operand" "x")
+(define_expand "avx_maskload"
+  [(set (match_operand:VF 0 "register_operand" "")
+        (unspec:VF
+          [(match_operand: 2 "register_operand" "")
+           (match_operand:VF 1 "memory_operand" "")
            (match_dup 0)]
-          UNSPEC_MASKLOAD))]
-  "TARGET_AVX"
-  "vmaskmov\t{%1, %2, %0|%0, %2, %1}"
-  [(set_attr "type" "sselog1")
-   (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "")])
+          UNSPEC_MASKMOV))]
+  "TARGET_AVX")
 
-(define_insn "avx_maskstore"
-  [(set (match_operand:AVXMODEF2P 0 "memory_operand" "=m")
-        (unspec:AVXMODEF2P
-          [(match_operand: 1 "register_operand" "x")
-           (match_operand:AVXMODEF2P 2 "register_operand" "x")
+(define_expand "avx_maskstore"
+  [(set (match_operand:VF 0 "memory_operand" "")
+        (unspec:VF
+          [(match_operand: 1 "register_operand" "")
+           (match_operand:VF 2 "register_operand" "")
            (match_dup 0)]
-          UNSPEC_MASKSTORE))]
-  "TARGET_AVX"
+          UNSPEC_MASKMOV))]
+  "TARGET_AVX")
+
+(define_insn "*avx_maskmov"
+  [(set (match_operand:VF 0 "nonimmediate_operand" "=x,m")
+        (unspec:VF
+          [(match_operand: 1 "register_operand" "x,x")
+           (match_operand:VF 2 "nonimmediate_operand" "m,x")
+           (match_dup 0)]
+          UNSPEC_MASKMOV))]
+  "TARGET_AVX
+   && (REG_P (operands[0]) == MEM_P (operands[2]))"
   "vmaskmov\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
Index: testsuite/gcc.target/i386/sse2-maskmovdqu.c
===================================================================
--- testsuite/gcc.target/i386/sse2-maskmovdqu.c (revision 0)
+++ testsuite/gcc.target/i386/sse2-maskmovdqu.c (revision 0)
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-require-effective-target sse2 } */
+/* { dg-options "-O2 -msse2" } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse2_test
+#endif
+
+#include CHECK_H
+
+#include
+
+#ifndef MASK
+#define MASK 0x7986
+#endif
+
+#define mask_v(pos) (((MASK & (0x1 << (pos))) >> (pos)) << 7)
+
+void static
+TEST (void)
+{
+  __m128i src, mask;
+  char s[16] = { 1,-2,3,-4,5,-6,7,-8,9,-10,11,-12,13,-14,15,-16 };
+  char m[16];
+
+  char u[20] = { 0 };
+  int i;
+
+  for (i = 0; i < 16; i++)
+    m[i] = mask_v (i);
+
+  src = _mm_loadu_si128 ((__m128i *)s);
+  mask = _mm_loadu_si128 ((__m128i *)m);
+
+  _mm_maskmoveu_si128 (src, mask, u+3);
+
+  for (i = 0; i < 16; i++)
+    if (u[i+3] != (m[i] ? s[i] : 0))
+      abort ();
+}
Index: testsuite/gcc.target/i386/avx-vmaskmovdqu.c
===================================================================
--- testsuite/gcc.target/i386/avx-vmaskmovdqu.c (revision 0)
+++ testsuite/gcc.target/i386/avx-vmaskmovdqu.c (revision 0)
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx } */
+/* { dg-options "-O2 -mavx" } */
+
+#define CHECK_H "avx-check.h"
+#define TEST avx_test
+
+#include "sse2-maskmovdqu.c"