From patchwork Mon Mar 28 02:51:46 2011
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 88535
Date: Sun, 27 Mar 2011 19:51:46 -0700
Subject: Re: PATCH: Split AVX 32-byte unaligned load/store
From: "H.J. Lu"
To: Uros Bizjak
Cc: GCC Patches

On Sun, Mar 27, 2011 at 11:57 AM, H.J. Lu wrote:
> On Sun, Mar 27, 2011 at 10:53 AM, Uros Bizjak wrote:
>> On Sun, Mar 27, 2011 at 3:44 PM, H.J. Lu wrote:
>>
>>> Here is a patch to split AVX 32-byte unaligned load/store:
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00743.html
>>>
>>> It speeds up some SPEC CPU 2006 benchmarks by up to 6%.
>>> OK for trunk?
>>
>>> 2011-02-11  H.J. Lu
>>>
>>>       * config/i386/i386.c (flag_opts): Add -mavx256-split-unaligned-load
>>>       and -mavx256-split-unaligned-store.
>>>       (ix86_option_override_internal): Split 32-byte AVX unaligned
>>>       load/store by default.
>>>       (ix86_avx256_split_vector_move_misalign): New.
>>>       (ix86_expand_vector_move_misalign): Use it.
>>>
>>>       * config/i386/i386.opt: Add -mavx256-split-unaligned-load and
>>>       -mavx256-split-unaligned-store.
>>>
>>>       * config/i386/sse.md (*avx_mov<mode>_internal): Verify unaligned
>>>       256bit load/store.  Generate unaligned store on misaligned memory
>>>       operand.
>>>       (*avx_movu<ssemodesuffix><avxmodesuffix>): Verify unaligned
>>>       256bit load/store.
>>>       (*avx_movdqu<avxmodesuffix>): Likewise.
>>>
>>>       * doc/invoke.texi: Document -mavx256-split-unaligned-load and
>>>       -mavx256-split-unaligned-store.
>>>
>>> gcc/testsuite/
>>>
>>> 2011-02-11  H.J. Lu
>>>
>>>       * gcc.target/i386/avx256-unaligned-load-1.c: New.
>>>       * gcc.target/i386/avx256-unaligned-load-2.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-load-3.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-load-4.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-load-5.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-load-6.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-load-7.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-store-2.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-store-5.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-store-6.c: Likewise.
>>>       * gcc.target/i386/avx256-unaligned-store-7.c: Likewise.
>>>
>>
>>
>>> @@ -203,19 +203,37 @@
>>>        return standard_sse_constant_opcode (insn, operands[1]);
>>>      case 1:
>>>      case 2:
>>> +      if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
>>> +          && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
>>> +               && MEM_P (operands[0])
>>> +               && MEM_ALIGN (operands[0]) < 256)
>>> +              || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
>>> +                  && MEM_P (operands[1])
>>> +                  && MEM_ALIGN (operands[1]) < 256)))
>>> +        gcc_unreachable ();
>>
>> Please use "misaligned_operand (operands[...], <MODE>mode)" instead of
>> the MEM_P && MEM_ALIGN combination in a couple of places.
>>
>> OK with that change.
>>
>
> This is the patch I checked in.
>

I checked in the patch below to revert the unaligned 256bit load/store
asserts, since unaligned 256bit loads/stores may also be generated by
intrinsics:

http://gcc.gnu.org/ml/gcc-regression/2011-03/msg00477.html

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 171589)
+++ ChangeLog	(working copy)
@@ -1,3 +1,10 @@
+2011-03-27  H.J. Lu
+
+	* config/i386/sse.md (*avx_mov<mode>_internal): Don't assert
+	unaligned 256bit load/store.
+	(*avx_movu<ssemodesuffix><avxmodesuffix>): Likewise.
+	(*avx_movdqu<avxmodesuffix>): Likewise.
+
 2011-03-27  Vladimir Makarov
 
 	PR bootstrap/48307
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md	(revision 171589)
+++ config/i386/sse.md	(working copy)
@@ -203,12 +203,6 @@
       return standard_sse_constant_opcode (insn, operands[1]);
     case 1:
     case 2:
-      if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
-          && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
-               && misaligned_operand (operands[0], <MODE>mode))
-              || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
-                  && misaligned_operand (operands[1], <MODE>mode))))
-        gcc_unreachable ();
       switch (get_attr_mode (insn))
         {
         case MODE_V8SF:
@@ -416,15 +410,7 @@
           UNSPEC_MOVU))]
   "AVX_VEC_FLOAT_MODE_P (<MODE>mode)
    && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-{
-  if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
-      && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
-           && misaligned_operand (operands[0], <MODE>mode))
-          || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
-              && misaligned_operand (operands[1], <MODE>mode))))
-    gcc_unreachable ();
-  return "vmovu<ssemodesuffix>\t{%1, %0|%0, %1}";
-}
+  "vmovu<ssemodesuffix>\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssemov")
    (set_attr "movu" "1")
    (set_attr "prefix" "vex")
@@ -483,15 +469,7 @@
           [(match_operand:AVXMODEQI 1 "nonimmediate_operand" "xm,x")]
           UNSPEC_MOVU))]
   "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-{
-  if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
-      && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
-           && misaligned_operand (operands[0], <MODE>mode))
-          || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
-              && misaligned_operand (operands[1], <MODE>mode))))
-    gcc_unreachable ();
-  return "vmovdqu\t{%1, %0|%0, %1}";
-}
+  "vmovdqu\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssemov")
    (set_attr "movu" "1")
    (set_attr "prefix" "vex")
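
FWIW, a minimal sketch of the kind of code that tripped the removed
asserts (a hypothetical example, not one of the testcases from the
patch; the function name is made up): an unaligned-load intrinsic like
_mm256_loadu_ps expands through its builtin straight to the unaligned
move pattern, without going through ix86_expand_vector_move_misalign,
so a misaligned 256bit memory operand can legitimately reach the insn
even when -mavx256-split-unaligned-load is in effect, and the
gcc_unreachable () would fire:

#include <immintrin.h>

/* Sum 8 floats starting at p; p need not be 32-byte aligned.
   Compile with -mavx.  The intrinsic load below must stay a single
   unaligned 256bit vmovups, which is why asserting on a misaligned
   operand in the movu patterns was wrong.  */
float
sum8 (const float *p)
{
  __m256 v = _mm256_loadu_ps (p);            /* unaligned 256bit load */
  __m128 lo = _mm256_castps256_ps128 (v);    /* low 128 bits */
  __m128 hi = _mm256_extractf128_ps (v, 1);  /* high 128 bits */
  __m128 s = _mm_add_ps (lo, hi);            /* fold 256 -> 128 */
  s = _mm_hadd_ps (s, s);                    /* horizontal adds */
  s = _mm_hadd_ps (s, s);
  return _mm_cvtss_f32 (s);
}

Compiler-generated unaligned 256bit moves, on the other hand, are
still split at expand time by -mavx256-split-unaligned-load and
-mavx256-split-unaligned-store (a 128bit vmovups plus vinsertf128 or
vextractf128 instead of a single 256bit vmovups); only the
intrinsic/builtin path keeps the unsplit instruction.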