From patchwork Sat Nov 8 12:49:35 2014
X-Patchwork-Submitter: Marc Glisse
X-Patchwork-Id: 408341
Date: Sat, 8 Nov 2014 13:49:35 +0100 (CET)
From: Marc Glisse
To: gcc-patches@gcc.gnu.org
cc: ubizjak@gmail.com
Subject: [x86, 4/n] Replace builtins with vector extensions

Hello,

this patch uses &|^ for 128-bit integer vectors. I am doing the
operations in type __v2du because __builtin_ia32_pand128 was apparently
taking __v2di arguments, but using __v4su or any other element type
should be equivalent. Even __int128 would in principle be ok, but since
it is not usually stored in a vector register, it seems more likely to
generate unexpected code (and we don't have __int256, so it would be
inconsistent with other sizes).

Regtested with patch 3/n. Ok for the branch?

After that, I will post a last patch to generalize &|^ to sizes 256 and
512. I think that will be enough for gcc-5, and we should then discuss
merging. "< > == abs min max" can wait until gcc-6, possibly after
getting some feedback about +-*/&|^.

2014-11-10  Marc Glisse

	* config/i386/emmintrin.h (_mm_and_si128, _mm_or_si128,
	_mm_xor_si128): Use vector extensions instead of builtins.
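
For readers unfamiliar with the vector extensions, here is a small
standalone sketch (not part of the patch) of the claim above that the
element type does not matter for &|^. The typedefs v2du and v4su are
local, hypothetical stand-ins with the same shape as __v2du and __v4su;
the real ones come from emmintrin.h. It only checks that the results are
bit-identical and says nothing about the code the compiler generates.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical local typedefs with the same shape as __v2du and __v4su;
   they are assumptions for this sketch, not the header's definitions.  */
typedef unsigned long long v2du __attribute__ ((__vector_size__ (16)));
typedef unsigned int v4su __attribute__ ((__vector_size__ (16)));

/* Bitwise ops performed on two 64-bit lanes.  */
static v2du and_v2du (v2du a, v2du b) { return a & b; }
static v2du or_v2du (v2du a, v2du b) { return a | b; }
static v2du xor_v2du (v2du a, v2du b) { return a ^ b; }

/* The same ops performed on four 32-bit lanes, then cast back.
   The casts reinterpret the 16 bytes; they do not convert values.  */
static v2du and_v4su (v2du a, v2du b) { return (v2du) ((v4su) a & (v4su) b); }
static v2du or_v4su (v2du a, v2du b) { return (v2du) ((v4su) a | (v4su) b); }
static v2du xor_v4su (v2du a, v2du b) { return (v2du) ((v4su) a ^ (v4su) b); }

/* Returns 1 when both vectors hold identical bits.  */
static int same_bits (v2du x, v2du y)
{
  return memcmp (&x, &y, sizeof x) == 0;
}

/* Returns 1 if &, | and ^ give the same bits for both element types
   on one pair of sample inputs.  */
static int lane_width_is_irrelevant (void)
{
  v2du a = { 0xF0F0F0F00F0F0F0FULL, 0x123456789ABCDEF0ULL };
  v2du b = { 0xFFFF0000FFFF0000ULL, 0x0FF00FF00FF00FF0ULL };

  assert (sizeof (v2du) == sizeof (v4su));
  return same_bits (and_v2du (a, b), and_v4su (a, b))
	 && same_bits (or_v2du (a, b), or_v4su (a, b))
	 && same_bits (xor_v2du (a, b), xor_v4su (a, b));
}
```

Calling lane_width_is_irrelevant () from a test program should return 1
with GCC or Clang, since both accept the vector_size attribute and casts
between vector types of equal size.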
Index: config/i386/emmintrin.h
===================================================================
--- config/i386/emmintrin.h	(revision 217249)
+++ config/i386/emmintrin.h	(working copy)
@@ -1244,39 +1244,39 @@ _mm_srl_epi32 (__m128i __A, __m128i __B)
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_srl_epi64 (__m128i __A, __m128i __B)
 {
   return (__m128i)__builtin_ia32_psrlq128 ((__v2di)__A, (__v2di)__B);
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_and_si128 (__m128i __A, __m128i __B)
 {
-  return (__m128i)__builtin_ia32_pand128 ((__v2di)__A, (__v2di)__B);
+  return (__m128i) ((__v2du)__A & (__v2du)__B);
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_andnot_si128 (__m128i __A, __m128i __B)
 {
   return (__m128i)__builtin_ia32_pandn128 ((__v2di)__A, (__v2di)__B);
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_or_si128 (__m128i __A, __m128i __B)
 {
-  return (__m128i)__builtin_ia32_por128 ((__v2di)__A, (__v2di)__B);
+  return (__m128i) ((__v2du)__A | (__v2du)__B);
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_xor_si128 (__m128i __A, __m128i __B)
 {
-  return (__m128i)__builtin_ia32_pxor128 ((__v2di)__A, (__v2di)__B);
+  return (__m128i) ((__v2du)__A ^ (__v2du)__B);
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cmpeq_epi8 (__m128i __A, __m128i __B)
 {
   return (__m128i)__builtin_ia32_pcmpeqb128 ((__v16qi)__A, (__v16qi)__B);
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cmpeq_epi16 (__m128i __A, __m128i __B)