From patchwork Fri Aug 20 18:33:59 2010
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 62301
Date: Fri, 20 Aug 2010 20:33:59 +0200
From: Jakub Jelinek
To: Paolo Bonzini, "H.J. Lu"
Cc: Bernd Schmidt, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Optimize nested SIGN_EXTENDs/ZERO_EXTENDs (PR target/45336)
Message-ID: <20100820183359.GH702@tyan-ft48-01.lab.bos.redhat.com>
References: <20100819163330.GX702@tyan-ft48-01.lab.bos.redhat.com>
 <4C6E4F08.6070801@gnu.org>
 <20100820135046.GC702@tyan-ft48-01.lab.bos.redhat.com>
 <20100820172757.GF702@tyan-ft48-01.lab.bos.redhat.com>
 <4C6EC2C1.5030603@gnu.org>
In-Reply-To: <4C6EC2C1.5030603@gnu.org>

On Fri, Aug 20, 2010 at 08:00:33PM +0200, Paolo Bonzini wrote:
> On 08/20/2010 07:27 PM, Jakub Jelinek wrote:
> >Not sure what exactly is
> >pextrb ..., %ecx
> >insn doing to the upper 32 bits of %rcx, if it clears them
>
> Probably yes like every other 32-bit writeback on x86_64.

The manuals confirm that.  The following seems to work just fine in the
quick testing I've done so far:

2010-08-20  Jakub Jelinek

	* config/i386/sse.md (*sse4_1_pextrb): Add SWI48 mode iterator
	to cover zero extension into 64-bit register.
	(*sse2_pextrw): Likewise.
	(*sse4_1_pextrd_zext): New insn.

	Jakub

--- gcc/config/i386/sse.md.jj	2010-08-11 21:08:03.000000000 +0200
+++ gcc/config/i386/sse.md	2010-08-20 20:24:08.000000000 +0200
@@ -7075,14 +7075,14 @@ (define_insn "*sse4_1_pinsrq"
    (set_attr "length_immediate" "1")
    (set_attr "mode" "TI")])
 
-(define_insn "*sse4_1_pextrb"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (zero_extend:SI
+(define_insn "*sse4_1_pextrb_"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (zero_extend:SWI48
           (vec_select:QI
             (match_operand:V16QI 1 "register_operand" "x")
             (parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))]
   "TARGET_SSE4_1"
-  "%vpextrb\t{%2, %1, %0|%0, %1, %2}"
+  "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
@@ -7102,14 +7102,14 @@ (define_insn "*sse4_1_pextrb_memory"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
-(define_insn "*sse2_pextrw"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (zero_extend:SI
+(define_insn "*sse2_pextrw_"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (zero_extend:SWI48
           (vec_select:HI
             (match_operand:V8HI 1 "register_operand" "x")
             (parallel [(match_operand:SI 2 "const_0_to_7_operand" "n")]))))]
   "TARGET_SSE2"
-  "%vpextrw\t{%2, %1, %0|%0, %1, %2}"
+  "%vpextrw\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_data16" "1")
    (set_attr "length_immediate" "1")
@@ -7142,6 +7142,20 @@ (define_insn "*sse4_1_pextrd"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
+(define_insn "*sse4_1_pextrd_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (zero_extend:DI
+          (vec_select:SI
+            (match_operand:V4SI 1 "register_operand" "x")
+            (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n")]))))]
+  "TARGET_64BIT && TARGET_SSE4_1"
+  "%vpextrd\t{%2, %1, %k0|%k0, %1, %2}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "TI")])
+
 ;; It must come before *vec_extractv2di_1_sse since it is preferred.
 (define_insn "*sse4_1_pextrq"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
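
For illustration only, and not part of the patch: a minimal C sketch of
source code with the shape the widened *sse2_pextrw pattern describes,
i.e. one 16-bit vector lane extracted and zero-extended to 64 bits.  The
helper name extract_lane3_zext and the lane index 3 are hypothetical.
Whether GCC really emits a single pextrw here, with no separate
zero-extending move afterwards, depends on the compiler version and
options, so treat that as an assumption to check in the generated
assembly.  The non-SSE2 branch exists only to keep the sketch portable.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: return 16-bit lane 3 of an 8-lane vector,
   zero-extended to 64 bits.  */
static uint64_t
extract_lane3_zext (const uint16_t lanes[8])
{
#if defined(__SSE2__)
  /* Unaligned load of the eight 16-bit lanes into an SSE register.  */
  __m128i v = _mm_loadu_si128 ((const __m128i *) lanes);
  /* _mm_extract_epi16 maps to pextrw; the cast to uint16_t makes the
     zero extension explicit before widening to 64 bits.  */
  return (uint64_t) (uint16_t) _mm_extract_epi16 (v, 3);
#else
  /* Portable fallback with the same result.  */
  return (uint64_t) lanes[3];
#endif
}
```

Compiling this with -O2 on x86_64 and inspecting the assembly is one way
to see whether the extraction and the widening were combined.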