From patchwork Mon Oct 21 16:39:54 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 285234
Message-ID: <1382373594.6275.164.camel@gnopaine>
Subject: Re: [PATCH, rs6000] Be careful with special permute masks for little endian
From: Bill Schmidt
To: David Edelsohn
Cc: GCC Patches
Date: Mon, 21 Oct 2013 11:39:54 -0500
In-Reply-To:
References: <1382359742.6275.162.camel@gnopaine>

Whoops, looks like I missed some simpler cases (REG with the wrong mode
instead of SUBREG with the wrong mode).  Is this revised version ok,
assuming it passes testing?
It should fix a few more test cases.  The changed code from the
previous version is in the last hunk.

Thanks,
Bill

On Mon, 2013-10-21 at 10:02 -0400, David Edelsohn wrote:
> On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt wrote:
> > Hi,
> >
> > In altivec_expand_vec_perm_const, we look for special masks that match
> > the behavior of specific instructions, so we can use those instructions
> > rather than load a constant control vector and perform a permute.  Some
> > of the masks must be treated differently for little endian mode.
> >
> > The masks that represent merge-high and merge-low operations have
> > reversed meanings in little-endian, because of the reversed ordering of
> > the vector elements.
> >
> > The masks that represent vector-pack operations remain correct when the
> > mode of the input operands matches the natural mode of the instruction,
> > but not otherwise.  This is because the pack instructions always select
> > the rightmost, low-order bits of the vector element.  There are cases
> > where we use this, for example, with a V8SI vector matching a vpkuwum
> > mask in order to select the odd-numbered elements of the vector.  In
> > little endian mode, this instruction will get us the even-numbered
> > elements instead.  There is no alternative instruction with the desired
> > behavior, so I've just disabled use of those masks for little endian
> > when the mode isn't natural.
> >
> > These changes fix 32 failures in the test suite for little endian mode.
> > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
> > failures.  Is this ok for trunk?
> >
> > Thanks,
> > Bill
> >
> >
> > 2013-10-21  Bill Schmidt
> >
> >         * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
> >         meaning of merge-high and merge-low masks for little endian; avoid
> >         use of vector-pack masks for little endian for mismatched modes.
>
> Okay.
>
> Thanks, David
>

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 203792)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
       { 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
     { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
       { 2, 3, 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
       { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
       { 0, 1, 16, 17, 2, 3, 18, 19, 4, 5, 20, 21, 6, 7, 22, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
       { 0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
       { 8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
       { 8, 9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
       { 8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
     { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
       { 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 } },
@@ -28980,6 +28987,26 @@ altivec_expand_vec_perm_const (rtx operands[4])
           enum machine_mode omode = insn_data[icode].operand[0].mode;
           enum machine_mode imode = insn_data[icode].operand[1].mode;
 
+          /* For little-endian, don't use vpkuwum and vpkuhum if the
+             underlying vector type is not V4SI and V8HI, respectively.
+             For example, using vpkuwum with a V8HI picks up the even
+             halfwords (BE numbering) when the even halfwords (LE
+             numbering) are what we need.  */
+          if (!BYTES_BIG_ENDIAN
+              && icode == CODE_FOR_altivec_vpkuwum
+              && ((GET_CODE (op0) == REG
+                   && GET_MODE (op0) != V4SImode)
+                  || (GET_CODE (op0) == SUBREG
+                      && GET_MODE (XEXP (op0, 0)) != V4SImode)))
+            continue;
+          if (!BYTES_BIG_ENDIAN
+              && icode == CODE_FOR_altivec_vpkuhum
+              && ((GET_CODE (op0) == REG
+                   && GET_MODE (op0) != V8HImode)
+                  || (GET_CODE (op0) == SUBREG
+                      && GET_MODE (XEXP (op0, 0)) != V8HImode)))
+            continue;
+
           /* For little-endian, the two input operands must be swapped
              (or swapped back) to ensure proper right-to-left numbering
              from 0 to 2N-1.  */