From patchwork Mon Jan 26 00:44:21 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 432640
Message-ID: <1422233061.321.33.camel@gnopaine>
Subject: [PATCH, rs6000] Allow swap removal for convert-splat idiom
From: Bill Schmidt
To: gcc-patches@gcc.gnu.org
Cc: dje.gcc@gmail.com
Date: Sun, 25 Jan 2015 18:44:21 -0600

Hi,

A common idiom on Power for vector floating-point computation converts
a double-precision value to single precision and copies it to all
elements of a vector float.  For this we see a specific convert UNSPEC
feeding an xxspltw pattern that copies from BE element zero.
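
As a rough illustration of the idiom, here is a small stand-alone C model.
It is not GCC code: the vec4f type and all helper names are hypothetical,
and zeroing the unused lanes in the convert is an assumption made only to
keep the model simple.  It sketches a convert that places its result in BE
element zero, a word splat, and a doubleword swap.

```c
#include <stdbool.h>

/* Hypothetical model of a 4 x float vector register; lane[0] is BE
   element zero.  */
typedef struct { float lane[4]; } vec4f;

/* Models the convert: the single-precision result lands in BE element
   zero.  The other lanes are zeroed here only to keep the model simple
   (an assumption of this sketch).  */
static vec4f convert_to_be_elem0 (double d)
{
  vec4f v = { { (float) d, 0.0f, 0.0f, 0.0f } };
  return v;
}

/* Models xxspltw: replicate lane IDX into all four lanes.  */
static vec4f splat_word (vec4f v, int idx)
{
  vec4f r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = v.lane[idx];
  return r;
}

/* Models a doubleword swap: exchange the two 64-bit halves.  */
static vec4f swap_doublewords (vec4f v)
{
  vec4f r = { { v.lane[2], v.lane[3], v.lane[0], v.lane[1] } };
  return r;
}

/* Lane-by-lane comparison of two model vectors.  */
static bool vec_equal (vec4f a, vec4f b)
{
  for (int i = 0; i < 4; i++)
    if (a.lane[i] != b.lane[i])
      return false;
  return true;
}

/* True when splatting BE element zero of the converted value gives a
   vector that a doubleword swap leaves unchanged.  */
static bool splat_is_swap_invariant (double d)
{
  vec4f s = splat_word (convert_to_be_elem0 (d), 0);
  return vec_equal (s, swap_doublewords (s));
}
```

In this model the splatted vector compares equal to its swapped form for
any non-NaN input, while the converted vector on its own does not when the
input is nonzero.  That is why the convert can only be left alone when
every one of its uses is a splat of BE element zero.
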
Since all elements of the result are the same regardless of whether
swaps are present, this should not kill the vector swap removal
optimization for the containing computation.  This patch permits that.

The issue was reported privately to me, and I have created a test case
that reduces and anonymizes the original code.

Is this ok for trunk after GCC 5 branches?  I would also like to
backport it to GCC 5 subsequently.

Thanks,
Bill


[gcc]

2015-01-25  Bill Schmidt

        * config/rs6000/rs6000.c (rtx_is_swappable_p): Commentary
        adjustments.
        (insn_is_swappable_p): Return 1 for a convert from double to
        single precision when all of its uses are splats of BE element
        zero.

[gcc/testsuite]

2015-01-25  Bill Schmidt

        * gcc.target/powerpc/swaps-p8-18.c: New test.


Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 219191)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -34046,7 +34046,8 @@ rtx_is_swappable_p (rtx op, unsigned int *special)
            order-dependent element, so additional fixup code would be
            needed to make those work.  Vector set and non-immediate-form
            vector splat are element-order sensitive.  A few of these
-           cases might be workable with special handling if required.  */
+           cases might be workable with special handling if required.
+           Adding cost modeling would be appropriate in some cases.  */
         int val = XINT (op, 1);
         switch (val)
           {
@@ -34085,12 +34086,6 @@ rtx_is_swappable_p (rtx op, unsigned int *special)
           case UNSPEC_VUPKLPX:
           case UNSPEC_VUPKLS_V4SF:
           case UNSPEC_VUPKLU_V4SF:
-          /* The following could be handled as an idiom with XXSPLTW.
-             These place a scalar in BE element zero, but the XXSPLTW
-             will currently expect it in BE element 2 in a swapped
-             region.  When one of these feeds an XXSPLTW with no other
-             defs/uses either way, we can avoid the lane change for
-             XXSPLTW and things will be correct.  TBD.  */
           case UNSPEC_VSX_CVDPSPN:
           case UNSPEC_VSX_CVSPDP:
           case UNSPEC_VSX_CVSPDPN:
@@ -34179,6 +34174,36 @@ insn_is_swappable_p (swap_web_entry *insn_entry, r
       return 0;
     }
 
+  /* A convert to single precision can be left as is provided that
+     all of its uses are in xxspltw instructions that splat BE element
+     zero.  */
+  if (GET_CODE (body) == SET
+      && GET_CODE (SET_SRC (body)) == UNSPEC
+      && XINT (SET_SRC (body), 1) == UNSPEC_VSX_CVDPSPN)
+    {
+      df_ref def;
+      struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+
+      FOR_EACH_INSN_INFO_DEF (def, insn_info)
+        {
+          struct df_link *link = DF_REF_CHAIN (def);
+          if (!link)
+            return 0;
+
+          for (; link; link = link->next) {
+            rtx use_insn = DF_REF_INSN (link->ref);
+            rtx use_body = PATTERN (use_insn);
+            if (GET_CODE (use_body) != SET
+                || GET_CODE (SET_SRC (use_body)) != UNSPEC
+                || XINT (SET_SRC (use_body), 1) != UNSPEC_VSX_XXSPLTW
+                || XEXP (XEXP (SET_SRC (use_body), 0), 1) != const0_rtx)
+              return 0;
+          }
+        }
+
+      return 1;
+    }
+
   /* Otherwise check the operands for vector lane violations.  */
   return rtx_is_swappable_p (body, special);
 }
Index: gcc/testsuite/gcc.target/powerpc/swaps-p8-18.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/swaps-p8-18.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/swaps-p8-18.c (working copy)
@@ -0,0 +1,35 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O3" } */
+/* { dg-final { scan-assembler-not "xxpermdi" } } */
+
+/* This is a test for a specific convert-splat permute removal.  */
+
+void compute (float*, float*, float*, int, int);
+double test (void);
+double gorp;
+
+int main (void)
+{
+  float X[10000], Y[256], Z[2000];
+  int i;
+  for (i = 0; i < 2500; i++)
+    compute (X, Y, Z, 256, 2000);
+  gorp = test ();
+}
+
+void compute(float *X, float *Y, float *Z, int m, int n)
+{
+  int i, j;
+  float w, *x, *y;
+
+  for (i = 0; i < n; i++)
+    {
+      w = 0.0;
+      x = X++;
+      y = Y;
+      for (j = 0; j < m; j++)
+        w += (*x++) * (*y++);
+      Z[i] = w;
+    }
+}