From patchwork Fri Jan 13 16:28:33 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 715153
To: GCC Patches
Cc: Segher Boessenkool, David Edelsohn
From: Bill Schmidt
Subject: [PATCH, rs6000] Fix swap optimization to handle __builtin_vsx_xxspltd
Date: Fri, 13 Jan 2017 10:28:33 -0600

Hi,

There is a gap in swap optimization: it does not properly handle code
generated by __builtin_vsx_xxspltd.  This builtin is expanded into an
UNSPEC_VSX_XXSPLTD, which is currently treated as ok to swap with no
special handling.  It should instead be treated as ok to swap, with
special handling to modify the lane used as the source of the splat.
We have existing code to do this for other splat forms, so the patch is
quite simple.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk?  We also require backports to the
GCC 5 and 6 branches.

Thanks,
Bill


[gcc]

2017-01-13  Bill Schmidt

	* config/rs6000/rs6000.c (rtx_is_swappable_p): Change
	UNSPEC_VSX_XXSPLTD to require special splat handling.

[gcc/testsuite]

2017-01-13  Bill Schmidt

	* gcc.target/powerpc/swaps-p8-27.c: New.


Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 244382)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -41271,6 +41271,7 @@ rtx_is_swappable_p (rtx op, unsigned int *special)
 	  case UNSPEC_VSX_VEC_INIT:
 	    return 0;
 	  case UNSPEC_VSPLT_DIRECT:
+	  case UNSPEC_VSX_XXSPLTD:
 	    *special = SH_SPLAT;
 	    return 1;
 	  case UNSPEC_REDUC_PLUS:
Index: gcc/testsuite/gcc.target/powerpc/swaps-p8-27.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/swaps-p8-27.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/swaps-p8-27.c	(working copy)
@@ -0,0 +1,36 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O3 " } */
+/* { dg-final { scan-assembler-times "lxvd2x" 2 } } */
+/* { dg-final { scan-assembler-times "stxvd2x" 1 } } */
+/* { dg-final { scan-assembler-times "xxpermdi" 3 } } */
+
+/* Verify that swap optimization works correctly for a VSX direct splat.
+   The three xxpermdi's that are generated correspond to two splats
+   and the __builtin_vsx_xxpermdi.  */
+
+int printf (const char *__restrict __format, ...);
+typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
+
+double s1[] = {2134.3343, 6678.346};
+double s2[] = {41124.234, 6678.346};
+long long dd[] = {1, 2}, d[2];
+union{long long l[2]; double d[2];} e;
+
+void
+foo ()
+{
+  __m128d source1, source2, dest;
+  __m128d a, b, c;
+
+  e.d[1] = s1[1];
+  e.l[0] = !__builtin_isunordered(s1[0], s2[0])
+    && s1[0] == s2[0] ? -1 : 0;
+  source1 = __builtin_vec_vsx_ld (0, s1);
+  source2 = __builtin_vec_vsx_ld (0, s2);
+  a = __builtin_vec_splat (source1, 0);
+  b = __builtin_vec_splat (source2, 0);
+  c = (__m128d)__builtin_vec_cmpeq (a, b);
+  dest = __builtin_vsx_xxpermdi (source1, c, 1);
+  *(__m128d *)d = dest;
+}
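
For reviewers who want to see why a splat needs its source lane
adjusted once the surrounding swaps are removed, here is a small
standalone C model.  It is only a sketch and is not code from
rs6000.c: the type vec2d and the helpers swap_doublewords, splat,
adjust_splat_lane, same_vec and check_splat_adjustment are hypothetical
names invented for this illustration, and the register is modeled as a
plain two-element array.  The assumption is that removing the swaps
leaves the splat input with its two doublewords exchanged, so the lane
selector has to move by half the element count.

```c
#include <assert.h>

/* Hypothetical model (not GCC source): a 128-bit register viewed as
   two double lanes.  */
#define N_ELTS 2

typedef struct { double lane[N_ELTS]; } vec2d;

/* Exchange the two 64-bit halves of the register.  */
static vec2d swap_doublewords(vec2d v)
{
    vec2d r = { { v.lane[1], v.lane[0] } };
    return r;
}

/* Replicate the selected lane into every lane.  */
static vec2d splat(vec2d v, int lane)
{
    vec2d r = { { v.lane[lane], v.lane[lane] } };
    return r;
}

/* Lane selector to use when the splat input is doubleword-swapped:
   move the selector by half the number of elements.  */
static int adjust_splat_lane(int lane, int n_elts)
{
    int half = n_elts / 2;
    return lane >= half ? lane - half : lane + half;
}

/* Lane-by-lane equality of two model registers.  */
static int same_vec(vec2d a, vec2d b)
{
    return a.lane[0] == b.lane[0] && a.lane[1] == b.lane[1];
}

/* Check that a splat of a swapped input with the adjusted lane gives
   the same result as the original splat of the unswapped input.  */
static void check_splat_adjustment(vec2d v)
{
    for (int lane = 0; lane < N_ELTS; lane++) {
        vec2d expected = splat(v, lane);
        vec2d actual = splat(swap_doublewords(v),
                             adjust_splat_lane(lane, N_ELTS));
        assert(same_vec(expected, actual));
    }
}
```

Without the adjustment, a splat of lane 0 on the swapped input would
replicate what was originally lane 1, which is the kind of wrong-lane
result the patch is meant to prevent.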