From patchwork Tue Aug 15 21:14:21 2017
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 801781
To: GCC Patches
Cc: Segher Boessenkool, David Edelsohn, cel@linux.vnet.ibm.com
From: Bill Schmidt
Subject: [PATCH, rs6000] Fix endianness issue with vmrgew and vmrgow
 permute constant recognition
Date: Tue, 15 Aug 2017 16:14:21 -0500

Hi,

One of Carl Love's proposed built-in function patches exposed a bug in
the Power code that recognizes specific permute control vector patterns
for a permute and replaces the permute with a more specific, more
efficient instruction.  The patterns for p8_vmrgew_v4si and p8_vmrgow
are generated regardless of endianness, leading to problems on the
little-endian port.

The normal way that we would come to generate these patterns is via the
vec_widen_[su]mult_{even,odd}_<mode> interfaces, which are not yet
instantiated for Power; hence it appears that we've been lucky not to
run into this before.  Carl's proposed patch instantiated these
interfaces, which is how the problem was discovered.

This patch simply changes the handling for p8_vmrg[eo]w to match how
it's done for all of the other common pack/merge/etc. patterns.

In altivec.md, we already had a p8_vmrgew_v4sf_direct insn that does
what we want.  I generalized this for both V4SF and V4SI modes.  I then
added a similar p8_vmrgow_<mode>_direct define_insn.

The use in rs6000.c of p8_vmrgew_v4sf_direct, rather than
p8_vmrgew_v4si_direct, is arbitrary.  The existing code already handles
converting (for free) a V4SI operand to a V4SF one, so there's no need
to specify the mode directly; and it would actually complicate the code
to extract the mode so that the "proper" pattern would match.  I think
what I have here is better, but if you disagree I can change it.

Bootstrapped and tested on powerpc64le-linux-gnu (P8 64-bit) and on
powerpc64-linux-gnu (P7 32- and 64-bit) with no regressions.  Is this
okay for trunk?

Thanks,
Bill


2017-08-15  Bill Schmidt

	* config/rs6000/altivec.md (UNSPEC_VMRGOW_DIRECT): New constant.
	(p8_vmrgew_v4sf_direct): Generalize to p8_vmrgew_<mode>_direct.
	(p8_vmrgow_<mode>_direct): New define_insn.
	* config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Properly
	handle endianness for vmrgew and vmrgow permute patterns.

Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 250965)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -148,6 +148,7 @@
    UNSPEC_VMRGL_DIRECT
    UNSPEC_VSPLT_DIRECT
    UNSPEC_VMRGEW_DIRECT
+   UNSPEC_VMRGOW_DIRECT
    UNSPEC_VSUMSWS_DIRECT
    UNSPEC_VADDCUQ
    UNSPEC_VADDEUQM
@@ -1357,15 +1358,24 @@
 }
   [(set_attr "type" "vecperm")])
 
-(define_insn "p8_vmrgew_v4sf_direct"
-  [(set (match_operand:V4SF 0 "register_operand" "=v")
-        (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")
-                      (match_operand:V4SF 2 "register_operand" "v")]
+(define_insn "p8_vmrgew_<mode>_direct"
+  [(set (match_operand:VSX_W 0 "register_operand" "=v")
+        (unspec:VSX_W [(match_operand:VSX_W 1 "register_operand" "v")
+                       (match_operand:VSX_W 2 "register_operand" "v")]
                      UNSPEC_VMRGEW_DIRECT))]
   "TARGET_P8_VECTOR"
   "vmrgew %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
+(define_insn "p8_vmrgow_<mode>_direct"
+  [(set (match_operand:VSX_W 0 "register_operand" "=v")
+        (unspec:VSX_W [(match_operand:VSX_W 1 "register_operand" "v")
+                       (match_operand:VSX_W 2 "register_operand" "v")]
+                     UNSPEC_VMRGOW_DIRECT))]
+  "TARGET_P8_VECTOR"
+  "vmrgow %0,%1,%2"
+  [(set_attr "type" "vecperm")])
+
 (define_expand "vec_widen_umult_even_v16qi"
   [(use (match_operand:V8HI 0 "register_operand" ""))
    (use (match_operand:V16QI 1 "register_operand" ""))
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 250965)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -35209,9 +35209,13 @@ altivec_expand_vec_perm_const (rtx operands[4])
       (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw_direct
        : CODE_FOR_altivec_vmrghw_direct),
       {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
-    { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew_v4si,
+    { OPTION_MASK_P8_VECTOR,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgew_v4sf_direct
+       : CODE_FOR_p8_vmrgow_v4sf_direct),
       {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
-    { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgow,
+    { OPTION_MASK_P8_VECTOR,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgow_v4sf_direct
+       : CODE_FOR_p8_vmrgew_v4sf_direct),
       {  4,  5,  6,  7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31 } }
   };
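
For reviewers who want to convince themselves of the even/odd exchange in the rs6000.c hunk, here is a small host-side model in plain C.  It is only a sketch and is not part of the patch: the type vreg and the helpers load_elems, store_elems, perm_ref and vmrg_choice_matches are made up for illustration, and the word-level behavior of vmrgew and vmrgow is written from the instruction descriptions rather than taken from GCC.  The two byte selectors from the patch are reduced to word selectors {0, 4, 2, 6} and {1, 5, 3, 7}.  In this model the little-endian case also needs the two inputs exchanged; that is an assumption of the sketch, since the hunk itself shows only the instruction choice.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A vector register modelled as four 32-bit words, indexed in the
   big-endian register numbering (word 0 is the leftmost word).  */
typedef struct { uint32_t w[4]; } vreg;

/* Model of vmrgew: interleave the even words (0 and 2) of a and b.  */
static vreg vmrgew (vreg a, vreg b)
{
  vreg t = { { a.w[0], b.w[0], a.w[2], b.w[2] } };
  return t;
}

/* Model of vmrgow: interleave the odd words (1 and 3) of a and b.  */
static vreg vmrgow (vreg a, vreg b)
{
  vreg t = { { a.w[1], b.w[1], a.w[3], b.w[3] } };
  return t;
}

/* Array element i lands in register word i on big-endian and in
   register word 3 - i on little-endian.  */
static vreg load_elems (const uint32_t e[4], int big_endian)
{
  vreg r;
  for (int i = 0; i < 4; i++)
    r.w[big_endian ? i : 3 - i] = e[i];
  return r;
}

/* Inverse of load_elems.  */
static void store_elems (vreg r, int big_endian, uint32_t e[4])
{
  for (int i = 0; i < 4; i++)
    e[i] = r.w[big_endian ? i : 3 - i];
}

/* Reference permute in array-element terms: selector values 0..3 pick
   from op0, values 4..7 pick from op1.  */
static void perm_ref (const uint32_t op0[4], const uint32_t op1[4],
                      const int sel[4], uint32_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = sel[i] < 4 ? op0[sel[i]] : op1[sel[i] - 4];
}

/* Return 1 if the instruction chosen for each selector reproduces the
   reference permute for the given endianness, 0 otherwise.  */
int vmrg_choice_matches (int big_endian)
{
  static const int sel_even[4] = { 0, 4, 2, 6 }; /* bytes 0-3,16-19,8-11,24-27 */
  static const int sel_odd[4]  = { 1, 5, 3, 7 }; /* bytes 4-7,20-23,12-15,28-31 */
  const uint32_t op0[4] = { 0xA0, 0xA1, 0xA2, 0xA3 };
  const uint32_t op1[4] = { 0xB0, 0xB1, 0xB2, 0xB3 };
  uint32_t want[4], got[4];
  vreg r0 = load_elems (op0, big_endian);
  vreg r1 = load_elems (op1, big_endian);

  /* First selector: vmrgew on BE, vmrgow with swapped inputs on LE.  */
  perm_ref (op0, op1, sel_even, want);
  store_elems (big_endian ? vmrgew (r0, r1) : vmrgow (r1, r0),
               big_endian, got);
  if (memcmp (want, got, sizeof want) != 0)
    return 0;

  /* Second selector: vmrgow on BE, vmrgew with swapped inputs on LE.  */
  perm_ref (op0, op1, sel_odd, want);
  store_elems (big_endian ? vmrgow (r0, r1) : vmrgew (r1, r0),
               big_endian, got);
  return memcmp (want, got, sizeof want) == 0;
}
```

Calling vmrg_choice_matches (1) and vmrg_choice_matches (0) from a test driver exercises the big-endian and little-endian rows of the table respectively.  Keeping the registers in big-endian word numbering throughout and confining the endianness to load_elems and store_elems makes it easy to see that "even" elements in memory order sit in the odd register words on little-endian.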