From patchwork Tue Jun 21 20:14:51 2016
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 638874
From: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
Subject: [PATCH, rs6000] Prefer vspltisw/h over xxspltib+instruction when available
Date: Tue, 21 Jun 2016 15:14:51 -0500
Cc: Segher Boessenkool, David Edelsohn
To: GCC Patches <gcc-patches@gcc.gnu.org>
Message-Id: <73436B77-BC85-4919-975E-915A6D93F585@linux.vnet.ibm.com>

Hi,

I discovered recently that, with -mcpu=power9, an attempt to generate a
vspltish instruction resulted instead in an xxspltib followed by a vupkhsb.
This is semantically correct, but the extra instruction is suboptimal.  I
found that there was some logic in xxspltib_constant_p to do special casing
for const_vector with small constants, but not for vec_duplicate with small
constants.  This patch duplicates that logic so we can generate the single
instruction when possible.

When I did this, I ran into a problem with an existing test case.  We end up
matching the *vsx_splat_v4si_internal pattern instead of falling back to the
altivec_vspltisw pattern.  The constraints don't match for constant input.
To avoid this, I added a pattern ahead of this one that will match for VMX
output registers and produce the vspltisw as desired.  This corrected the
failing test and produces the expected code.

I've added a test case to demonstrate that the code now works properly in
the usual case.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu.  OK for trunk,
and for 6.2 after suitable burn-in?

Thanks!
Bill


[gcc]

2016-06-21  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

        * config/rs6000/rs6000.c (xxspltib_constant_p): Prefer vspltisw/h
        for vec_duplicate when this is cheaper.
        * config/rs6000/vsx.md (*vsx_splat_v4si_altivec): New define_insn.

[gcc/testsuite]

2016-06-21  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

        * gcc.target/powerpc/splat-p9-1.c: New test.


Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 237619)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -6329,6 +6329,13 @@ xxspltib_constant_p (rtx op,
       value = INTVAL (element);
       if (!IN_RANGE (value, -128, 127))
         return false;
+
+      /* See if we could generate vspltisw/vspltish directly instead of
+         xxspltib + sign extend.  Special case 0/-1 to allow getting
+         any VSX register instead of an Altivec register.  */
+      if (!IN_RANGE (value, -1, 0) && EASY_VECTOR_15 (value)
+          && (mode == V4SImode || mode == V8HImode))
+        return false;
     }
 
   /* Handle (const_vector [...]).  */
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md (revision 237619)
+++ gcc/config/rs6000/vsx.md (working copy)
@@ -2400,6 +2400,17 @@
   operands[1] = force_reg (mode, operands[1]);
 })
 
+;; The pattern following this one hides altivec_vspltisw, which we
+;; prefer to match when possible, so duplicate that here for
+;; TARGET_P9_VECTOR.
+(define_insn "*vsx_splat_v4si_altivec"
+  [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
+        (vec_duplicate:V4SI
+         (match_operand:QI 1 "s5bit_cint_operand" "i")))]
+  "TARGET_P9_VECTOR"
+  "vspltisw %0,%1"
+  [(set_attr "type" "vecperm")])
+
 (define_insn "*vsx_splat_v4si_internal"
   [(set (match_operand:V4SI 0 "vsx_register_operand" "=wa,wa")
         (vec_duplicate:V4SI
Index: gcc/testsuite/gcc.target/powerpc/splat-p9-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/splat-p9-1.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/splat-p9-1.c (working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-maltivec -mcpu=power9" } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-final { scan-assembler "vspltish" } } */
+/* { dg-final { scan-assembler-not "xxspltib" } } */
+
+/* Make sure we don't use an inefficient sequence for small integer splat.  */
+
+#include <altivec.h>
+
+vector short
+foo ()
+{
+  return vec_splat_s16 (5);
+}
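
For readers who want to see which constants the rs6000.c hunk affects, here is a minimal standalone sketch of the new vec_duplicate check.  It is not the GCC code: the enum, fits_5bit_signed and prefer_xxspltib are hypothetical names made up for illustration, the -16..15 range is assumed to be what EASY_VECTOR_15 tests, and everything else xxspltib_constant_p does is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the vector modes involved; not GCC's
   machine_mode values.  */
enum elt_mode { ELT_V16QI, ELT_V8HI, ELT_V4SI };

/* Assumption: models EASY_VECTOR_15, i.e. the signed 5-bit immediate
   range that vspltish/vspltisw can encode.  */
static bool
fits_5bit_signed (long value)
{
  return value >= -16 && value <= 15;
}

/* Simplified model of the vec_duplicate path after the patch.
   Returns true when the splat would still be treated as an xxspltib
   constant (with a sign extend for wider elements), false when it is
   left for another pattern such as vspltish/vspltisw.  */
static bool
prefer_xxspltib (long value, enum elt_mode mode)
{
  /* xxspltib only takes a signed byte immediate.  */
  if (value < -128 || value > 127)
    return false;

  /* New check: small constants in halfword/word vectors are cheaper
     as a single vspltish/vspltisw.  0 and -1 stay with xxspltib so
     any VSX register can be used.  */
  if (!(value >= -1 && value <= 0)
      && fits_5bit_signed (value)
      && (mode == ELT_V4SI || mode == ELT_V8HI))
    return false;

  return true;
}
```

Under these assumptions, prefer_xxspltib (5, ELT_V8HI) is false, which corresponds to the vspltish the new test scans for, while a value such as 100 in a V4SI vector still takes the xxspltib route because it does not fit the 5-bit immediate.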