From patchwork Tue Sep 19 17:38:35 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 815737
To: GCC Patches, Richard Biener
From: Bill Schmidt
Subject: [PATCH] Fix PR82255 (vectorizer cost model overcounts some vector
 load costs)
Date: Tue, 19 Sep 2017 12:38:35 -0500
Message-Id: <7570cb71-cb74-d97f-3b7a-b161631e36c5@linux.vnet.ibm.com>

Hi,

https://gcc.gnu.org/PR82255 identifies a problem in the vector cost model
where a vectorized load is treated as
having the cost of a strided load in a case where we will not actually
generate a strided load.  This is simply a mismatch between the conditions
tested in the cost model and those tested in the code that generates
vectorized instructions.  This patch fixes the problem by recognizing when
only a single non-strided load will be generated and reporting the cost
accordingly.

I believe this patch is sufficient to catch all such cases, but I admit
that the code in vectorizable_load is complex enough that I could have
missed a trick.

I've added a test in the PowerPC cost model subdirectory.  Even though
this isn't a target-specific issue, the test does rely on a 16-byte
vector size, so this seems safest.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this ok for trunk?

Thanks!
Bill


[gcc]

2017-09-19  Bill Schmidt

	PR tree-optimization/82255
	* tree-vect-stmts.c (vect_model_load_cost): Don't count
	vec_construct cost when a true strided load isn't present.

[gcc/testsuite]

2017-09-19  Bill Schmidt

	PR tree-optimization/82255
	* gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c: New file.


Index: gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c	(working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+/* PR82255: Ensure we don't require a vec_construct cost when we aren't
+   going to generate a strided load.  */
+
+extern int abs (int __x) __attribute__ ((__nothrow__, __leaf__)) __attribute__ ((__const__));
+
+static int
+foo (unsigned char *w, int i, unsigned char *x, int j)
+{
+  int tot = 0;
+  for (int a = 0; a < 16; a++)
+    {
+      for (int b = 0; b < 16; b++)
+        tot += abs (w[b] - x[b]);
+      w += i;
+      x += j;
+    }
+  return tot;
+}
+
+void
+bar (unsigned char *w, unsigned char *x, int i, int *result)
+{
+  *result = foo (w, 16, x, i);
+}
+
+/* { dg-final { scan-tree-dump-times "vec_construct required" 0 "vect" } } */
+
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	(revision 252760)
+++ gcc/tree-vect-stmts.c	(working copy)
@@ -1091,8 +1091,20 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
                         prologue_cost_vec, body_cost_vec, true);
   if (memory_access_type == VMAT_ELEMENTWISE
       || memory_access_type == VMAT_STRIDED_SLP)
-    inside_cost += record_stmt_cost (body_cost_vec, ncopies, vec_construct,
-                                     stmt_info, 0, vect_body);
+    {
+      stmt_vec_info stmt_info = vinfo_for_stmt (first_stmt);
+      int group_size = GROUP_SIZE (stmt_info);
+      int nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
+      if (group_size < nunits)
+        {
+          if (dump_enabled_p ())
+            dump_printf_loc (MSG_NOTE, vect_location,
+                             "vect_model_load_cost: vec_construct required");
+          inside_cost += record_stmt_cost (body_cost_vec, ncopies,
+                                           vec_construct, stmt_info, 0,
+                                           vect_body);
+        }
+    }
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
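
For readers who want the cost decision in isolation, here is a minimal
standalone sketch of the condition the patch adds.  It is not GCC code:
the enum and the function needs_vec_construct are hypothetical stand-ins
for memory_access_type and the patched test in vect_model_load_cost, and
only the comparison group_size < nunits is taken from the patch itself.

```c
#include <assert.h>

/* Hypothetical stand-ins for the vectorizer's memory access kinds.
   Only the two kinds tested in the patch are distinguished here.  */
enum access_kind
{
  ACCESS_CONTIGUOUS,
  ACCESS_ELEMENTWISE,
  ACCESS_STRIDED_SLP
};

/* Return 1 when a vec_construct cost should be charged, i.e. when the
   vector has to be assembled from several smaller pieces, else 0.
   GROUP_SIZE is the number of accesses in the load group and NUNITS
   the number of elements per vector; the comparison mirrors the
   condition added by the patch.  */
int
needs_vec_construct (enum access_kind kind, int group_size, int nunits)
{
  assert (group_size > 0 && nunits > 0);

  /* Other access kinds never took this path, before or after the patch.  */
  if (kind != ACCESS_ELEMENTWISE && kind != ACCESS_STRIDED_SLP)
    return 0;

  /* A group that fills a whole vector needs no construction step.  */
  return group_size < nunits;
}
```

Assuming the 16 byte accesses of the testcase's inner loop form one group
and the vector holds 16 unsigned chars, the call
needs_vec_construct (ACCESS_STRIDED_SLP, 16, 16) returns 0, so no
vec_construct cost is charged.  A group of 4 with the same vector type
returns 1 and is charged as before.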