From patchwork Thu Nov 7 03:22:12 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 1190879
Subject: [PATCH, rs6000 v2] Make load cost more in vectorization cost for P8/P9
To: Segher Boessenkool
Cc: GCC Patches, Bill Schmidt
References: <562230cb-ec3c-e46c-f59f-b7d69f3000b7@linux.ibm.com>
 <804b71d6-40c3-7c0d-8bfa-b347a7b7fda4@linux.ibm.com>
 <20191104202110.GF16031@gate.crashing.org>
 <20191106173833.GQ16031@gate.crashing.org>
From: "Kewen.Lin"
Date: Thu, 7 Nov 2019 11:22:12 +0800
In-Reply-To: <20191106173833.GQ16031@gate.crashing.org>
Message-Id: <2e5accd9-ebde-2bbc-02df-f0ad10776c73@linux.ibm.com>

Hi Segher,

on 2019/11/7 1:38 AM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Nov 05, 2019 at 10:14:46AM +0800, Kewen.Lin wrote:
>>>> +     benefits were observed on Power8 and up, we can unify it if similar
>>>> +     profits are measured on Power6 and Power7.  */
>>>> +  if (TARGET_P8_VECTOR)
>>>> +    return 2;
>>>> +  else
>>>> +    return 1;
>>>
>>> Hrm, but you showed benchmark improvements for p9 as well?
>>>
>>
>> No significant gains but no degradation as well, so I thought it's fine to align
>> it together.  Does it make sense?
>
> It's a bit strange at this point to do tunings for p8 that we do not
> do for later cpus.
>
>>> What happens if you enable this for everything as well?
>>
>> My concern was that if we enable it for everything, it's possible to introduce
>> degradation for some benchmarks on P6 or P7 where we didn't evaluate the
>> performance impact.
>
> No one cares about p6.

OK.  :)

>
> We reasonably expect it will work just as well on p7 as on p8 and later.
> That you haven't tested on p7 yet says something about how important that
> platform is now ;-)
>

Yes, exactly.

>> Although it's reasonable from the point of view of load latency,
>> it's possible to get worse results in the actual benchmarks based on my
>> fine-grained cost adjustment experiment before.
>>
>> Or do you suggest enabling it everywhere and solve the degradation issue if exposed?
>> I'm also fine with that.  :)
>
> Yeah, let's just enable it everywhere.

One updated patch to enable it everywhere attached.

BR,
Kewen

-------------------------------------------
gcc/ChangeLog

2019-11-07  Kewen Lin

	* config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Make
	scalar_load, vector_load, unaligned_load and vector_gather_load cost
	more to conform to hardware latency and insn cost settings.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5876714..1094fbd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4763,15 +4763,17 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
   switch (type_of_cost)
     {
       case scalar_stmt:
-      case scalar_load:
       case scalar_store:
       case vector_stmt:
-      case vector_load:
       case vector_store:
       case vec_to_scalar:
       case scalar_to_vec:
       case cond_branch_not_taken:
         return 1;
+      case scalar_load:
+      case vector_load:
+        /* Like rs6000_insn_cost, make load insns cost a bit more.  */
+        return 2;
 
       case vec_perm:
         /* Power7 has only one permute unit, make it a bit expensive.  */
@@ -4792,42 +4794,44 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 
       case unaligned_load:
       case vector_gather_load:
+        /* Like rs6000_insn_cost, make load insns cost a bit more.  */
         if (TARGET_EFFICIENT_UNALIGNED_VSX)
-          return 1;
-
-        if (TARGET_VSX && TARGET_ALLOW_MOVMISALIGN)
-          {
-            elements = TYPE_VECTOR_SUBPARTS (vectype);
-            if (elements == 2)
-              /* Double word aligned.  */
-              return 2;
-
-            if (elements == 4)
-              {
-                switch (misalign)
-                  {
-                    case 8:
-                      /* Double word aligned.  */
-                      return 2;
+          return 2;
 
-                    case -1:
-                      /* Unknown misalignment.  */
-                    case 4:
-                    case 12:
-                      /* Word aligned.  */
-                      return 22;
+        if (TARGET_VSX && TARGET_ALLOW_MOVMISALIGN)
+          {
+            elements = TYPE_VECTOR_SUBPARTS (vectype);
+            if (elements == 2)
+              /* Double word aligned.  */
+              return 4;
 
-                    default:
-                      gcc_unreachable ();
-                  }
-              }
-          }
+            if (elements == 4)
+              {
+                switch (misalign)
+                  {
+                    case 8:
+                      /* Double word aligned.  */
+                      return 4;
+
+                    case -1:
+                      /* Unknown misalignment.  */
+                    case 4:
+                    case 12:
+                      /* Word aligned.  */
+                      return 44;
+
+                    default:
+                      gcc_unreachable ();
+                  }
+              }
+          }
 
-        if (TARGET_ALTIVEC)
-          /* Misaligned loads are not supported.  */
-          gcc_unreachable ();
+        if (TARGET_ALTIVEC)
+          /* Misaligned loads are not supported.  */
+          gcc_unreachable ();
 
-        return 2;
+        /* Like rs6000_insn_cost, make load insns cost a bit more.  */
+        return 4;
 
       case unaligned_store:
       case vector_scatter_store:
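
For readers following the second hunk, here is a minimal standalone sketch (not part of the patch) of how the unaligned_load / vector_gather_load cost is chosen after this change. The struct, its field names, the function name and COST_UNREACHABLE are hypothetical stand-ins: the real code tests the TARGET_* macros directly and calls gcc_unreachable () where the sketch returns COST_UNREACHABLE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker for the paths where rs6000.c calls gcc_unreachable ().  */
#define COST_UNREACHABLE (-1)

/* Hypothetical stand-ins for the TARGET_* macros tested in rs6000.c.  */
struct target_flags
{
  bool efficient_unaligned_vsx; /* TARGET_EFFICIENT_UNALIGNED_VSX */
  bool vsx_movmisalign;         /* TARGET_VSX && TARGET_ALLOW_MOVMISALIGN */
  bool altivec;                 /* TARGET_ALTIVEC */
};

/* Post-patch cost of an unaligned or gather load.  ELEMENTS is the number
   of vector subparts, MISALIGN the known byte misalignment or -1.  */
int
unaligned_load_cost (struct target_flags t, int elements, int misalign)
{
  if (t.efficient_unaligned_vsx)
    return 2;                   /* was 1 before the patch */

  if (t.vsx_movmisalign)
    {
      if (elements == 2)
        return 4;               /* double word aligned, was 2 */

      if (elements == 4)
        switch (misalign)
          {
          case 8:
            return 4;           /* double word aligned, was 2 */
          case -1:              /* unknown misalignment */
          case 4:
          case 12:
            return 44;          /* word aligned, was 22 */
          default:
            return COST_UNREACHABLE;
          }
    }

  if (t.altivec)
    return COST_UNREACHABLE;    /* misaligned loads are not supported */

  return 4;                     /* was 2 */
}

/* A few spot checks of the values taken from the diff.  */
void
check_unaligned_load_cost (void)
{
  struct target_flags p8 = { false, true, true };
  assert (unaligned_load_cost (p8, 4, 8) == 4);
  assert (unaligned_load_cost (p8, 4, -1) == 44);
}
```

Every unaligned-load cost in the hunk is doubled (1 to 2, 2 to 4, 22 to 44), which keeps the relative ordering of the cases unchanged while making loads cost more than the remaining cost-1 statements.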