From patchwork Wed Oct 30 03:34:42 2019
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 1186465
From: Jiufu Guo
To: gcc-patches@gcc.gnu.org
Cc: guojiufu@linux.ibm.com, wschmidt@linux.ibm.com, segher@kernel.crashing.org, rguenther@suse.de
Subject: [PATCH V2] rs6000: Refine unroll factor with target unroll_adjust hook.
Date: Wed, 30 Oct 2019 11:34:42 +0800
Message-Id: <1572406482-33289-1-git-send-email-guojiufu@linux.ibm.com>

Hi,

This patch introduces a loop unroll adjust hook for powerpc.  In this
hook we can make target-specific heuristic adjustments to the unroll
factor.  For now it is tuned for -O2: small loops are unrolled with a
small unroll factor (2 times).  For other optimization levels the
default unroll factor is used.

Bootstrapped and regtested on powerpc64le.  Is this ok for trunk?

Jiufu

BR.

gcc/
2019-10-30  Jiufu Guo

	PR tree-optimization/88760
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
	code which changes PARAM_MAX_UNROLL_TIMES and
	PARAM_MAX_UNROLLED_INSNS.
	(TARGET_LOOP_UNROLL_ADJUST): Add loop unroll adjust hook.
	(rs6000_loop_unroll_adjust): New hook for loop unroll adjust.
	Unroll small loops 2 times for -O2.

gcc/testsuite/
2019-10-29  Jiufu Guo

	PR tree-optimization/88760
	* gcc.dg/pr59643.c: Update back to r277550.
	* gcc.dg/unroll-8.c: Update cases.
	* gcc.dg/var-expand3.c: Update cases.
---
 gcc/config/rs6000/rs6000.c         | 37 ++++++++++++++++++++++++++-----------
 gcc/testsuite/gcc.dg/pr59643.c     |  3 ---
 gcc/testsuite/gcc.dg/unroll-8.c    |  4 ++++
 gcc/testsuite/gcc.dg/var-expand3.c |  2 ++
 4 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9ed5151..183dceb 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1428,6 +1428,9 @@ static const struct attribute_spec rs6000_attribute_table[] =
 #undef TARGET_VECTORIZE_DESTROY_COST_DATA
 #define TARGET_VECTORIZE_DESTROY_COST_DATA rs6000_destroy_cost_data
 
+#undef TARGET_LOOP_UNROLL_ADJUST
+#define TARGET_LOOP_UNROLL_ADJUST rs6000_loop_unroll_adjust
+
 #undef TARGET_INIT_BUILTINS
 #define TARGET_INIT_BUILTINS rs6000_init_builtins
 #undef TARGET_BUILTIN_DECL
@@ -4540,20 +4543,11 @@ rs6000_option_override_internal (bool global_init_p)
                                  global_options.x_param_values,
                                  global_options_set.x_param_values);
 
-      /* unroll very small loops 2 time if no -funroll-loops.  */
+      /* If funroll-loops is implicitly enabled, do not turn fweb or
+         frename-registers on implicitly.  */
       if (!global_options_set.x_flag_unroll_loops
           && !global_options_set.x_flag_unroll_all_loops)
         {
-          maybe_set_param_value (PARAM_MAX_UNROLL_TIMES, 2,
-                                 global_options.x_param_values,
-                                 global_options_set.x_param_values);
-
-          maybe_set_param_value (PARAM_MAX_UNROLLED_INSNS, 20,
-                                 global_options.x_param_values,
-                                 global_options_set.x_param_values);
-
-          /* If fweb or frename-registers are not specificed in command-line,
-             do not turn them on implicitly.  */
           if (!global_options_set.x_flag_web)
             global_options.x_flag_web = 0;
           if (!global_options_set.x_flag_rename_registers)
@@ -5101,6 +5095,27 @@ rs6000_destroy_cost_data (void *data)
   free (data);
 }
 
+/* This target hook implementation for TARGET_LOOP_UNROLL_ADJUST calculates
+   a new heuristic number of times struct loop *loop should be unrolled.  */
+
+static unsigned
+rs6000_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
+{
+  /* For -O2, we only unroll small loops with a small unroll factor.  */
+  if (optimize == 2)
+    {
+      /* If the loop contains few insns, treat it as a small loop.
+         TODO: Using hard-coded 10 for now.  Continue to refine; for example,
+         if the loop contains only 1-2 insns, we may unroll more times (4).
+         We may also use a PARAM to control the loop size threshold.  */
+      if (loop->ninsns <= 10)
+        return MIN (2, nunroll);
+      else
+        return 0;
+    }
+  return nunroll;
+}
+
 /* Handler for the Mathematical Acceleration Subsystem (mass) interface to a
    library with vectorized intrinsics.  */
 
diff --git a/gcc/testsuite/gcc.dg/pr59643.c b/gcc/testsuite/gcc.dg/pr59643.c
index 4446f6e..de78d60 100644
--- a/gcc/testsuite/gcc.dg/pr59643.c
+++ b/gcc/testsuite/gcc.dg/pr59643.c
@@ -1,9 +1,6 @@
 /* PR tree-optimization/59643 */
 /* { dg-do compile } */
 /* { dg-options "-O3 -fdump-tree-pcom-details" } */
-/* { dg-additional-options "--param max-unrolled-insns=400" { target { powerpc*-*-* } } } */
-/* Implicit threashold of max-unrolled-insn on ppc at O3 is too small for the
-   loop of this case.  */
 
 void
 foo (double *a, double *b, double *c, double d, double e, int n)
diff --git a/gcc/testsuite/gcc.dg/unroll-8.c b/gcc/testsuite/gcc.dg/unroll-8.c
index b16df67..b1d38a6 100644
--- a/gcc/testsuite/gcc.dg/unroll-8.c
+++ b/gcc/testsuite/gcc.dg/unroll-8.c
@@ -4,6 +4,10 @@ struct a {int a[7];};
 int t(struct a *a, int n)
 {
   int i;
+
+  /* Testing unroller message if array size is smaller than unroll factor.  */
+  /* Using pragma unroll to make sure unroll factor is large enough.  */
+  #pragma GCC unroll 8
   for (i=0;i<n;i++)
     a->a[i]++;
 }
diff --git a/gcc/testsuite/gcc.dg/var-expand3.c b/gcc/testsuite/gcc.dg/var-expand3.c
index dce6ec1..d2ca6df 100644
--- a/gcc/testsuite/gcc.dg/var-expand3.c
+++ b/gcc/testsuite/gcc.dg/var-expand3.c
@@ -21,6 +21,8 @@ foo (int n)
 
   vaccum = vzero;
 
+  /* Using pragma unroll to make sure unroll enough times.  */
+  #pragma GCC unroll 8
   for (i = 0; i < n; i++)
     {
       vp1 = vec_ld (i * 16, in1);
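
The decision made by rs6000_loop_unroll_adjust can be tried outside of GCC
with a small standalone model.  This is only a sketch under stated
assumptions, not the code in rs6000.c: struct sketch_loop, SKETCH_MIN,
sketch_unroll_adjust and the explicit optimize_level argument are
hypothetical stand-ins for GCC's struct loop, MIN, the hook itself and the
global optimize variable.

```c
/* Hypothetical stand-in for GCC's struct loop; it carries only the
   one field the hook reads.  */
struct sketch_loop
{
  unsigned ninsns;	/* Number of insns in the loop body.  */
};

/* Hypothetical replacement for GCC's MIN macro.  */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Standalone model of the hook.  NUNROLL is the unroll factor proposed
   by the generic unroller.  OPTIMIZE_LEVEL replaces the global
   `optimize' that the real hook reads.  */
static unsigned
sketch_unroll_adjust (unsigned nunroll, const struct sketch_loop *loop,
		      int optimize_level)
{
  if (optimize_level == 2)
    {
      /* Small loop (at most 10 insns): cap the factor at 2.  */
      if (loop->ninsns <= 10)
	return SKETCH_MIN (2u, nunroll);
      /* Larger loop at -O2: request no unrolling.  */
      return 0;
    }
  /* Any other level: keep the proposed factor.  */
  return nunroll;
}
```

With this model, an 8-insn loop offered a factor of 8 at level 2 comes back
as 2, a 40-insn loop at level 2 comes back as 0, and the same 40-insn loop
at level 3 keeps the offered 8.  As far as I understand the generic
unroller, a result of 0 or 1 means the loop is left alone.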