From patchwork Tue Sep 12 14:41:02 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: will schmidt X-Patchwork-Id: 812877 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-461935-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="GrXXMi/8"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xs6v86PJSz9s3T for ; Wed, 13 Sep 2017 00:41:56 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:from:reply-to:to:cc:content-type:date:mime-version :content-transfer-encoding:message-id; q=dns; s=default; b=IJkwa h3h19avCxLFyEMsM7uo9Pm2qKVM+az3/z4uFFpm64nPe2uZDOsI6iAVfnRIuMslH BaFdxzpW/7icxNvas3noHT4QBeanJR3obz9JNevutNqZsjVnJ+zGt5Xwy2+oTu+J 71tGVbH5TL5noNGxEDFMfSCgxwjDRIkvzhUBOM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:from:reply-to:to:cc:content-type:date:mime-version :content-transfer-encoding:message-id; s=default; bh=gzwS5rrKolG vz/RnXIN4DNLZJY0=; b=GrXXMi/8UJt0/ERF0e21AIaEVgKPRg2Pca6vxuuVVP4 4Pim7brxjevqXVUrEXq36tie6ywGBwUvcjOoXNElTJbhZk1/DTaVbeUxoObXaPxQ 2nQR2L86zcifJhF6+snVUhQOA7ZD7s9KE41IYoTOg/XqkL54DUyzCHIg3AUruFpE = Received: (qmail 128788 invoked by alias); 12 Sep 2017 14:41:48 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 128774 invoked by uid 89); 12 Sep 2017 14:41:47 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.5 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=H*MI:155 X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0a-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.156.1) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 12 Sep 2017 14:41:45 +0000 Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v8CEfUwa104947 for ; Tue, 12 Sep 2017 10:41:44 -0400 Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.152]) by mx0a-001b2d01.pphosted.com with ESMTP id 2cxe6fbumr-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Tue, 12 Sep 2017 10:41:42 -0400 Received: from localhost by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 12 Sep 2017 08:41:06 -0600 Received: from b03cxnp07029.gho.boulder.ibm.com (9.17.130.16) by e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 12 Sep 2017 08:41:03 -0600 Received: from b03ledav006.gho.boulder.ibm.com (b03ledav006.gho.boulder.ibm.com [9.17.130.237]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v8CEf3dm10027292; Tue, 12 Sep 2017 07:41:03 -0700 Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id EFD80C6042; Tue, 12 Sep 2017 08:41:02 -0600 (MDT) Received: from [9.10.86.107] (unknown [9.10.86.107]) by b03ledav006.gho.boulder.ibm.com (Postfix) with ESMTP id 8EA17C604A; Tue, 12 Sep 2017 08:41:02 -0600 (MDT) Subject: [PATCH, rs6000] Folding of vector loads in GIMPLE From: Will Schmidt Reply-To: will_schmidt@vnet.ibm.com To: GCC Patches Cc: Segher Boessenkool , Richard Biener , Bill Schmidt , David Edelsohn Date: Tue, 12 Sep 2017 09:41:02 -0500 Mime-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 17091214-0016-0000-0000-000007814D9C X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007711; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000227; SDB=6.00915977; UDB=6.00459924; IPR=6.00696208; BA=6.00005587; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017126; XFM=3.00000015; UTC=2017-09-12 14:41:05 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17091214-0017-0000-0000-00003B705432 Message-Id: <1505227262.14827.155.camel@brimstone.rchland.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-09-12_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1709120205 X-IsSubscribed: yes Hi [PATCH, rs6000] Folding of vector loads in GIMPLE Folding of vector loads in GIMPLE. - Add code to handle gimple folding for the vec_ld builtins. - Remove the now obsoleted folding code for vec_ld from rs6000-c.c. Surrounding comments have been adjusted slightly so they continue to read OK for the vec_st code that remains. The resulting code is specifically verified by the powerpc/fold-vec-ld-*.c tests which have been posted separately. (a few minutes ago). Regtest successfully completed on power6 and newer. (p6,p7,p8le,p8be,p9). OK for trunk? Thanks, -Will [gcc] 2017-09-12 Will Schmidt * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling for early folding of vector loads (ALTIVEC_BUILTIN_LVX_*). * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Remove obsoleted code for handling ALTIVEC_BUILTIN_VEC_LD. diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index 897306c..73e14d9 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c @@ -6459,92 +6459,19 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl, convert (TREE_TYPE (stmt), arg0)); stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl); return stmt; } - /* Expand vec_ld into an expression that masks the address and - performs the load. We need to expand this early to allow + /* Expand vec_st into an expression that masks the address and + performs the store. We need to expand this early to allow the best aliasing, as by the time we get into RTL we no longer are able to honor __restrict__, for example. We may want to consider this for all memory access built-ins. When -maltivec=be is specified, or the wrong number of arguments is provided, simply punt to existing built-in processing. */ - if (fcode == ALTIVEC_BUILTIN_VEC_LD - && (BYTES_BIG_ENDIAN || !VECTOR_ELT_ORDER_BIG) - && nargs == 2) - { - tree arg0 = (*arglist)[0]; - tree arg1 = (*arglist)[1]; - - /* Strip qualifiers like "const" from the pointer arg. */ - tree arg1_type = TREE_TYPE (arg1); - if (!POINTER_TYPE_P (arg1_type) && TREE_CODE (arg1_type) != ARRAY_TYPE) - goto bad; - - tree inner_type = TREE_TYPE (arg1_type); - if (TYPE_QUALS (TREE_TYPE (arg1_type)) != 0) - { - arg1_type = build_pointer_type (build_qualified_type (inner_type, - 0)); - arg1 = fold_convert (arg1_type, arg1); - } - - /* Construct the masked address. Let existing error handling take - over if we don't have a constant offset. */ - arg0 = fold (arg0); - - if (TREE_CODE (arg0) == INTEGER_CST) - { - if (!ptrofftype_p (TREE_TYPE (arg0))) - arg0 = build1 (NOP_EXPR, sizetype, arg0); - - tree arg1_type = TREE_TYPE (arg1); - if (TREE_CODE (arg1_type) == ARRAY_TYPE) - { - arg1_type = TYPE_POINTER_TO (TREE_TYPE (arg1_type)); - tree const0 = build_int_cstu (sizetype, 0); - tree arg1_elt0 = build_array_ref (loc, arg1, const0); - arg1 = build1 (ADDR_EXPR, arg1_type, arg1_elt0); - } - - tree addr = fold_build2_loc (loc, POINTER_PLUS_EXPR, arg1_type, - arg1, arg0); - tree aligned = fold_build2_loc (loc, BIT_AND_EXPR, arg1_type, addr, - build_int_cst (arg1_type, -16)); - - /* Find the built-in to get the return type so we can convert - the result properly (or fall back to default handling if the - arguments aren't compatible). */ - for (desc = altivec_overloaded_builtins; - desc->code && desc->code != fcode; desc++) - continue; - - for (; desc->code == fcode; desc++) - if (rs6000_builtin_type_compatible (TREE_TYPE (arg0), desc->op1) - && (rs6000_builtin_type_compatible (TREE_TYPE (arg1), - desc->op2))) - { - tree ret_type = rs6000_builtin_type (desc->ret_type); - if (TYPE_MODE (ret_type) == V2DImode) - /* Type-based aliasing analysis thinks vector long - and vector long long are different and will put them - in distinct alias classes. Force our return type - to be a may-alias type to avoid this. */ - ret_type - = build_pointer_type_for_mode (ret_type, Pmode, - true/*can_alias_all*/); - else - ret_type = build_pointer_type (ret_type); - aligned = build1 (NOP_EXPR, ret_type, aligned); - tree ret_val = build_indirect_ref (loc, aligned, RO_NULL); - return ret_val; - } - } - } - /* Similarly for stvx. */ if (fcode == ALTIVEC_BUILTIN_VEC_ST && (BYTES_BIG_ENDIAN || !VECTOR_ELT_ORDER_BIG) && nargs == 3) { tree arg0 = (*arglist)[0]; diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index cf744d8..5b14789 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -16473,10 +16473,65 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) res = gimple_build (&stmts, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), res); gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT); update_call_from_tree (gsi, res); return true; } + /* Vector loads. */ + case ALTIVEC_BUILTIN_LVX_V16QI: + case ALTIVEC_BUILTIN_LVX_V8HI: + case ALTIVEC_BUILTIN_LVX_V4SI: + case ALTIVEC_BUILTIN_LVX_V4SF: + case ALTIVEC_BUILTIN_LVX_V2DI: + case ALTIVEC_BUILTIN_LVX_V2DF: + { + gimple *g; + arg0 = gimple_call_arg (stmt, 0); // offset + arg1 = gimple_call_arg (stmt, 1); // address + + /* Limit folding of loads to LE targets. */ + if (BYTES_BIG_ENDIAN || VECTOR_ELT_ORDER_BIG) + return false; + + lhs = gimple_call_lhs (stmt); + location_t loc = gimple_location (stmt); + + tree arg1_type = TREE_TYPE (arg1); + tree lhs_type = TREE_TYPE (lhs); + + /* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'. Create + the tree using the value from arg0. The resulting type will match + the type of arg1. */ + tree temp_offset = create_tmp_reg_or_ssa_name (sizetype); + g = gimple_build_assign (temp_offset, NOP_EXPR, arg0); + gimple_set_location (g, loc); + gsi_insert_before (gsi, g, GSI_SAME_STMT); + tree temp_addr = create_tmp_reg_or_ssa_name (arg1_type); + g = gimple_build_assign (temp_addr, POINTER_PLUS_EXPR, arg1, + temp_offset); + gimple_set_location (g, loc); + gsi_insert_before (gsi, g, GSI_SAME_STMT); + + /* Mask off any lower bits from the address. */ + tree alignment_mask = build_int_cst (arg1_type, -16); + tree aligned_addr = create_tmp_reg_or_ssa_name (arg1_type); + g = gimple_build_assign (aligned_addr, BIT_AND_EXPR, + temp_addr, alignment_mask); + gimple_set_location (g, loc); + gsi_insert_before (gsi, g, GSI_SAME_STMT); + + /* Use the build2 helper to set up the mem_ref. The MEM_REF could also + take an offset, but since we've already incorporated the offset + above, here we just pass in a zero. */ + g = gimple_build_assign (lhs, build2 (MEM_REF, lhs_type, aligned_addr, + build_int_cst (arg1_type, 0))); + gimple_set_location (g, loc); + gsi_replace (gsi, g, true); + + return true; + + } + default: if (TARGET_DEBUG_BUILTIN) fprintf (stderr, "gimple builtin intrinsic not matched:%d %s %s\n", fn_code, fn_name1, fn_name2); break;