From patchwork Tue Oct 21 21:59:46 2014
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 401841
Message-ID: <5446D752.9010405@codesourcery.com>
Date: Tue, 21 Oct 2014 23:59:46 +0200
From: Bernd Schmidt
To: GCC Patches, gfortran
Subject: Avoid calls to realloc for nvptx

This is a followup patch for the nvptx port. malloc and free are magically
provided by the ptx environment, but realloc is missing, and it's nontrivial
to provide an implementation for it. The Fortran frontend likes to generate
calls to realloc, but in one case it seems like we can compute the old size
and call a function that does malloc/memcpy/free instead.

It seems desirable to continue calling plain realloc on most targets, but
the decision about which function to use must happen after the LTO stage for
it to work when ptx is used as an offload target. The following patch
provides a new builtin function which gets expanded to one of the two
alternatives based on a target hook (TARGET_AVOID_REALLOC), and modifies the
Fortran frontend to call the new builtin instead of using realloc directly
when the old size is known.

Bootstrapped and tested on x86_64-linux. Two oddities: the last test run had
some plugin tests failing due to a version check; this went away when I
recompiled gcc/plugin.c. It doesn't seem like the sort of thing that would
be caused by this patch, and I didn't see it in any of the earlier test runs.
The other oddity is that the order of arguments when realloc is defined by
f95-lang seems reversed. However, fixing this causes infinite loops in the
fre1 pass for some Fortran tests, so I've left it as-is for now.

Ok once the nvptx port is in?


Bernd

gcc/fortran/
	* trans-array.c (gfc_grow_array): Calculate old array size and pass
	it to gfc_call_realloc.
	* trans.c (gfc_call_realloc): Accept new arg old_size.  All callers
	changed.  If it is not NULL_TREE, use the realloc_known_size builtin.
	* trans.h (gfc_call_realloc): Adjust declaration.
	* f95-lang.c (gfc_init_builtin_functions): Define
	realloc_known_size builtin.

gcc/
	* builtins.c (expand_builtin): Handle BUILT_IN_REALLOC_KNOWN_SIZE.
	* builtins.def (BUILT_IN_REALLOC_KNOWN_SIZE): New.
	* builtin-types.def (BT_FN_PTR_PTR_SIZE_SIZE): New.
	* target.def (avoid_realloc): New data hook.
	* doc/tm.texi.in (TARGET_AVOID_REALLOC): Add.
	* doc/tm.texi: Regenerate.
	* config/nvptx/nvptx.c (TARGET_AVOID_REALLOC): Define to true.

libgcc/
	* config/nvptx/t-nvptx (LIB2ADD): New.
	* realloc_known_size.c: New file.

Index: gcc/fortran/trans-array.c
===================================================================
--- gcc/fortran/trans-array.c.orig
+++ gcc/fortran/trans-array.c
@@ -1233,7 +1233,7 @@ gfc_get_iteration_count (tree start, tre
 static void
 gfc_grow_array (stmtblock_t * pblock, tree desc, tree extra)
 {
-  tree arg0, arg1;
+  tree arg0, arg1, arg2;
   tree tmp;
   tree size;
   tree ubound;
@@ -1242,6 +1242,14 @@ gfc_grow_array (stmtblock_t * pblock, tr
     return;
 
   ubound = gfc_conv_descriptor_ubound_get (desc, gfc_rank_cst[0]);
+  size = TYPE_SIZE_UNIT (gfc_get_element_type (TREE_TYPE (desc)));
+
+  /* Calculate the old array size.  */
+  tmp = fold_build2_loc (input_location, PLUS_EXPR, gfc_array_index_type,
+                         ubound, gfc_index_one_node);
+  arg2 = fold_build2_loc (input_location, MULT_EXPR, size_type_node,
+                          fold_convert (size_type_node, tmp),
+                          fold_convert (size_type_node, size));
 
   /* Add EXTRA to the upper bound.  */
   tmp = fold_build2_loc (input_location, PLUS_EXPR, gfc_array_index_type,
@@ -1252,7 +1260,6 @@ gfc_grow_array (stmtblock_t * pblock, tr
   arg0 = gfc_conv_descriptor_data_get (desc);
 
   /* Calculate the new array size.  */
-  size = TYPE_SIZE_UNIT (gfc_get_element_type (TREE_TYPE (desc)));
   tmp = fold_build2_loc (input_location, PLUS_EXPR, gfc_array_index_type,
                          ubound, gfc_index_one_node);
   arg1 = fold_build2_loc (input_location, MULT_EXPR, size_type_node,
@@ -1260,7 +1267,7 @@ gfc_grow_array (stmtblock_t * pblock, tr
                           fold_convert (size_type_node, size));
 
   /* Call the realloc() function.  */
-  tmp = gfc_call_realloc (pblock, arg0, arg1);
+  tmp = gfc_call_realloc (pblock, arg0, arg1, arg2);
   gfc_conv_descriptor_data_set (pblock, desc, tmp);
 }
 
Index: gcc/fortran/trans-openmp.c
===================================================================
--- gcc/fortran/trans-openmp.c.orig
+++ gcc/fortran/trans-openmp.c
@@ -761,7 +761,7 @@ gfc_omp_clause_assign_op (tree clause, t
 
       gfc_init_block (&inner_block);
       gfc_add_modify (&inner_block, ptr,
-                      gfc_call_realloc (&inner_block, ptr, size));
+                      gfc_call_realloc (&inner_block, ptr, size, NULL_TREE));
       else_b = gfc_finish_block (&inner_block);
 
       gfc_add_expr_to_block (&cond_block2,
Index: gcc/fortran/trans.c
===================================================================
--- gcc/fortran/trans.c.orig
+++ gcc/fortran/trans.c
@@ -1459,7 +1459,7 @@ internal_realloc (void *mem, size_t size
   return res;
 }  */
 tree
-gfc_call_realloc (stmtblock_t * block, tree mem, tree size)
+gfc_call_realloc (stmtblock_t * block, tree mem, tree size, tree old_size)
 {
   tree msg, res, nonzero, null_result, tmp;
   tree type = TREE_TYPE (mem);
@@ -1473,9 +1473,19 @@ gfc_call_realloc (stmtblock_t * block, t
   res = gfc_create_var (type, NULL);
 
   /* Call realloc and check the result.  */
-  tmp = build_call_expr_loc (input_location,
-                             builtin_decl_explicit (BUILT_IN_REALLOC), 2,
-                             fold_convert (pvoid_type_node, mem), size);
+  if (old_size == NULL_TREE)
+    tmp = build_call_expr_loc (input_location,
+                               builtin_decl_explicit (BUILT_IN_REALLOC), 2,
+                               fold_convert (pvoid_type_node, mem), size);
+  else
+    {
+      if (TREE_TYPE (old_size) != size_type_node)
+        old_size = fold_convert (size_type_node, old_size);
+      tree decl = builtin_decl_explicit (BUILT_IN_REALLOC_KNOWN_SIZE);
+      tmp = build_call_expr_loc (input_location, decl, 3,
+                                 fold_convert (pvoid_type_node, mem), size,
+                                 old_size);
+    }
   gfc_add_modify (block, res, fold_convert (type, tmp));
   null_result = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node, res,
                                  build_int_cst (pvoid_type_node, 0));
Index: gcc/fortran/trans.h
===================================================================
--- gcc/fortran/trans.h.orig
+++ gcc/fortran/trans.h
@@ -643,8 +643,8 @@ tree gfc_deallocate_with_status (tree, t
                                  gfc_expr *, bool);
 tree gfc_deallocate_scalar_with_status (tree, tree, bool, gfc_expr*, gfc_typespec);
 
-/* Generate code to call realloc().  */
-tree gfc_call_realloc (stmtblock_t *, tree, tree);
+/* Generate code to call realloc or __realloc_known_size.  */
+tree gfc_call_realloc (stmtblock_t *, tree, tree, tree);
 
 /* Generate code for an assignment, includes scalarization.  */
 tree gfc_trans_assignment (gfc_expr *, gfc_expr *, bool, bool);
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c.orig
+++ gcc/builtins.c
@@ -6168,6 +6168,16 @@ expand_builtin (tree exp, rtx target, rt
         return target;
       break;
 
+    case BUILT_IN_REALLOC_KNOWN_SIZE:
+      if (!targetm.avoid_realloc)
+        {
+          tree fn = builtin_decl_implicit (BUILT_IN_REALLOC);
+          exp = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 2,
+                                       CALL_EXPR_ARG (exp, 0),
+                                       CALL_EXPR_ARG (exp, 1));
+        }
+      break;
+
     case BUILT_IN_SETJMP:
       /* This should have been lowered to the builtins below.  */
       gcc_unreachable ();
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def.orig
+++ gcc/builtins.def
@@ -766,6 +766,7 @@ DEF_GCC_BUILTIN (BUILT_IN_POPCOUN
 DEF_EXT_LIB_BUILTIN (BUILT_IN_POSIX_MEMALIGN, "posix_memalign", BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF)
 DEF_GCC_BUILTIN (BUILT_IN_PREFETCH, "prefetch", BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST)
 DEF_LIB_BUILTIN (BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN (BUILT_IN_REALLOC_KNOWN_SIZE, "__realloc_known_size", BT_FN_PTR_PTR_SIZE_SIZE, ATTR_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN (BUILT_IN_RETURN, "return", BT_FN_VOID_PTR, ATTR_NORETURN_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN (BUILT_IN_RETURN_ADDRESS, "return_address", BT_FN_PTR_UINT, ATTR_LEAF_LIST)
 DEF_GCC_BUILTIN (BUILT_IN_SAVEREGS, "saveregs", BT_FN_PTR_VAR, ATTR_NULL)
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def.orig
+++ gcc/builtin-types.def
@@ -356,6 +356,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_INT_S
                      BT_PTR, BT_PTR, BT_INT, BT_SIZE)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_SIZE,
                      BT_VOID, BT_PTR, BT_INT, BT_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_SIZE_SIZE,
+                     BT_PTR, BT_PTR, BT_SIZE, BT_SIZE)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_INT,
                      BT_VOID, BT_PTR, BT_INT, BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_CONST_PTR_PTR_SIZE,
Index: gcc/config/nvptx/nvptx.c
===================================================================
--- gcc/config/nvptx/nvptx.c.orig
+++ gcc/config/nvptx/nvptx.c
@@ -2019,6 +2019,9 @@ nvptx_file_end (void)
 #undef TARGET_VECTOR_ALIGNMENT
 #define TARGET_VECTOR_ALIGNMENT nvptx_vector_alignment
 
+#undef TARGET_AVOID_REALLOC
+#define TARGET_AVOID_REALLOC true
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-nvptx.h"
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi.orig
+++ gcc/doc/tm.texi
@@ -5154,6 +5154,12 @@ is set to true, the @file{tm.h} file mus
 @code{#define LIBGCC2_GNU_PREFIX}.
 @end deftypevr
 
+@deftypevr {Target Hook} bool TARGET_AVOID_REALLOC
+If true, the compiler tries to replace calls to @code{realloc}
+with an internal library function implemented with @code{malloc},
+@code{free} and @code{memcpy}.  The default is false.
+@end deftypevr
+
 @defmac FLOAT_LIB_COMPARE_RETURNS_BOOL (@var{mode}, @var{comparison})
 This macro should return @code{true} if the library routine that
 implements the floating point comparison operator @var{comparison} in
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in.orig
+++ gcc/doc/tm.texi.in
@@ -3974,6 +3974,8 @@ are ABI-mandated names that the compiler
 
 @hook TARGET_LIBFUNC_GNU_PREFIX
 
+@hook TARGET_AVOID_REALLOC
+
 @defmac FLOAT_LIB_COMPARE_RETURNS_BOOL (@var{mode}, @var{comparison})
 This macro should return @code{true} if the library routine that
 implements the floating point comparison operator @var{comparison} in
Index: gcc/target.def
===================================================================
--- gcc/target.def.orig
+++ gcc/target.def
@@ -2246,6 +2246,14 @@ is set to true, the @file{tm.h} file mus
 @code{#define LIBGCC2_GNU_PREFIX}.",
   bool, false)
 
+/* Replace realloc calls with a malloc/memcpy/free based library function.  */
+DEFHOOKPOD
+(avoid_realloc,
+ "If true, the compiler tries to replace calls to @code{realloc}\n\
+with an internal library function implemented with @code{malloc},\n\
+@code{free} and @code{memcpy}.  The default is false.",
+  bool, false)
+
 /* Given a decl, a section name, and whether the decl initializer
    has relocs, choose attributes for the section.  */
 /* ??? Should be merged with SELECT_SECTION and UNIQUE_SECTION.  */
Index: libgcc/config/nvptx/t-nvptx
===================================================================
--- libgcc/config/nvptx/t-nvptx.orig
+++ libgcc/config/nvptx/t-nvptx
@@ -1,5 +1,6 @@
 LIB2ADDEH=
 LIB2FUNCS_EXCLUDE=__main
+LIB2ADD = $(srcdir)/realloc_known_size.c
 
 crt0.o: $(srcdir)/config/nvptx/crt0.s
 	cp $< $@
Index: libgcc/realloc_known_size.c
===================================================================
--- /dev/null
+++ libgcc/realloc_known_size.c
@@ -0,0 +1,38 @@
+/* Avoid calls to realloc if the size is known.  */
+/* Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <string.h>
+
+void *
+__realloc_known_size (void *ptr, size_t newsize, size_t oldsize)
+{
+  void *newp = malloc (newsize);
+  if (ptr != 0)
+    {
+      memcpy (newp, ptr, oldsize);
+      free (ptr);
+    }
+  return newp;
+}
Index: gcc/fortran/f95-lang.c
===================================================================
--- gcc/fortran/f95-lang.c.orig
+++ gcc/fortran/f95-lang.c
@@ -985,6 +985,13 @@
   gfc_define_builtin ("__builtin_realloc", ftype, BUILT_IN_REALLOC,
                       "realloc", ATTR_NOTHROW_LEAF_LIST);
 
+  ftype = build_function_type_list (pvoid_type_node,
+                                    pvoid_type_node, size_type_node,
+                                    size_type_node, NULL_TREE);
+  gfc_define_builtin ("__builtin_realloc_known_size", ftype,
+                      BUILT_IN_REALLOC_KNOWN_SIZE,
+                      "__realloc_known_size", ATTR_NOTHROW_LEAF_LIST);
+
   ftype = build_function_type_list (integer_type_node,
                                     void_type_node, NULL_TREE);
   gfc_define_builtin ("__builtin_isnan", ftype, BUILT_IN_ISNAN,
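
For reference, the malloc/memcpy/free emulation that the libgcc helper relies
on can be tried in plain C outside of GCC. The following is only a minimal
sketch and is not part of the patch: the name realloc_known_size_sketch is
hypothetical, and the check of the malloc result and the clamping of the copy
length to the smaller of the two sizes are assumptions about what a caller
might want, not something the patch specifies.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's helper; not the patch's code.
   Emulates realloc with malloc/memcpy/free when the old size is known.  */
void *
realloc_known_size_sketch (void *ptr, size_t newsize, size_t oldsize)
{
  void *newp = malloc (newsize);

  /* Assumption: on allocation failure leave the old block untouched, as
     realloc does, so the caller can still report the error.  */
  if (newp == NULL)
    return NULL;

  if (ptr != NULL)
    {
      /* Copy only as many bytes as fit in both the old and new block.  */
      memcpy (newp, ptr, oldsize < newsize ? oldsize : newsize);
      free (ptr);
    }
  return newp;
}
```

A caller that grows an array, as gfc_grow_array does, would pass the current
byte size as the third argument and the enlarged byte size as the second.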