From patchwork Sun Nov 10 15:27:00 2013
X-Patchwork-Submitter: Marc Glisse
X-Patchwork-Id: 290056
Date: Sun, 10 Nov 2013 16:27:00 +0100 (CET)
From: Marc Glisse
To: gcc-patches@gcc.gnu.org
Subject: [RFC] replace malloc with a decl on the stack

Hello,

I am posting this patch to get some feedback on the approach. The goal is
to replace malloc+free with a stack allocation (a decl, actually) when the
size is a small constant.

For testing, I hijacked the "leaf" attribute, but it isn't right, so I'll
remove it from the list (not sure what I'll do about the testcases then).
What I'd want instead is a "returns" attribute that means the function
will return (either by "return" or by an exception), as opposed to having
an infinite loop, calling exit or longjmp, etc. (sched-deps.c has a
related call_may_noreturn_p). The situation I am trying to avoid is:

  p = malloc (12);
  f (p);
  free (p);

where f contains somewhere:

  free (p);
  exit (42);

(or longjmp, or something else that takes the regular call to free out of
the execution flow).

The patch passes most of the testsuite, but breaks a couple of
__builtin_object_size tests:

  struct A { char a[10]; int b; char c[10]; };
  int main () {
    struct A *p = malloc (2 * sizeof (struct A));
    assert (__builtin_object_size (&p->a, 1) == sizeof (p->a));
    free (p);
  }

__builtin_object_size now returns 56 instead of 10. I am not sure what to
do about that.
The size above which the malloc->stack transformation is not applied
should depend on a parameter; I don't know if it should get its own or
reuse an existing one. In any case, I'd like to avoid ending up with a
ridiculously low threshold (my programs use GMP, which internally uses
alloca for sizes up to 65536 bytes, even in recursive functions that have
a dozen allocations, so I don't want gcc to tell me that 50 bytes are too
much).

A program with a double free may, with this patch, end up crashing on the
first free instead of the second, but that's an invalid program anyway.

I don't know if tree-ssa-forwprop is a good place for it, but it was
convenient for a prototype. I like the idea of running it several times:
replacing malloc with a decl early can help other optimizations, and
other optimizations can make this one possible later.

The walk could be a bit expensive, but we only do it if we detected a
malloc of a small constant and at least one matching free. I guess I
should mark the malloc somehow, to avoid performing the walk twice when
there are several (but not enough) matching frees.

stack_vec is nice; it would be convenient if bitmaps also had a version
with a destructor, so we don't need to deallocate them explicitly.
Index: gcc/testsuite/g++.dg/tree-ssa/heapstack-1.C
===================================================================
--- gcc/testsuite/g++.dg/tree-ssa/heapstack-1.C	(revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/heapstack-1.C	(working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+struct A {
+  void*p;
+  A(void*q) : p(q) {}
+  ~A(){ __builtin_free(p); }
+};
+void f(void*)__attribute__((__leaf__));
+void h(void*)__attribute__((__leaf__,__nothrow__));
+void g(){
+  void*p=__builtin_malloc(12);
+  A a(p);
+  f(p);
+}
+
+void i(){
+  void*p=__builtin_malloc(12);
+  h(p);
+  __builtin_free(p);
+}
+
+/* { dg-final { scan-tree-dump-not "malloc" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: gcc/testsuite/g++.dg/tree-ssa/heapstack-1.C
___________________________________________________________________
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property

Index: gcc/testsuite/g++.dg/tree-ssa/heapstack-2.C
===================================================================
--- gcc/testsuite/g++.dg/tree-ssa/heapstack-2.C	(revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/heapstack-2.C	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void f(void*)__attribute__((__leaf__));
+void g(){
+  void*p=__builtin_malloc(12);
+  f(p);
+  __builtin_free(p);
+}
+
+/* { dg-final { scan-tree-dump-times "malloc" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: gcc/testsuite/g++.dg/tree-ssa/heapstack-2.C
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property

Index: gcc/testsuite/gcc.dg/tree-ssa/heapstack-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/heapstack-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/heapstack-1.c	(working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void f(void*)__attribute__((__leaf__));
+void g(int m,int n){
+  int i;
+  void*p=__builtin_malloc(12);
+  switch(n){
+    case 1:
+      for (i=0; i

Index: gcc/tree-ssa-forwprop.c
===================================================================

+static bool
+malloc_safe_on_stack (gimple stmt, vec<gimple> *list_of_frees)
+{
+  tree var = gimple_call_lhs (stmt);
+  basic_block bb = gimple_bb (stmt);
+  stack_vec bb_to_visit;
+  bitmap bb_visited = BITMAP_ALLOC (NULL);
+  bitmap_set_bit (bb_visited, bb->index);
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+
+next_stmt:
+  gsi_next_nondebug (&gsi);
+
+handle_stmt:
+  if (gsi_end_p (gsi))
+    {
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	{
+	  bb_to_visit.safe_push (e->dest);
+	}
+      goto next_bb;
+    }
+  stmt = gsi_stmt (gsi);
+  if (stmt_can_throw_external (stmt))
+    /* We could give up only if VAR has escaped (same for return), but that
+       means there is a memory leak, so don't bother.  */
+    goto unsafe;
+
+  switch (gimple_code (stmt))
+    // TODO: GIMPLE_ASM, EH related gimples?
+    {
+    /* We leave the function without calling free.  */
+    case GIMPLE_RETURN:
+      goto unsafe;
+
+    /* Statements that are irrelevant.  */
+    case GIMPLE_ASSIGN:
+    case GIMPLE_LABEL:
+    case GIMPLE_NOP:
+    /* Last stmt of the bb, handled by looking at the outgoing edges.  */
+    case GIMPLE_GOTO:
+    case GIMPLE_COND:
+      // TODO: special case: if (VAR == 0) ...
+    case GIMPLE_SWITCH:
+      goto next_stmt;
+
+    case GIMPLE_CALL:
+      {
+	// TODO: __builtin_(abort|trap|exit|unreachable)
+	// should be safe as well.
+	if (gimple_call_builtin_p (stmt, BUILT_IN_FREE)
+	    && gimple_call_arg (stmt, 0) == var)
+	  {
+	    /* Nothing could throw an exception earlier in the block,
+	       so we don't need to check the EH edges.  */
+	    list_of_frees->safe_push (stmt);
+	    goto next_bb;
+	  }
+	int flags = gimple_call_flags (stmt);
+	// FIXME: leaf is actually not safe, we need some new ECF_* flags.
+	if (flags & (ECF_CONST|ECF_PURE|ECF_LEAF|ECF_NOVOPS))
+	  goto next_stmt;
+	goto unsafe;
+      }
+    default:
+      goto unsafe;
+    }
+
+next_bb:
+  if (bb_to_visit.is_empty ())
+    goto safe;
+  bb = bb_to_visit.last ();
+  bb_to_visit.pop ();
+  if (!bitmap_set_bit (bb_visited, bb->index))
+    goto next_bb;
+  gsi = gsi_start_bb (bb);
+  goto handle_stmt;
+
+unsafe:
+  BITMAP_FREE (bb_visited);
+  return false;
+safe:
+  BITMAP_FREE (bb_visited);
+  return true;
+}
+
 /* *GSI_P is a GIMPLE_CALL to a builtin function.
    Optimize
    memcpy (p, "abcd", 4);
    memset (p + 4, ' ', 3);
    into
    memcpy (p, "abcd   ", 7);
    call if the latter can be stored by pieces during expansion.  */

 static bool
 simplify_builtin_call (gimple_stmt_iterator *gsi_p, tree callee2)
@@ -1681,20 +1775,64 @@ simplify_builtin_call (gimple_stmt_itera
 	  gimple_call_set_arg (stmt2, 2,
 			       build_int_cst (TREE_TYPE (len2), src_len));
 	  unlink_stmt_vdef (stmt1);
 	  gsi_remove (&gsi, true);
 	  release_defs (stmt1);
 	  update_stmt (stmt2);
 	  return false;
 	}
       }
       break;
+
+    case BUILT_IN_FREE:
+      {
+	tree size;
+	tree ptr = gimple_call_arg (stmt2, 0);
+	if (TREE_CODE (ptr) != SSA_NAME)
+	  return false;
+	stmt1 = SSA_NAME_DEF_STMT (ptr);
+	/* TODO: handle calloc.  */
+	if (gimple_call_builtin_p (stmt1, BUILT_IN_MALLOC)
+	    && host_integerp ((size = gimple_call_arg (stmt1, 0)), 1)
+	    /* param_value(...) / 10?  Some new parameter?  */
+	    && TREE_INT_CST_LOW (size)
+	       <= (unsigned HOST_WIDE_INT) PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
+	  {
+	    gcc_checking_assert (gimple_call_lhs (stmt1) == ptr);
+	    stack_vec list_of_frees;
+	    if (!malloc_safe_on_stack (stmt1, &list_of_frees))
+	      return false;
+	    /* Declare array.  */
+	    tree elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
+	    tree array_type = build_array_type_nelts (elem_type,
+						      TREE_INT_CST_LOW (size));
+	    tree var = create_tmp_var (array_type, NULL);
+	    tree p = fold_convert (TREE_TYPE (ptr), build_fold_addr_expr (var));
+	    /* Replace free with a clobber.  */
+	    int i;
+	    gimple free_stmt;
+	    FOR_EACH_VEC_ELT (list_of_frees, i, free_stmt)
+	      {
+		tree clobber = build_constructor (array_type, NULL);
+		TREE_THIS_VOLATILE (clobber) = 1;
+		gimple clobber_stmt = gimple_build_assign (var, clobber);
+		gimple_stmt_iterator gsi_free = gsi_for_stmt (free_stmt);
+		gsi_replace (&gsi_free, clobber_stmt, false);
+	      }
+	    /* Replace malloc with the address of the variable.  */
+	    gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt1);
+	    update_call_from_tree (&gsi1, p);
+	    update_stmt (gsi_stmt (gsi1));
+	    return true;
+	  }
+
+      }
     default:
       break;
     }
   return false;
 }

 /* Checks if expression has type of one-bit precision, or is a known
    truth-valued expression.  */
 static bool
 truth_valued_ssa_name (tree name)