Message ID | 4D84FD2C.60607@netcologne.de |
---|---|
State | New |
Headers | show |
On Saturday 19 March 2011 19:59:56 Thomas Koenig wrote: > Am 19.03.2011 00:23, schrieb Tobias Burnus: > > I have not followed the discussion nor have I fully read the patch, but > > what's the reason for allowing ELEMENTAL functions? > > Here's an updated version of the patch, which removes the elemental > functions as well. I have also added an option which allows full access > to all function call eliminations, so if any user wants it, it is there. > (I know I will use it :-) This option is not enabled by any > optimization option. > > Regression-tested. Before committing, I'll check on the status of the > gfc_free removal patch, and re-test. Also tested with "make dvi" and > "make info". > > OK for trunk? Not yet, comment here, nits below. I'm a bit worried about the patch possibly conflicting with other optimizations at code generation time. I'm thinking especially about cases where we manage not to create a temporary without the patch: - Inline intrinsics have only been for scalar cases so far, so it should not matter for now. - Array elementals: here it's hard to tell which one is faster; save function calls to a temporary and use the temporary or do multiple function calls every time (but without temporary). - Transpose optimization: here the transpose call is changed into a direct array access, so your patch will definitely make things worse. Even if transpose calls are used multiple times as actual argument for example, better create multiple descriptors than copy the whole lot to a temp. The common function elimination should be disabled (IMO) in this case. Mikael. > Index: frontend-passes.c > =================================================================== > --- frontend-passes.c (Revision 170960) > +++ frontend-passes.c (Arbeitskopie) > @@ -106,11 +128,237 @@ optimize_expr (gfc_expr **e, int *walk_subtrees AT > return 0; > } > > +/* Compare two functions for equality. We could use gfc_dep_compare_expr > + except that we also consider impure functions equal, because anybody > + changing the return value of the function within an expression would > + violate the Fortran standard. */ Given how much the codes match, it looks like one can move the relevant gfc_dep_compare_expr code to a function, use that function, driving the small differences with a flag, no? > + > +static bool > +compare_functions (gfc_expr **ep1, gfc_expr **ep2) > +{ > + gfc_expr *e1, *e2; > + > + e1 = *ep1; > + e2 = *ep2; > + > + if (e1->expr_type != EXPR_FUNCTION || e2->expr_type != EXPR_FUNCTION) > + return false; > + > + if ((e1->value.function.esym && e2->value.function.esym > + && e1->value.function.esym == e2->value.function.esym) > + || (e1->value.function.isym && e2->value.function.isym > + && e1->value.function.isym == e2->value.function.isym)) > + { > + gfc_actual_arglist *args1, *args2; > + > + args1 = e1->value.function.actual; > + args2 = e2->value.function.actual; > + > + /* Compare the argument lists for equality. */ > + while (args1 && args2) > + { > + /* Bitwise xor, since C has no non-bitwise xor operator. */ > + if ((args1->expr == NULL) ^ (args2->expr == NULL)) > + return false; > + > + if (args1->expr != NULL && args2->expr != NULL > + && gfc_dep_compare_expr (args1->expr, args2->expr) != 0) > + return false; > + > + args1 = args1->next; > + args2 = args2->next; > + } > + return args1 == NULL && args2 == NULL; > + } > + else > + return false; > + > +} > + > +/* Callback function for gfc_expr_walker, called from cfe_expr_0. Put all > + eligible function expressions into expr_array. We can't do allocatable > + functions. */ > + > +static int > +cfe_expr (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED, > + void *data ATTRIBUTE_UNUSED) Please use more descriptive names. There are 3 cfe* functions and only one has a comment explaining what cfe means. Something like "register_function_expr" seems to match what this function does, but you may propose something better ;-) > +{ > + if ((*e)->expr_type != EXPR_FUNCTION) > + return 0; > + > + /* We don't do character functions (yet). */ > + if ((*e)->ts.type == BT_CHARACTER) > + return 0; > + > + /* If we don't know the shape at compile time, we do not create a temporary > + variable to hold the intermediate result. FIXME: Change this later when > + allocation on assignment works for intrinsics. */ > + > + if ((*e)->rank > 0 && (*e)->shape == NULL) > + return 0; > + > + /* Skip the test for pure functions if -faggressive-function-elimination > + is specified. */ > + if (!gfc_option.flag_aggressive_function_elimination) > + { > + if ((*e)->value.function.esym) > + { > + if ((*e)->value.function.esym->attr.allocatable) > + return 0; Is it expected that you allow allocatables with -faggressive-function- elimination? > + > + if (!(*e)->value.function.esym->attr.pure > + && !(*e)->value.function.esym->attr.implicit_pure) > + return 0; > + } > + } > + > + if ((*e)->value.function.isym) > + { > + if ((*e)->value.function.isym->id == GFC_ISYM_CONVERSION) > + return 0; > + > + if (! (*e)->value.function.isym->pure > + && !(*e)->value.function.isym->elemental) Tobias' comment also applies here, even if there is no intrinsic impure elemental: the following code in intrinsic.c makes the elemental check redundant. next_sym->pure = (cl != CLASS_IMPURE); next_sym->elemental = (cl == CLASS_ELEMENTAL); > + return 0; > + } > + > + if (expr_count >= expr_size) > + { > + expr_size += expr_size; > + expr_array = XRESIZEVEC(gfc_expr **, expr_array, expr_size); > + } > + expr_array[expr_count] = e; > + expr_count ++; > + return 0; > +} > + > +/* Returns a new expression (a variable) to be used in place of the old one, > + with an an assignment statement before the current statement to set > + the value of the variable. */ > + > +static gfc_expr* > +create_var (gfc_expr * e) > +{ > + char name[GFC_MAX_SYMBOL_LEN +1]; > + static int num = 1; > + gfc_symtree *symtree; > + gfc_symbol *symbol; > + gfc_expr *result; > + gfc_code *n; > + int i; > + > + sprintf(name, "__var_%d",num++); > + if (gfc_get_sym_tree (name, current_ns, &symtree, false) != 0) > + gcc_unreachable (); > + > + symbol = symtree->n.sym; > + symbol->ts = e->ts; > + symbol->as = gfc_get_array_spec (); > + symbol->as->rank = e->rank; > + symbol->as->type = AS_EXPLICIT; > + for (i=0; i<e->rank; i++) > + { > + gfc_expr *p, *q; > + > + p = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind, > + &(e->where)); > + mpz_set_si (p->value.integer, 1); > + symbol->as->lower[i] = p; > + > + q = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind, > + &(e->where)); > + mpz_set (q->value.integer, e->shape[i]); > + symbol->as->upper[i] = q; > + } > + > + symbol->attr.flavor = FL_VARIABLE; > + symbol->attr.referenced = 1; > + symbol->attr.dimension = e->rank > 0; > + gfc_commit_symbol (symbol); > + > + result = gfc_get_expr (); > + result->expr_type = EXPR_VARIABLE; > + result->ts = e->ts; > + result->rank = e->rank; > + result->shape = gfc_copy_shape (e->shape, e->rank); > + result->symtree = symtree; > + result->where = e->where; > + if (e->rank > 0) > + { > + result->ref = gfc_get_ref (); > + result->ref->type = REF_ARRAY; > + result->ref->u.ar.type = AR_FULL; > + result->ref->u.ar.where = e->where; > + result->ref->u.ar.as = symbol->as; > + } > + > + /* Generate the new assignment. */ > + n = XCNEW (gfc_code); > + n->op = EXEC_ASSIGN; > + n->loc = (*current_code)->loc; > + n->next = *current_code; > + n->expr1 = gfc_copy_expr (result); > + n->expr2 = e; > + *current_code = n; > + > + return result; > +} > + > +/* Callback function for the code walker for doing common function > + elimination. This builds up the list of functions in the expression > + and goes through them to detect duplicates, which it then replaces > + by variables. */ > + > +static int > +cfe_expr_0 (gfc_expr **e, int *walk_subtrees, > + void *data ATTRIBUTE_UNUSED) > +{ > + int i,j; > + gfc_expr *newvar; > + > + expr_count = 0; > + gfc_expr_walker (e, cfe_expr, NULL); > + /* Walk backwards through all the functions to make sure we > + catch the leaf functions first. */ > + for (i=expr_count-1; i>=1; i--) > + { Tiny optimization here ;-): /* Don't bother if the expression has been factored already. */ if ((*(expr_array[i]))->expr_type == EXPR_VARIABLE) continue; > + newvar = NULL; > + for (j=i-1; j>=0; j--) > + { > + if (compare_functions(expr_array[i], expr_array[j])) > + { > + if (newvar == NULL) > + newvar = create_var (*(expr_array[i])); > + gfc_free (*(expr_array[j])); > + *(expr_array[j]) = gfc_copy_expr (newvar); > + } > + } > + if (newvar) > + *(expr_array[i]) = newvar; > + } > + > + /* We did all the necessary walking in this function. */ > + *walk_subtrees = 0; > + return 0; > +} > + Small comment here, or better, more descriptive name > +static int > +cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED, > + void *data ATTRIBUTE_UNUSED) > +{ > + current_code = c; > + return 0; > +} > + > /* Optimize a namespace, including all contained namespaces. */ > > static void > optimize_namespace (gfc_namespace *ns) > { > + > + current_ns = ns; > + > + gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL); > gfc_code_walker (&ns->code, optimize_code, optimize_expr, NULL); > > for (ns = ns->contained; ns; ns = ns->sibling) > Index: options.c > =================================================================== > --- options.c (Revision 170960) > +++ options.c (Arbeitskopie) > @@ -150,6 +150,7 @@ gfc_init_options (unsigned int decoded_options_cou > gfc_option.flag_align_commons = 1; > gfc_option.flag_protect_parens = 1; > gfc_option.flag_realloc_lhs = -1; > + gfc_option.flag_aggressive_function_elimination = 0; > > gfc_option.fpe = 0; > gfc_option.rtcheck = 0; > @@ -972,6 +973,10 @@ gfc_handle_option (size_t scode, const char *arg, > gfc_option.flag_align_commons = value; > break; > > + case OPT_faggressive_function_elimination: > + gfc_option.flag_aggressive_function_elimination = value; > + break; > + > case OPT_fprotect_parens: > gfc_option.flag_protect_parens = value; > break;
Index: gfortran.h =================================================================== --- gfortran.h (Revision 170960) +++ gfortran.h (Arbeitskopie) @@ -2232,6 +2232,7 @@ typedef struct int flag_whole_file; int flag_protect_parens; int flag_realloc_lhs; + int flag_aggressive_function_elimination; int fpe; int rtcheck; Index: lang.opt =================================================================== --- lang.opt (Revision 170960) +++ lang.opt (Arbeitskopie) @@ -278,6 +278,10 @@ d Fortran Joined ; Documented in common.opt +faggressive-function-elimination +Fortran +Eliminate multiple function invokations also for impure functions + falign-commons Fortran Enable alignment of COMMON blocks Index: invoke.texi =================================================================== --- invoke.texi (Revision 170960) +++ invoke.texi (Arbeitskopie) @@ -1468,6 +1468,18 @@ need to be in effect. An allocatable left-hand side of an intrinsic assignment is automatically (re)allocated if it is either unallocated or has a different shape. The option is enabled by default except when @option{-std=f95} is given. + +@item -faggressive-function-elimination +@opindex @code{faggressive-function-elimination} +@cindex Elimination of functions with identical argument lists +Functions with identical argument lists are eliminated within +statements, regardless of whether these functions are marked +@code{PURE} or not. For example, in +@smallexample + a = f(b,c) + f(b,c) +@end smallexample +there will only be a single call to @code{f}. + @end table @xref{Code Gen Options,,Options for Code Generation Conventions, @@ -1475,7 +1487,6 @@ gcc,Using the GNU Compiler Collection (GCC)}, for offered by the GBE shared by @command{gfortran}, @command{gcc}, and other GNU compilers. - @c man end @node Environment Variables Index: frontend-passes.c =================================================================== --- frontend-passes.c (Revision 170960) +++ frontend-passes.c (Arbeitskopie) @@ -40,6 +40,21 @@ static bool optimize_trim (gfc_expr *); static int count_arglist; +/* Pointer to an array of gfc_expr ** we operate on, plus its size + and counter. */ + +static gfc_expr ***expr_array; +static int expr_size, expr_count; + +/* Pointer to the gfc_code we currently work on - to be able to insert + a statement before. */ + +static gfc_code **current_code; + +/* The namespace we are currently dealing with. */ + +gfc_namespace *current_ns; + /* Entry point - run all passes for a namespace. So far, only an optimization pass is run. */ @@ -48,9 +63,16 @@ gfc_run_passes (gfc_namespace *ns) { if (optimize) { + expr_size = 20; + expr_array = XNEWVEC(gfc_expr **, expr_size); + optimize_namespace (ns); if (gfc_option.dump_fortran_optimized) gfc_dump_parse_tree (ns, stdout); + + /* FIXME: The following should be XDELETEVEC(expr_array); + but we cannot do that because it depends on free. */ + gfc_free (expr_array); } } @@ -106,11 +128,237 @@ optimize_expr (gfc_expr **e, int *walk_subtrees AT return 0; } +/* Compare two functions for equality. We could use gfc_dep_compare_expr + except that we also consider impure functions equal, because anybody + changing the return value of the function within an expression would + violate the Fortran standard. */ + +static bool +compare_functions (gfc_expr **ep1, gfc_expr **ep2) +{ + gfc_expr *e1, *e2; + + e1 = *ep1; + e2 = *ep2; + + if (e1->expr_type != EXPR_FUNCTION || e2->expr_type != EXPR_FUNCTION) + return false; + + if ((e1->value.function.esym && e2->value.function.esym + && e1->value.function.esym == e2->value.function.esym) + || (e1->value.function.isym && e2->value.function.isym + && e1->value.function.isym == e2->value.function.isym)) + { + gfc_actual_arglist *args1, *args2; + + args1 = e1->value.function.actual; + args2 = e2->value.function.actual; + + /* Compare the argument lists for equality. */ + while (args1 && args2) + { + /* Bitwise xor, since C has no non-bitwise xor operator. */ + if ((args1->expr == NULL) ^ (args2->expr == NULL)) + return false; + + if (args1->expr != NULL && args2->expr != NULL + && gfc_dep_compare_expr (args1->expr, args2->expr) != 0) + return false; + + args1 = args1->next; + args2 = args2->next; + } + return args1 == NULL && args2 == NULL; + } + else + return false; + +} + +/* Callback function for gfc_expr_walker, called from cfe_expr_0. Put all + eligible function expressions into expr_array. We can't do allocatable + functions. */ + +static int +cfe_expr (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED, + void *data ATTRIBUTE_UNUSED) +{ + if ((*e)->expr_type != EXPR_FUNCTION) + return 0; + + /* We don't do character functions (yet). */ + if ((*e)->ts.type == BT_CHARACTER) + return 0; + + /* If we don't know the shape at compile time, we do not create a temporary + variable to hold the intermediate result. FIXME: Change this later when + allocation on assignment works for intrinsics. */ + + if ((*e)->rank > 0 && (*e)->shape == NULL) + return 0; + + /* Skip the test for pure functions if -faggressive-function-elimination + is specified. */ + if (!gfc_option.flag_aggressive_function_elimination) + { + if ((*e)->value.function.esym) + { + if ((*e)->value.function.esym->attr.allocatable) + return 0; + + if (!(*e)->value.function.esym->attr.pure + && !(*e)->value.function.esym->attr.implicit_pure) + return 0; + } + } + + if ((*e)->value.function.isym) + { + if ((*e)->value.function.isym->id == GFC_ISYM_CONVERSION) + return 0; + + if (! (*e)->value.function.isym->pure + && !(*e)->value.function.isym->elemental) + return 0; + } + + if (expr_count >= expr_size) + { + expr_size += expr_size; + expr_array = XRESIZEVEC(gfc_expr **, expr_array, expr_size); + } + expr_array[expr_count] = e; + expr_count ++; + return 0; +} + +/* Returns a new expression (a variable) to be used in place of the old one, + with an an assignment statement before the current statement to set + the value of the variable. */ + +static gfc_expr* +create_var (gfc_expr * e) +{ + char name[GFC_MAX_SYMBOL_LEN +1]; + static int num = 1; + gfc_symtree *symtree; + gfc_symbol *symbol; + gfc_expr *result; + gfc_code *n; + int i; + + sprintf(name, "__var_%d",num++); + if (gfc_get_sym_tree (name, current_ns, &symtree, false) != 0) + gcc_unreachable (); + + symbol = symtree->n.sym; + symbol->ts = e->ts; + symbol->as = gfc_get_array_spec (); + symbol->as->rank = e->rank; + symbol->as->type = AS_EXPLICIT; + for (i=0; i<e->rank; i++) + { + gfc_expr *p, *q; + + p = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind, + &(e->where)); + mpz_set_si (p->value.integer, 1); + symbol->as->lower[i] = p; + + q = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind, + &(e->where)); + mpz_set (q->value.integer, e->shape[i]); + symbol->as->upper[i] = q; + } + + symbol->attr.flavor = FL_VARIABLE; + symbol->attr.referenced = 1; + symbol->attr.dimension = e->rank > 0; + gfc_commit_symbol (symbol); + + result = gfc_get_expr (); + result->expr_type = EXPR_VARIABLE; + result->ts = e->ts; + result->rank = e->rank; + result->shape = gfc_copy_shape (e->shape, e->rank); + result->symtree = symtree; + result->where = e->where; + if (e->rank > 0) + { + result->ref = gfc_get_ref (); + result->ref->type = REF_ARRAY; + result->ref->u.ar.type = AR_FULL; + result->ref->u.ar.where = e->where; + result->ref->u.ar.as = symbol->as; + } + + /* Generate the new assignment. */ + n = XCNEW (gfc_code); + n->op = EXEC_ASSIGN; + n->loc = (*current_code)->loc; + n->next = *current_code; + n->expr1 = gfc_copy_expr (result); + n->expr2 = e; + *current_code = n; + + return result; +} + +/* Callback function for the code walker for doing common function + elimination. This builds up the list of functions in the expression + and goes through them to detect duplicates, which it then replaces + by variables. */ + +static int +cfe_expr_0 (gfc_expr **e, int *walk_subtrees, + void *data ATTRIBUTE_UNUSED) +{ + int i,j; + gfc_expr *newvar; + + expr_count = 0; + gfc_expr_walker (e, cfe_expr, NULL); + /* Walk backwards through all the functions to make sure we + catch the leaf functions first. */ + for (i=expr_count-1; i>=1; i--) + { + newvar = NULL; + for (j=i-1; j>=0; j--) + { + if (compare_functions(expr_array[i], expr_array[j])) + { + if (newvar == NULL) + newvar = create_var (*(expr_array[i])); + gfc_free (*(expr_array[j])); + *(expr_array[j]) = gfc_copy_expr (newvar); + } + } + if (newvar) + *(expr_array[i]) = newvar; + } + + /* We did all the necessary walking in this function. */ + *walk_subtrees = 0; + return 0; +} + +static int +cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED, + void *data ATTRIBUTE_UNUSED) +{ + current_code = c; + return 0; +} + /* Optimize a namespace, including all contained namespaces. */ static void optimize_namespace (gfc_namespace *ns) { + + current_ns = ns; + + gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL); gfc_code_walker (&ns->code, optimize_code, optimize_expr, NULL); for (ns = ns->contained; ns; ns = ns->sibling) Index: options.c =================================================================== --- options.c (Revision 170960) +++ options.c (Arbeitskopie) @@ -150,6 +150,7 @@ gfc_init_options (unsigned int decoded_options_cou gfc_option.flag_align_commons = 1; gfc_option.flag_protect_parens = 1; gfc_option.flag_realloc_lhs = -1; + gfc_option.flag_aggressive_function_elimination = 0; gfc_option.fpe = 0; gfc_option.rtcheck = 0; @@ -972,6 +973,10 @@ gfc_handle_option (size_t scode, const char *arg, gfc_option.flag_align_commons = value; break; + case OPT_faggressive_function_elimination: + gfc_option.flag_aggressive_function_elimination = value; + break; + case OPT_fprotect_parens: gfc_option.flag_protect_parens = value; break;