From: Diego Novillo
Date: Mon, 22 Nov 2010 12:44:20 -0500
To: gcc-patches@gcc.gnu.org
Subject: [gimplefe] Re-factor parser/lexer
Message-ID: <20101122174416.GA26860@google.com>

This patch starts
re-factoring the lexer and parser.  We now read input files from the
command line, but parsing is still largely non-functional.  I'll
address that in the next patch I'm working on.


Diego.

gimple/ChangeLog

2010-11-22  Diego Novillo

	* Make-lang.in (GIMPLE_PARSER_H): Add CPPLIB_H and VEC_H.
	(gimple-lang.o, parser.o): Fix dependencies.
	* parser.h: Include cpplib.h and vec.h.
	(struct gimple_lexer): Define.
	(struct gimple_parser): Define.
	(gimple_token): Define.
	* config-lang.in (gtfiles): Add parser.h and parser.c.
	* gimple-lang.c: Do not include flags.h nor tm.h.
	(handle_leaf_attribute): New.
	(gimple_attributes): Add attribute 'leaf'.
	* parser.c: Do not include cpplib.h nor input.h.
	Include toplev.h, timevar.h.
	Move include of ggc.h earlier in the file.
	Change all references to cpp_token with gimple_token.
	(parser_gc_root__): New.
	(gl_init): New.
	(gp_init): New.
	(gl_lex): New.
	(gl_consume_token): New.
	(gp_parse): New.
	(gp_finish): New.
	(gimple_main): Call gp_init, gl_lex, gp_parse and gp_finish.

Index: gimple/Make-lang.in
===================================================================
--- gimple/Make-lang.in (revision 166537)
+++ gimple/Make-lang.in (working copy)
@@ -24,7 +24,7 @@ GIMPLE_EXE = gimple1$(exeext)

 # The GIMPLE-specific object files inclued in $(GIMPLE_EXE).
 GIMPLE_OBJS = gimple/gimple-lang.o gimple/parser.o attribs.o
-GIMPLE_PARSER_H = gimple/parser.h
+GIMPLE_PARSER_H = gimple/parser.h $(CPPLIB_H) $(VEC_H)
 GIMPLE_TREE_H = gimple/gimple-tree.h

 # Rules
@@ -75,10 +75,10 @@ $(GIMPLE_EXE): $(GIMPLE_OBJS) $(BACKEND)

 # Dependencies
 gimple/gimple-lang.o: gimple/gimple-lang.c $(CONFIG_H) $(SYSTEM_H) \
-    coretypes.h flags.h $(TM_H) $(TREE_H) $(TARGET_H) \
-    langhooks.h $(LANGHOOKS_DEF_H) debug.h $(GIMPLE_PARSER_H) \
-    $(DIAGNOSTIC_CORE_H) $(TOPLEV_H) $(GIMPLE_TREE_H) \
-    $(GGC_H) gtype-gimple.h gt-gimple-gimple-lang.h
+    coretypes.h $(TREE_H) $(TARGET_H) langhooks.h $(LANGHOOKS_DEF_H) \
+    debug.h $(GIMPLE_PARSER_H) $(GIMPLE_H) $(TOPLEV_H) $(GIMPLE_TREE_H) \
+    diagnostic-core.h $(TOPLEV_H) $(GGC_H) gtype-gimple.h \
+    gt-gimple-gimple-lang.h
 gimple/parser.o: gimple/parser.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-    $(CPPLIB_H) $(INPUT_H) $(DIAGNOSTIC_H) $(TREE_H) $(GIMPLE_H) \
-    $(TOPLEV_H) $(GIMPLE_PARSER_H)
+    $(DIAGNOSTIC_H) $(TOPLEV_H) $(TIMEVAR_H) $(TREE_H) $(GIMPLE_H) \
+    $(GIMPLE_PARSER_H) $(GGC_H) gt-gimple-parser.h
Index: gimple/parser.h
===================================================================
--- gimple/parser.h (revision 166537)
+++ gimple/parser.h (working copy)
@@ -22,6 +22,53 @@ along with GCC; see the file COPYING3.
 #ifndef GIMPLE_PARSER_H
 #define GIMPLE_PARSER_H

+#include "cpplib.h"
+#include "vec.h"
+
+struct gimple_parser;
+typedef cpp_token gimple_token;
+
+DEF_VEC_O (gimple_token);
+DEF_VEC_ALLOC_O (gimple_token, gc);
+
+/* The GIMPLE lexer.  */
+
+typedef struct GTY(()) gimple_lexer {
+  /* Associated parser.  */
+  struct gimple_parser *parser;
+
+  /* Path to the main input file name.  */
+  const char *filename;
+
+  /* The cpp reader to get pre-processed tokens.  */
+  struct GTY((skip)) cpp_reader *reader;
+
+  /* The array of tokens read by the lexer.  */
+  VEC(gimple_token, gc) *tokens;
+
+  /* Token to be consumed by the parser.  */
+  size_t cur_token_ix;
+} gimple_lexer;
+
+
+/* The GIMPLE parser.  */
+
+typedef struct GTY(()) gimple_parser {
+  /* Reader we use for lexing.  */
+  gimple_lexer *lexer;
+
+  /* Non-zero if '-dy' is enabled (dump debugging information during
+     parsing).  */
+  unsigned int debug_p : 1;
+
+  /* Line table.  */
+  struct line_maps *line_table;
+
+  /* Identifier table.  */
+  struct GTY((skip)) ht *ident_hash;
+} gimple_parser;
+
+
 /* In parser.c  */
 extern void gimple_main (int);

Index: gimple/config-lang.in
===================================================================
--- gimple/config-lang.in (revision 166537)
+++ gimple/config-lang.in (working copy)
@@ -22,6 +22,6 @@ language="gimple"
 compilers="gimple1\$(exeext)"
 stagestuff="gimple1\$(exeext)"

-gtfiles="\$(srcdir)/gimple/gimple-tree.h \$(srcdir)/gimple/gimple-lang.c"
+gtfiles="\$(srcdir)/gimple/gimple-tree.h \$(srcdir)/gimple/gimple-lang.c \$(srcdir)/gimple/parser.h \$(srcdir)/gimple/parser.c"

 build_by_default=yes
Index: gimple/gimple-lang.c
===================================================================
--- gimple/gimple-lang.c (revision 166537)
+++ gimple/gimple-lang.c (working copy)
@@ -22,8 +22,6 @@ along with GCC; see the file COPYING3.
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
-#include "flags.h"
-#include "tm.h"
 #include "tree.h"
 #include "target.h"
 #include "langhooks.h"
@@ -37,6 +35,7 @@ along with GCC; see the file COPYING3.

 static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_const_attribute (tree *, tree, tree, int, bool *);
+static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
 static tree handle_malloc_attribute (tree *, tree, tree, int, bool *);
 static tree handle_pure_attribute (tree *, tree, tree, int, bool *);
 static tree handle_novops_attribute (tree *, tree, tree, int, bool *);
@@ -56,6 +55,8 @@ const struct attribute_spec gimple_attri
   /* The same comments as for noreturn attributes apply to const ones.  */
   { "const",                  0, 0, true,  false, false,
                               handle_const_attribute },
+  { "leaf",                   0, 0, true,  false, false,
+                              handle_leaf_attribute },
   { "malloc",                 0, 0, true,  false, false,
                               handle_malloc_attribute },
   { "pure",                   0, 0, true,  false, false,
@@ -211,6 +212,29 @@ handle_const_attribute (tree *node, tree
 }


+/* Handle a "leaf" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_leaf_attribute (tree *node, tree name,
+                       tree ARG_UNUSED (args),
+                       int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored", name);
+      *no_add_attrs = true;
+    }
+  if (!TREE_PUBLIC (*node))
+    {
+      warning (OPT_Wattributes, "%qE attribute has no effect on unit local functions", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
+
 /* Handle a "malloc" attribute; arguments as in
    struct attribute_spec.handler.  */

Index: gimple/parser.c
===================================================================
--- gimple/parser.c (revision 166541)
+++ gimple/parser.c (working copy)
@@ -22,21 +22,27 @@ along with GCC; see the file COPYING3.
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
-#include "cpplib.h"
-#include "input.h"
 #include "diagnostic.h"
+#include "toplev.h"
+#include "timevar.h"
 #include "tree.h"
 #include "gimple.h"
-#include "toplev.h"
 #include "parser.h"
+#include "ggc.h"
+
+/* The GIMPLE parser.  Note: do not use this variable directly.  It is
+   declared here only to serve as a root for the GC machinery.  The
+   parser pointer should be passed as a parameter to every function
+   that needs to access it.  */
+static GTY(()) gimple_parser *parser_gc_root__;

 /* Consumes a token if the EXPECTED_TOKEN_TYPE is exactly the one we
    are looking for. The token is obtained by reading it from the reader P.  */

-static const cpp_token *
+static const gimple_token *
 gimple_parse_expect_token (cpp_reader *p, enum cpp_ttype expected_token_type)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   next_token = cpp_peek_token (p, 0);

@@ -65,7 +71,7 @@ gimple_parse_expect_token (cpp_reader *p
 static void
 gimple_parse_expect_subcode (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;
   const char *text;
   int i;

@@ -98,7 +104,7 @@ gimple_parse_expect_subcode (cpp_reader
 static void
 gimple_parse_expect_lhs (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   /* Just before the name of the identifier we might get the symbol
      of dereference too. If we do get it then consume that token, else
@@ -117,7 +123,7 @@ gimple_parse_expect_lhs (cpp_reader *p)
 static void
 gimple_parse_expect_rhs1 (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;
   next_token = cpp_peek_token (p, 0);

   /* Currently there is duplication in the following blocks but there
@@ -152,7 +158,7 @@ gimple_parse_expect_rhs1 (cpp_reader *p)
 static void
 gimple_parse_expect_rhs2 (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;
   next_token = cpp_peek_token (p, 0);

   /* ??? Can there be more possibilities than these ?  */
@@ -190,7 +196,7 @@ gimple_parse_assign_stmt (cpp_reader *p)
 static void
 gimple_parse_expect_op1 (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;
   next_token = cpp_peek_token (p, 0);

   switch (next_token->type)
@@ -213,7 +219,7 @@ gimple_parse_expect_op1 (cpp_reader *p)
 static void
 gimple_parse_expect_op2 (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;
   next_token = cpp_peek_token (p, 0);

   switch (next_token->type)
@@ -302,7 +308,7 @@ gimple_parse_label_stmt (cpp_reader *p)
 static void
 gimple_parse_switch_stmt (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   gimple_parse_expect_token (p, CPP_LESS);
   gimple_parse_expect_token (p, CPP_NAME);
@@ -353,7 +359,7 @@ gimple_parse_expect_function_name (cpp_r
 static void
 gimple_parse_expect_return_var (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   next_token = cpp_peek_token (p, 0);

@@ -371,7 +377,7 @@ gimple_parse_expect_return_var (cpp_read
 static void
 gimple_parse_expect_argument (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   next_token = cpp_peek_token (p, 0);

@@ -399,7 +405,7 @@ gimple_parse_expect_argument (cpp_reader
 static void
 gimple_parse_call_stmt (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   gimple_parse_expect_function_name (p);
   gimple_parse_expect_return_var (p);
@@ -436,7 +442,7 @@ gimple_parse_return_stmt (cpp_reader *p)
    for recognizing the statements in a function body.  */

 static void
-gimple_parse_stmt (cpp_reader *p, const cpp_token *tok)
+gimple_parse_stmt (cpp_reader *p, const gimple_token *tok)
 {
   const char *text;
   int i;
@@ -523,7 +529,7 @@ gimple_parse_expect_field_decl (cpp_read
 static void
 gimple_parse_record_type (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   gimple_parse_expect_token (p, CPP_LESS);
   gimple_parse_expect_token (p, CPP_NAME);
@@ -568,7 +574,7 @@ gimple_parse_record_type (cpp_reader *p)
 static void
 gimple_parse_union_type (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   gimple_parse_expect_token (p, CPP_LESS);
   gimple_parse_expect_token (p, CPP_NAME);
@@ -624,7 +630,7 @@ gimple_parse_expect_const_decl (cpp_read
 static void
 gimple_parse_enum_type (cpp_reader *p)
 {
-  const cpp_token *next_token;
+  const gimple_token *next_token;

   gimple_parse_expect_token (p, CPP_LESS);
   gimple_parse_expect_token (p, CPP_NAME);
@@ -653,7 +659,7 @@ gimple_parse_enum_type (cpp_reader *p)
    for recognizing the type and variable declarations.  */

 static void
-gimple_parse_type (cpp_reader *p, const cpp_token *tok)
+gimple_parse_type (cpp_reader *p, const gimple_token *tok)
 {
   const char *text;
   int i;
@@ -689,36 +695,113 @@ gimple_parse_type (cpp_reader *p, const
 }


-/* Main entry point for the GIMPLE front end.  */
+/* Initialize the lexer.  */

-void
-gimple_main (int debug_p ATTRIBUTE_UNUSED)
+static gimple_lexer *
+gl_init (gimple_parser *p)
 {
-  /* We invoke the parser here.  */
-  cpp_reader *p;
-  const cpp_token *tok;
-  const char *input_file = "/tmp/gimple.txt";
-  const char *output_file;
-  bool inside_type_section_p = true;
-  struct line_maps *line_tab;
+  gimple_lexer *l;

-  line_tab = ggc_alloc_cleared_line_maps ();
-  linemap_init (line_tab);
-  p = cpp_create_reader (CLK_GNUC99, ident_hash, line_tab);
-  output_file = cpp_read_main_file (p, input_file);
-  if (output_file)
+  l = ggc_alloc_cleared_gimple_lexer ();
+  l->parser = p;
+  l->filename = main_input_filename;
+  l->reader = cpp_create_reader (CLK_GNUC99, p->ident_hash, p->line_table);
+  l->filename = cpp_read_main_file (l->reader, l->filename);
+  l->cur_token_ix = 0;
+
+  return l;
+}
+
+
+/* Initialize the parser data structures.  */
+
+static gimple_parser *
+gp_init (int debug_p)
+{
+  gimple_parser *p = ggc_alloc_cleared_gimple_parser ();
+  p->debug_p = debug_p;
+  line_table = p->line_table = ggc_alloc_cleared_line_maps ();
+  p->ident_hash = ident_hash;
+  linemap_init (p->line_table);
+  p->lexer = gl_init (p);
+
+  return p;
+}
+
+
+/* Get all the tokens from the file in LEXER.  */
+
+static void
+gl_lex (gimple_lexer *lexer)
+{
+  const gimple_token *gimple_tok;
+
+  timevar_push (TV_CPP);
+
+  do
     {
-      tok = cpp_get_token (p);
-      while (tok->type != CPP_EOF)
-        {
-          if (inside_type_section_p)
-            gimple_parse_type (p, tok);
-          else
-            gimple_parse_stmt (p, tok);
-          tok = cpp_get_token (p);
-        }
+      location_t loc;
+
+      gimple_tok = cpp_get_token_with_location (lexer->reader, &loc);
+      if (gimple_tok->type != CPP_EOF)
+        VEC_safe_push (gimple_token, gc, lexer->tokens, gimple_tok);
     }
+  while (gimple_tok->type != CPP_EOF);

-  cpp_finish (p, NULL);
-  cpp_destroy (p);
+  timevar_pop (TV_CPP);
+}
+
+
+/* Consume the next token from LEXER.  */
+
+static gimple_token *
+gl_consume_token (gimple_lexer *lexer)
+{
+  return VEC_index (gimple_token, lexer->tokens, lexer->cur_token_ix++);
 }
+
+/* Parse the translation unit in PARSER.  */
+
+static void
+gp_parse (gimple_parser *parser)
+{
+  while (parser->lexer->cur_token_ix < VEC_length (gimple_token, parser->lexer->tokens))
+    {
+      gimple_token *tok = gl_consume_token (parser->lexer);
+      if (1)
+        gimple_parse_type (parser->lexer->reader, tok);
+      else
+        gimple_parse_stmt (parser->lexer->reader, tok);
+    }
+}
+
+
+/* Finalize parsing and release allocated memory in PARSER.  */
+
+static void
+gp_finish (gimple_parser *parser)
+{
+  cpp_finish (parser->lexer->reader, NULL);
+  cpp_destroy (parser->lexer->reader);
+  parser_gc_root__ = NULL;
+}
+
+
+/* Main entry point for the GIMPLE front end.  */
+
+void
+gimple_main (int debug_p)
+{
+  gimple_parser *parser;
+
+  parser_gc_root__ = parser = gp_init (debug_p);
+
+  if (parser->lexer->filename == NULL)
+    return;
+
+  gl_lex (parser->lexer);
+  gp_parse (parser);
+  gp_finish (parser);
+}
+
+#include "gt-gimple-parser.h"
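
For readers following the design rather than the diff: the lexer above
reads every token up front into a vector and the parser then walks that
vector through a cursor (cur_token_ix).  Below is a minimal standalone
sketch of that pattern.  It is not GCC code: token, token_buffer,
buffer_push, buffer_consume and buffer_drain are hypothetical stand-ins,
and a plain realloc'd array replaces the GC-allocated VEC.  The consumer
stops when the cursor reaches the number of buffered tokens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for gimple_token (really a cpp_token).  */
typedef struct { int type; } token;

enum { TOK_EOF = 0, TOK_NAME = 1 };

/* Hypothetical stand-in for gimple_lexer: a growable token array plus
   the index of the next token to hand to the parser.  */
typedef struct {
  token *tokens;
  size_t n_tokens;
  size_t capacity;
  size_t cur_token_ix;
} token_buffer;

/* Append TOK to BUF, growing the array as needed.  Returns 0 on
   success and -1 if the allocation fails.  */
static int
buffer_push (token_buffer *buf, token tok)
{
  if (buf->n_tokens == buf->capacity)
    {
      size_t new_cap = buf->capacity ? buf->capacity * 2 : 8;
      token *p = realloc (buf->tokens, new_cap * sizeof *p);
      if (p == NULL)
        return -1;
      buf->tokens = p;
      buf->capacity = new_cap;
    }
  buf->tokens[buf->n_tokens++] = tok;
  return 0;
}

/* Return the next unconsumed token and advance the cursor, or NULL
   once every buffered token has been handed out.  */
static const token *
buffer_consume (token_buffer *buf)
{
  if (buf->cur_token_ix >= buf->n_tokens)
    return NULL;
  return &buf->tokens[buf->cur_token_ix++];
}

/* Consume all remaining tokens and return how many were seen.  A real
   parser would dispatch on each token instead of counting.  */
static size_t
buffer_drain (token_buffer *buf)
{
  size_t seen = 0;
  while (buffer_consume (buf) != NULL)
    seen++;
  return seen;
}
```

Keeping the bounds check inside the consume routine means a caller
cannot index past the end of the buffer, whatever loop condition it
uses.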