From patchwork Thu Aug 1 14:56:53 2024
X-Patchwork-Submitter: Arthur Cohen
X-Patchwork-Id: 1967790
From: Arthur Cohen
To: gcc-patches@gcc.gnu.org
Cc: gcc-rust@gcc.gnu.org, jjasmine
Subject: [PATCH 057/125] gccrs: Split up rust-macro-builtins.cc
Date: Thu, 1 Aug 2024 16:56:53 +0200
Message-ID: <20240801145809.366388-59-arthur.cohen@embecosm.com>
In-Reply-To: <20240801145809.366388-2-arthur.cohen@embecosm.com>
References: <20240801145809.366388-2-arthur.cohen@embecosm.com>

From: jjasmine

Fixes issue #2855

gcc/rust/ChangeLog:

	* Make-lang.in: add new .o builds for new .cc files
	* expand/rust-cfg-strip.h (RUST_CFG_STRIP_H): Add include
	guards for rust-cfg-strip
	* expand/rust-macro-builtins.cc (make_macro_path_str): moved to new respective files
	(make_token): moved to new respective files
	(make_string): moved to new respective files
	(macro_end_token): moved to new respective files
	(try_extract_string_literal_from_fragment): moved to new respective files
	(try_expand_many_expr): moved to new respective files
	(parse_single_string_literal): moved to new respective files
	(source_relative_path): moved to new respective files
	(load_file_bytes): moved to new respective files
	(MacroBuiltin::assert_handler): moved to new respective files
	(MacroBuiltin::file_handler): moved to new respective files
	(MacroBuiltin::column_handler): moved to new respective files
	(MacroBuiltin::include_bytes_handler): moved to new respective files
	(MacroBuiltin::include_str_handler): moved to new respective files
	(MacroBuiltin::compile_error_handler): moved to new respective files
	(MacroBuiltin::concat_handler): moved to new respective files
	(MacroBuiltin::env_handler): moved to new respective files
	(MacroBuiltin::cfg_handler): moved to new respective files
	(MacroBuiltin::include_handler): moved to new respective files
	(MacroBuiltin::line_handler): moved to new respective files
	(MacroBuiltin::stringify_handler): moved to new respective files
	(struct FormatArgsInput): moved to new respective files
	(struct FormatArgsParseError): moved to new respective files
	(format_args_parse_arguments): moved to new respective files
	(MacroBuiltin::format_args_handler): moved to new respective files
	* expand/rust-macro-builtins.h (builtin_macro_from_string): merge tl::optional from master
	* expand/rust-macro-builtins-asm.cc: New file.
	* expand/rust-macro-builtins-format-args.cc: New file.
	* expand/rust-macro-builtins-helpers.cc: New file.
	* expand/rust-macro-builtins-helpers.h: New file.
	* expand/rust-macro-builtins-include.cc: New file.
	* expand/rust-macro-builtins-location.cc: New file.
	* expand/rust-macro-builtins-log-debug.cc: New file.
	* expand/rust-macro-builtins-test-bench.cc: New file.
	* expand/rust-macro-builtins-trait.cc: New file.
	* expand/rust-macro-builtins-utility.cc: New file.
---
 gcc/rust/Make-lang.in                         |   9 +
 gcc/rust/expand/rust-cfg-strip.h              |   4 +
 gcc/rust/expand/rust-macro-builtins-asm.cc    |  20 +
 .../expand/rust-macro-builtins-format-args.cc | 192 ++++
 .../expand/rust-macro-builtins-helpers.cc     | 284 +++++
 gcc/rust/expand/rust-macro-builtins-helpers.h |  90 ++
 .../expand/rust-macro-builtins-include.cc     | 249 +++++
 .../expand/rust-macro-builtins-location.cc    |  61 ++
 .../expand/rust-macro-builtins-log-debug.cc   |  31 +
 .../expand/rust-macro-builtins-test-bench.cc  |  20 +
 gcc/rust/expand/rust-macro-builtins-trait.cc  |  20 +
 .../expand/rust-macro-builtins-utility.cc     | 294 ++++++
 gcc/rust/expand/rust-macro-builtins.cc        | 981 +-----------------
 gcc/rust/expand/rust-macro-builtins.h         |  79 +-
 14 files changed, 1315 insertions(+), 1019 deletions(-)
 create mode 100644 gcc/rust/expand/rust-macro-builtins-asm.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-format-args.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-helpers.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-helpers.h
 create mode 100644 gcc/rust/expand/rust-macro-builtins-include.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-location.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-log-debug.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-test-bench.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-trait.cc
 create mode 100644 gcc/rust/expand/rust-macro-builtins-utility.cc

diff --git a/gcc/rust/Make-lang.in b/gcc/rust/Make-lang.in
index 67df843349f..10af0814372 100644
--- a/gcc/rust/Make-lang.in
+++ b/gcc/rust/Make-lang.in
@@ -102,6 +102,15 @@ GRS_OBJS = \
     rust/rust-proc-macro-invoc-lexer.o \
     rust/rust-macro-substitute-ctx.o \
     rust/rust-macro-builtins.o \
+    rust/rust-macro-builtins-helpers.o \
+    rust/rust-macro-builtins-asm.o \
+    rust/rust-macro-builtins-trait.o \
+    rust/rust-macro-builtins-utility.o \
+    rust/rust-macro-builtins-log-debug.o \
+    rust/rust-macro-builtins-test-bench.o \
+    rust/rust-macro-builtins-format-args.o \
+    rust/rust-macro-builtins-location.o \
+    rust/rust-macro-builtins-include.o \
     rust/rust-fmt.o \
     rust/rust-hir.o \
     rust/rust-hir-map.o \
diff --git a/gcc/rust/expand/rust-cfg-strip.h b/gcc/rust/expand/rust-cfg-strip.h
index 4a8e6041ff2..048cebdb991 100644
--- a/gcc/rust/expand/rust-cfg-strip.h
+++ b/gcc/rust/expand/rust-cfg-strip.h
@@ -15,6 +15,8 @@
 // You should have received a copy of the GNU General Public License
 // along with GCC; see the file COPYING3. If not see
 // .
+#ifndef RUST_CFG_STRIP_H
+#define RUST_CFG_STRIP_H
 
 #include "rust-ast-visitor.h"
 #include "rust-ast.h"
@@ -190,3 +192,5 @@ public:
   }
 };
 } // namespace Rust
+
+#endif // RUST_CFG_STRIP_H
diff --git a/gcc/rust/expand/rust-macro-builtins-asm.cc b/gcc/rust/expand/rust-macro-builtins-asm.cc
new file mode 100644
index 00000000000..62bb5935280
--- /dev/null
+++ b/gcc/rust/expand/rust-macro-builtins-asm.cc
@@ -0,0 +1,20 @@
+// Copyright (C) 2020-2024 Free Software Foundation, Inc.
+
+// This file is part of GCC.
+
+// GCC is free software; you can redistribute it and/or modify it under
+// the terms of the GNU General Public License as published by the Free
+// Software Foundation; either version 3, or (at your option) any later
+// version.
+
+// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+// WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+// for more details.
+
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3. If not see
+// .
+ +#include "rust-macro-builtins.h" +#include "rust-macro-builtins-helpers.h" diff --git a/gcc/rust/expand/rust-macro-builtins-format-args.cc b/gcc/rust/expand/rust-macro-builtins-format-args.cc new file mode 100644 index 00000000000..f42a07ff205 --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-format-args.cc @@ -0,0 +1,192 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. + +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . +#include "rust-macro-builtins-helpers.h" +#include "rust-expand-format-args.h" + +namespace Rust { + +struct FormatArgsInput +{ + std::string format_str; + AST::FormatArguments args; + // bool is_literal? +}; + +struct FormatArgsParseError +{ + enum class Kind + { + MissingArguments + } kind; +}; + +static tl::expected +format_args_parse_arguments (AST::MacroInvocData &invoc) +{ + MacroInvocLexer lex (invoc.get_delim_tok_tree ().to_token_stream ()); + Parser parser (lex); + + // TODO: check if EOF - return that format_args!() requires at least one + // argument + + auto args = AST::FormatArguments (); + auto last_token_id = macro_end_token (invoc.get_delim_tok_tree (), parser); + std::unique_ptr format_expr = nullptr; + + // TODO: Handle the case where we're not parsing a string literal (macro + // invocation for e.g.) 
+ if (parser.peek_current_token ()->get_id () == STRING_LITERAL) + format_expr = parser.parse_literal_expr (); + + // TODO(Arthur): Clean this up - if we haven't parsed a string literal but a + // macro invocation, what do we do here? return a tl::unexpected? + auto format_str = static_cast (*format_expr) + .get_literal () + .as_string (); + + // TODO: Allow implicit captures ONLY if the first arg is a string literal + // and not a macro invocation + + // TODO: How to consume all of the arguments until the delimiter? + + // TODO: What we then want to do is as follows: + // for each token, check if it is an identifier + // yes? is the next token an equal sign (=) + // yes? + // -> if that identifier is already present in our map, error + // out + // -> parse an expression, return a FormatArgument::Named + // no? + // -> if there have been named arguments before, error out + // (positional after named error) + // -> parse an expression, return a FormatArgument::Normal + while (parser.peek_current_token ()->get_id () != last_token_id) + { + parser.skip_token (COMMA); + + if (parser.peek_current_token ()->get_id () == IDENTIFIER + && parser.peek (1)->get_id () == EQUAL) + { + // FIXME: This is ugly - just add a parser.parse_identifier()? + auto ident_tok = parser.peek_current_token (); + auto ident = Identifier (ident_tok); + + parser.skip_token (IDENTIFIER); + parser.skip_token (EQUAL); + + auto expr = parser.parse_expr (); + + // TODO: Handle gracefully + if (!expr) + rust_unreachable (); + + args.push (AST::FormatArgument::named (ident, std::move (expr))); + } + else + { + auto expr = parser.parse_expr (); + + // TODO: Handle gracefully + if (!expr) + rust_unreachable (); + + args.push (AST::FormatArgument::normal (std::move (expr))); + } + // we need to skip commas, don't we? 
+ } + + return FormatArgsInput{std::move (format_str), std::move (args)}; +} + +tl::optional +MacroBuiltin::format_args_handler (location_t invoc_locus, + AST::MacroInvocData &invoc, + AST::FormatArgs::Newline nl) +{ + auto input = format_args_parse_arguments (invoc); + + if (!input) + { + rust_error_at (invoc_locus, + "could not parse arguments to %"); + return tl::nullopt; + } + + // TODO(Arthur): We need to handle this + // // if it is not a literal, it's an eager macro invocation - return it + // if (!fmt_expr->is_literal ()) + // { + // auto token_tree = invoc.get_delim_tok_tree (); + // return AST::Fragment ({AST::SingleASTNode (std::move (fmt_expr))}, + // token_tree.to_token_stream ()); + // } + + // TODO(Arthur): Handle this as well - raw strings are special for the + // format_args parser auto fmt_str = static_cast + // (*fmt_arg.get ()); Switch on the format string to know if the string is raw + // or cooked switch (fmt_str.get_lit_type ()) + // { + // // case AST::Literal::RAW_STRING: + // case AST::Literal::STRING: + // break; + // case AST::Literal::CHAR: + // case AST::Literal::BYTE: + // case AST::Literal::BYTE_STRING: + // case AST::Literal::INT: + // case AST::Literal::FLOAT: + // case AST::Literal::BOOL: + // case AST::Literal::ERROR: + // rust_unreachable (); + // } + + bool append_newline = nl == AST::FormatArgs::Newline::Yes; + + auto fmt_str = std::move (input->format_str); + if (append_newline) + fmt_str += '\n'; + + auto pieces = Fmt::Pieces::collect (fmt_str, append_newline); + + // TODO: + // do the transformation into an AST::FormatArgs node + // return that + // expand it during lowering + + // TODO: we now need to take care of creating `unfinished_literal`? 
this is + // for creating the `template` + + auto fmt_args_node = AST::FormatArgs (invoc_locus, std::move (pieces), + std::move (input->args)); + + auto expanded + = Fmt::expand_format_args (fmt_args_node, + invoc.get_delim_tok_tree ().to_token_stream ()); + + if (!expanded.has_value ()) + return AST::Fragment::create_error (); + + return *expanded; + + // auto node = std::unique_ptr (fmt_args_node); + // auto single_node = AST::SingleASTNode (std::move (node)); + + // return AST::Fragment ({std::move (single_node)}, + // invoc.get_delim_tok_tree ().to_token_stream ()); +} + +} // namespace Rust \ No newline at end of file diff --git a/gcc/rust/expand/rust-macro-builtins-helpers.cc b/gcc/rust/expand/rust-macro-builtins-helpers.cc new file mode 100644 index 00000000000..e9bf54f068a --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-helpers.cc @@ -0,0 +1,284 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. + +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . 
+ +#include "rust-macro-builtins-helpers.h" + +namespace Rust { + +std::string +make_macro_path_str (BuiltinMacro kind) +{ + auto str = MacroBuiltin::builtins.lookup (kind); + rust_assert (str.has_value ()); + + return str.value (); +} + +std::vector> +check_for_eager_invocations ( + std::vector> &expressions) +{ + std::vector> pending; + + for (auto &expr : expressions) + if (expr->get_ast_kind () == AST::Kind::MACRO_INVOCATION) + pending.emplace_back (std::unique_ptr ( + static_cast (expr->clone_expr ().release ()))); + + return pending; +} + +// +// Shorthand function for creating unique_ptr tokens +// +std::unique_ptr +make_token (const TokenPtr tok) +{ + return std::unique_ptr (new AST::Token (tok)); +} + +std::unique_ptr +make_string (location_t locus, std::string value) +{ + return std::unique_ptr ( + new AST::LiteralExpr (value, AST::Literal::STRING, + PrimitiveCoreType::CORETYPE_STR, {}, locus)); +} + +// TODO: Is this correct? +AST::Fragment +make_eager_builtin_invocation ( + BuiltinMacro kind, location_t locus, AST::DelimTokenTree arguments, + std::vector> &&pending_invocations) +{ + auto path_str = make_macro_path_str (kind); + + std::unique_ptr node = AST::MacroInvocation::Builtin ( + kind, + AST::MacroInvocData (AST::SimplePath ( + {AST::SimplePathSegment (path_str, locus)}), + std::move (arguments)), + {}, locus, std::move (pending_invocations)); + + return AST::Fragment ({AST::SingleASTNode (std::move (node))}, + arguments.to_token_stream ()); +} + +/* Match the end token of a macro given the start delimiter of the macro */ +TokenId +macro_end_token (AST::DelimTokenTree &invoc_token_tree, + Parser &parser) +{ + auto last_token_id = TokenId::RIGHT_CURLY; + switch (invoc_token_tree.get_delim_type ()) + { + case AST::DelimType::PARENS: + last_token_id = TokenId::RIGHT_PAREN; + rust_assert (parser.skip_token (LEFT_PAREN)); + break; + + case AST::DelimType::CURLY: + rust_assert (parser.skip_token (LEFT_CURLY)); + break; + + case AST::DelimType::SQUARE: + 
last_token_id = TokenId::RIGHT_SQUARE; + rust_assert (parser.skip_token (LEFT_SQUARE)); + break; + } + + return last_token_id; +} + +// Expand and then extract a string literal from the macro +std::unique_ptr +try_extract_string_literal_from_fragment (const location_t &parent_locus, + std::unique_ptr &node) +{ + auto maybe_lit = static_cast (node.get ()); + if (!node || !node->is_literal () + || maybe_lit->get_lit_type () != AST::Literal::STRING) + { + rust_error_at (parent_locus, "argument must be a string literal"); + if (node) + rust_inform (node->get_locus (), "expanded from here"); + return nullptr; + } + return std::unique_ptr ( + static_cast (node->clone_expr ().release ())); +} + +std::vector> +try_expand_many_expr (Parser &parser, + const TokenId last_token_id, MacroExpander *expander, + bool &has_error) +{ + auto restrictions = Rust::ParseRestrictions (); + // stop parsing when encountered a braces/brackets + restrictions.expr_can_be_null = true; + // we can't use std::optional, so... + auto result = std::vector> (); + auto empty_expr = std::vector> (); + + auto first_token = parser.peek_current_token ()->get_id (); + if (first_token == COMMA) + { + rust_error_at (parser.peek_current_token ()->get_locus (), + "expected expression, found %<,%>"); + has_error = true; + return empty_expr; + } + + while (parser.peek_current_token ()->get_id () != last_token_id + && parser.peek_current_token ()->get_id () != END_OF_FILE) + { + auto expr = parser.parse_expr (AST::AttrVec (), restrictions); + // something must be so wrong that the expression could not be parsed + rust_assert (expr); + result.push_back (std::move (expr)); + + auto next_token = parser.peek_current_token (); + if (!parser.skip_token (COMMA) && next_token->get_id () != last_token_id) + { + rust_error_at (next_token->get_locus (), "expected token: %<,%>"); + // TODO: is this recoverable? 
to avoid crashing the parser in the next + // fragment we have to exit early here + has_error = true; + return empty_expr; + } + } + + return result; +} + +// Parse a single string literal from the given delimited token tree, +// and return the LiteralExpr for it. Allow for an optional trailing comma, +// but otherwise enforce that these are the only tokens. +// FIXME(Arthur): This function needs a rework - it should not emit errors, it +// should probably be smaller +std::unique_ptr +parse_single_string_literal (BuiltinMacro kind, + AST::DelimTokenTree &invoc_token_tree, + location_t invoc_locus, MacroExpander *expander) +{ + MacroInvocLexer lex (invoc_token_tree.to_token_stream ()); + Parser parser (lex); + + auto last_token_id = macro_end_token (invoc_token_tree, parser); + + std::unique_ptr lit_expr = nullptr; + std::unique_ptr macro_invoc = nullptr; + + if (parser.peek_current_token ()->get_id () == STRING_LITERAL) + { + lit_expr = parser.parse_literal_expr (); + parser.maybe_skip_token (COMMA); + if (parser.peek_current_token ()->get_id () != last_token_id) + { + lit_expr = nullptr; + rust_error_at (invoc_locus, "macro takes 1 argument"); + } + } + else if (parser.peek_current_token ()->get_id () == last_token_id) + rust_error_at (invoc_locus, "macro takes 1 argument"); + else + { + macro_invoc = parser.parse_macro_invocation (AST::AttrVec ()); + + parser.maybe_skip_token (COMMA); + if (parser.peek_current_token ()->get_id () != last_token_id) + { + lit_expr = nullptr; + rust_error_at (invoc_locus, "macro takes 1 argument"); + } + + if (macro_invoc != nullptr) + { + auto path_str = make_macro_path_str (kind); + + auto pending_invocations + = std::vector> (); + + pending_invocations.push_back (std::move (macro_invoc)); + + return AST::MacroInvocation::Builtin ( + kind, + AST::MacroInvocData (AST::SimplePath ({AST::SimplePathSegment ( + path_str, invoc_locus)}), + std::move (invoc_token_tree)), + {}, invoc_locus, std::move (pending_invocations)); + } + else + { 
+ rust_error_at (invoc_locus, "argument must be a string literal or a " + "macro which expands to a string"); + } + } + + parser.skip_token (last_token_id); + + return std::unique_ptr (std::move (lit_expr)); +} + +/* Treat PATH as a path relative to the source file currently being + compiled, and return the absolute path for it. */ +std::string +source_relative_path (std::string path, location_t locus) +{ + std::string compile_fname = LOCATION_FILE (locus); + + auto dir_separator_pos = compile_fname.rfind (file_separator); + + /* If there is no file_separator in the path, use current dir ('.'). */ + std::string dirname; + if (dir_separator_pos == std::string::npos) + dirname = std::string (".") + file_separator; + else + dirname = compile_fname.substr (0, dir_separator_pos) + file_separator; + + return dirname + path; +} + +/* Read the full contents of the file FILENAME and return them in a vector. + FIXME: platform specific. */ +tl::optional> +load_file_bytes (location_t invoc_locus, const char *filename) +{ + RAIIFile file_wrap (filename); + if (file_wrap.get_raw () == nullptr) + { + rust_error_at (invoc_locus, "cannot open filename %s: %m", filename); + return tl::nullopt; + } + + FILE *f = file_wrap.get_raw (); + fseek (f, 0L, SEEK_END); + long fsize = ftell (f); + fseek (f, 0L, SEEK_SET); + + std::vector buf (fsize); + + if (fsize > 0 && fread (&buf[0], fsize, 1, f) != 1) + { + rust_error_at (invoc_locus, "error reading file %s: %m", filename); + return std::vector (); + } + + return buf; +} +} // namespace Rust \ No newline at end of file diff --git a/gcc/rust/expand/rust-macro-builtins-helpers.h b/gcc/rust/expand/rust-macro-builtins-helpers.h new file mode 100644 index 00000000000..f5d018b35f8 --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-helpers.h @@ -0,0 +1,90 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. 
+ +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . + +#ifndef GCCRS_RUST_MACRO_BUILTINS_HELPERS_H +#define GCCRS_RUST_MACRO_BUILTINS_HELPERS_H +#include "rust-ast-fragment.h" +#include "rust-ast.h" +#include "rust-cfg-strip.h" +#include "rust-diagnostics.h" +#include "rust-early-name-resolver.h" +#include "rust-expr.h" +#include "rust-lex.h" +#include "rust-macro-builtins.h" +#include "rust-macro-invoc-lexer.h" +#include "rust-macro.h" +#include "rust-parse.h" +#include "rust-session-manager.h" +#include "rust-system.h" +#include "rust-token.h" +namespace Rust { + +std::string +make_macro_path_str (BuiltinMacro kind); + +std::vector> +check_for_eager_invocations ( + std::vector> &expressions); + +// Shorthand function for creating unique_ptr tokens + +std::unique_ptr +make_token (const TokenPtr tok); + +std::unique_ptr +make_string (location_t locus, std::string value); +// TODO: Is this correct? 
+AST::Fragment +make_eager_builtin_invocation ( + BuiltinMacro kind, location_t locus, AST::DelimTokenTree arguments, + std::vector> &&pending_invocations); +// Match the end token of a macro given the start delimiter of the macro +TokenId +macro_end_token (AST::DelimTokenTree &invoc_token_tree, + Parser &parser); +// Expand and then extract a string literal from the macro +std::unique_ptr +try_extract_string_literal_from_fragment (const location_t &parent_locus, + std::unique_ptr &node); + +std::vector> +try_expand_many_expr (Parser &parser, + const TokenId last_token_id, MacroExpander *expander, + bool &has_error); + +// Parse a single string literal from the given delimited token tree, +// and return the LiteralExpr for it. Allow for an optional trailing comma, +// but otherwise enforce that these are the only tokens. + +std::unique_ptr +parse_single_string_literal (BuiltinMacro kind, + AST::DelimTokenTree &invoc_token_tree, + location_t invoc_locus, MacroExpander *expander); + +// Treat PATH as a path relative to the source file currently being +// compiled, and return the absolute path for it. + +std::string +source_relative_path (std::string path, location_t locus); + +// Read the full contents of the file FILENAME and return them in a vector. +// FIXME: platform specific. +tl::optional> +load_file_bytes (location_t invoc_locus, const char *filename); +} // namespace Rust +#endif // GCCRS_RUST_MACRO_BUILTINS_HELPERS_H diff --git a/gcc/rust/expand/rust-macro-builtins-include.cc b/gcc/rust/expand/rust-macro-builtins-include.cc new file mode 100644 index 00000000000..c3b1e65ffc6 --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-include.cc @@ -0,0 +1,249 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. 
+ +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . + +#include "rust-macro-builtins.h" +#include "rust-macro-builtins-helpers.h" +#include "optional.h" +namespace Rust { +/* Expand builtin macro include_bytes!("filename"), which includes the contents +of the given file as reference to a byte array. Yields an expression of type +&'static [u8; N]. */ + +tl::optional +MacroBuiltin::include_bytes_handler (location_t invoc_locus, + AST::MacroInvocData &invoc) +{ + /* Get target filename from the macro invocation, which is treated as a path + relative to the include!-ing file (currently being compiled). */ + auto lit_expr + = parse_single_string_literal (BuiltinMacro::IncludeBytes, + invoc.get_delim_tok_tree (), invoc_locus, + invoc.get_expander ()); + if (lit_expr == nullptr) + return AST::Fragment::create_error (); + + rust_assert (lit_expr->is_literal ()); + + std::string target_filename + = source_relative_path (lit_expr->as_string (), invoc_locus); + + auto maybe_bytes = load_file_bytes (invoc_locus, target_filename.c_str ()); + + if (!maybe_bytes.has_value ()) + return AST::Fragment::create_error (); + + std::vector bytes = maybe_bytes.value (); + + /* Is there a more efficient way to do this? */ + std::vector> elts; + + // We create the tokens for a borrow expression of a byte array, so + // & [ , , ... 
] + std::vector> toks; + toks.emplace_back (make_token (Token::make (AMP, invoc_locus))); + toks.emplace_back (make_token (Token::make (LEFT_SQUARE, invoc_locus))); + + for (uint8_t b : bytes) + { + elts.emplace_back ( + new AST::LiteralExpr (std::string (1, (char) b), AST::Literal::BYTE, + PrimitiveCoreType::CORETYPE_U8, + {} /* outer_attrs */, invoc_locus)); + toks.emplace_back (make_token (Token::make_byte_char (invoc_locus, b))); + toks.emplace_back (make_token (Token::make (COMMA, invoc_locus))); + } + + toks.emplace_back (make_token (Token::make (RIGHT_SQUARE, invoc_locus))); + + auto elems = std::unique_ptr ( + new AST::ArrayElemsValues (std::move (elts), invoc_locus)); + + auto array = std::unique_ptr ( + new AST::ArrayExpr (std::move (elems), {}, {}, invoc_locus)); + + auto borrow = std::unique_ptr ( + new AST::BorrowExpr (std::move (array), false, false, {}, invoc_locus)); + + auto node = AST::SingleASTNode (std::move (borrow)); + + return AST::Fragment ({node}, std::move (toks)); +} + +/* Expand builtin macro include_str!("filename"), which includes the contents + of the given file as a string. The file must be UTF-8 encoded. Yields an + expression of type &'static str. */ + +tl::optional +MacroBuiltin::include_str_handler (location_t invoc_locus, + AST::MacroInvocData &invoc) +{ + /* Get target filename from the macro invocation, which is treated as a path + relative to the include!-ing file (currently being compiled). 
*/ + auto lit_expr + = parse_single_string_literal (BuiltinMacro::IncludeStr, + invoc.get_delim_tok_tree (), invoc_locus, + invoc.get_expander ()); + if (lit_expr == nullptr) + return AST::Fragment::create_error (); + + if (!lit_expr->is_literal ()) + { + auto token_tree = invoc.get_delim_tok_tree (); + return AST::Fragment ({AST::SingleASTNode (std::move (lit_expr))}, + token_tree.to_token_stream ()); + } + + std::string target_filename + = source_relative_path (lit_expr->as_string (), invoc_locus); + + auto maybe_bytes = load_file_bytes (invoc_locus, target_filename.c_str ()); + + if (!maybe_bytes.has_value ()) + return AST::Fragment::create_error (); + + std::vector bytes = maybe_bytes.value (); + + /* FIXME: reuse lexer */ + int expect_single = 0; + for (uint8_t b : bytes) + { + if (expect_single) + { + if ((b & 0xC0) != 0x80) + /* character was truncated, exit with expect_single != 0 */ + break; + expect_single--; + } + else if (b & 0x80) + { + if (b >= 0xF8) + { + /* more than 4 leading 1s */ + expect_single = 1; + break; + } + else if (b >= 0xF0) + { + /* 4 leading 1s */ + expect_single = 3; + } + else if (b >= 0xE0) + { + /* 3 leading 1s */ + expect_single = 2; + } + else if (b >= 0xC0) + { + /* 2 leading 1s */ + expect_single = 1; + } + else + { + /* only 1 leading 1 */ + expect_single = 1; + break; + } + } + } + + std::string str; + if (expect_single) + rust_error_at (invoc_locus, "%s was not a valid utf-8 file", + target_filename.c_str ()); + else + str = std::string ((const char *) bytes.data (), bytes.size ()); + + auto node = AST::SingleASTNode (make_string (invoc_locus, str)); + auto str_tok = make_token (Token::make_string (invoc_locus, std::move (str))); + + return AST::Fragment ({node}, std::move (str_tok)); +} + +/* Expand builtin macro include!(), which includes a source file at the current +scope compile time. 
*/ + +tl::optional +MacroBuiltin::include_handler (location_t invoc_locus, + AST::MacroInvocData &invoc) +{ + /* Get target filename from the macro invocation, which is treated as a path + relative to the include!-ing file (currently being compiled). */ + auto lit_expr + = parse_single_string_literal (BuiltinMacro::Include, + invoc.get_delim_tok_tree (), invoc_locus, + invoc.get_expander ()); + if (lit_expr == nullptr) + return AST::Fragment::create_error (); + + rust_assert (lit_expr->is_literal ()); + + std::string filename + = source_relative_path (lit_expr->as_string (), invoc_locus); + auto target_filename + = Rust::Session::get_instance ().include_extra_file (std::move (filename)); + + RAIIFile target_file (target_filename); + Linemap *linemap = Session::get_instance ().linemap; + + if (!target_file.ok ()) + { + rust_error_at (lit_expr->get_locus (), + "cannot open included file %qs: %m", target_filename); + return AST::Fragment::create_error (); + } + + rust_debug ("Attempting to parse included file %s", target_filename); + + Lexer lex (target_filename, std::move (target_file), linemap); + Parser parser (lex); + + auto parsed_items = parser.parse_items (); + bool has_error = !parser.get_errors ().empty (); + + for (const auto &error : parser.get_errors ()) + error.emit (); + + if (has_error) + { + // inform the user that the errors above are from an included file + rust_inform (invoc_locus, "included from here"); + return AST::Fragment::create_error (); + } + + std::vector nodes{}; + for (auto &item : parsed_items) + { + AST::SingleASTNode node (std::move (item)); + nodes.push_back (node); + } + + // FIXME: This returns an empty vector of tokens and works fine, but is that + // the expected behavior? `include` macros are a bit harder to reason about + // since they include tokens. Furthermore, our lexer has no easy way to return + // a slice of tokens like the MacroInvocLexer. So it gets even harder to + // extract tokens from here. 
For now, let's keep it that way and see if it + // eventually breaks, but I don't expect it to cause many issues since the + // list of tokens is only used when a macro invocation mixes eager + // macro invocations and already expanded tokens. Think + // `concat!(a!(), 15, b!())`. We need to be able to expand a!(), expand b!(), + // and then insert the `15` token in between. In the case of `include!()`, we + // only have one argument. So it's either going to be a macro invocation or a + // string literal. + return AST::Fragment (nodes, std::vector> ()); +} +} // namespace Rust \ No newline at end of file diff --git a/gcc/rust/expand/rust-macro-builtins-location.cc b/gcc/rust/expand/rust-macro-builtins-location.cc new file mode 100644 index 00000000000..19857487c0b --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-location.cc @@ -0,0 +1,61 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. + +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . 
+ +#include "rust-macro-builtins.h" +#include "rust-macro-builtins-helpers.h" + +namespace Rust { +tl::optional +MacroBuiltin::file_handler (location_t invoc_locus, AST::MacroInvocData &) +{ + auto current_file = LOCATION_FILE (invoc_locus); + auto file_str = AST::SingleASTNode (make_string (invoc_locus, current_file)); + auto str_token + = make_token (Token::make_string (invoc_locus, std::move (current_file))); + + return AST::Fragment ({file_str}, std::move (str_token)); +} + +tl::optional +MacroBuiltin::column_handler (location_t invoc_locus, AST::MacroInvocData &) +{ + auto current_column = LOCATION_COLUMN (invoc_locus); + + auto column_tok = make_token ( + Token::make_int (invoc_locus, std::to_string (current_column))); + auto column_no = AST::SingleASTNode (std::unique_ptr ( + new AST::LiteralExpr (std::to_string (current_column), AST::Literal::INT, + PrimitiveCoreType::CORETYPE_U32, {}, invoc_locus))); + + return AST::Fragment ({column_no}, std::move (column_tok)); +} + +tl::optional +MacroBuiltin::line_handler (location_t invoc_locus, AST::MacroInvocData &) +{ + auto current_line = LOCATION_LINE (invoc_locus); + + auto line_no = AST::SingleASTNode (std::unique_ptr ( + new AST::LiteralExpr (std::to_string (current_line), AST::Literal::INT, + PrimitiveCoreType::CORETYPE_U32, {}, invoc_locus))); + auto tok + = make_token (Token::make_int (invoc_locus, std::to_string (current_line))); + + return AST::Fragment ({line_no}, std::move (tok)); +} +} // namespace Rust diff --git a/gcc/rust/expand/rust-macro-builtins-log-debug.cc b/gcc/rust/expand/rust-macro-builtins-log-debug.cc new file mode 100644 index 00000000000..56bd3034b08 --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-log-debug.cc @@ -0,0 +1,31 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. 
+ +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . + +#include "rust-macro-builtins.h" +#include "rust-macro-builtins-helpers.h" + +namespace Rust { +tl::optional +MacroBuiltin::assert_handler (location_t invoc_locus, + AST::MacroInvocData &invoc) +{ + rust_debug ("assert!() called"); + + return AST::Fragment::create_error (); +} +} // namespace Rust \ No newline at end of file diff --git a/gcc/rust/expand/rust-macro-builtins-test-bench.cc b/gcc/rust/expand/rust-macro-builtins-test-bench.cc new file mode 100644 index 00000000000..62bb5935280 --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-test-bench.cc @@ -0,0 +1,20 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. + +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . 
+ +#include "rust-macro-builtins.h" +#include "rust-macro-builtins-helpers.h" diff --git a/gcc/rust/expand/rust-macro-builtins-trait.cc b/gcc/rust/expand/rust-macro-builtins-trait.cc new file mode 100644 index 00000000000..62bb5935280 --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-trait.cc @@ -0,0 +1,20 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. + +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. + +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . + +#include "rust-macro-builtins.h" +#include "rust-macro-builtins-helpers.h" diff --git a/gcc/rust/expand/rust-macro-builtins-utility.cc b/gcc/rust/expand/rust-macro-builtins-utility.cc new file mode 100644 index 00000000000..c5932b3d978 --- /dev/null +++ b/gcc/rust/expand/rust-macro-builtins-utility.cc @@ -0,0 +1,294 @@ +// Copyright (C) 2020-2024 Free Software Foundation, Inc. + +// This file is part of GCC. + +// GCC is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free +// Software Foundation; either version 3, or (at your option) any later +// version. + +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or +// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +// for more details. 
+ +// You should have received a copy of the GNU General Public License +// along with GCC; see the file COPYING3. If not see +// . + +#include "rust-fmt.h" +#include "rust-macro-builtins.h" +#include "rust-macro-builtins-helpers.h" + +namespace Rust { + +/* Expand builtin macro compile_error!("error"), which forces a compile error + during the compile time. */ +tl::optional +MacroBuiltin::compile_error_handler (location_t invoc_locus, + AST::MacroInvocData &invoc) +{ + auto lit_expr + = parse_single_string_literal (BuiltinMacro::CompileError, + invoc.get_delim_tok_tree (), invoc_locus, + invoc.get_expander ()); + if (lit_expr == nullptr) + return AST::Fragment::create_error (); + + rust_assert (lit_expr->is_literal ()); + + std::string error_string = lit_expr->as_string (); + rust_error_at (invoc_locus, "%s", error_string.c_str ()); + + return AST::Fragment::create_error (); +} + +/* Expand builtin macro concat!(), which joins all the literal parameters + into a string with no delimiter. */ + +// This is a weird one. We want to do something where, if something cannot be +// expanded yet (i.e. macro invocation?) we return the whole MacroInvocation +// node again but expanded as much as possible. +// Is that possible? How do we do that? +// +// Let's take a few examples: +// +// 1. concat!(1, 2, true); +// 2. concat!(a!(), 2, true); +// 3. concat!(concat!(1, false), 2, true); +// 4. concat!(concat!(1, a!()), 2, true); +// +// 1. We simply want to return the new fragment: "12true" +// 2. We want to return `concat!(a_expanded, 2, true)` as a fragment +// 3. We want to return `concat!(1, false, 2, true)` +// 4. We want to return `concat!(concat!(1, a_expanded), 2, true); +// +// How do we do that? +// +// For each (un)expanded fragment: we check if it is expanded fully +// +// 1. What is expanded fully? +// 2. How to check? +// +// If it is expanded fully and not a literal, then we error out. +// Otherwise we simply emplace it back and keep going. 
+// +// In the second case, we must mark that this concat invocation still has some +// expansion to do: This allows us to return a `MacroInvocation { ... }` as an +// AST fragment, instead of a completed string. +// +// This means that we must change all the `try_expand_many_*` APIs and so on to +// return some sort of index or way to signify that we might want to reuse some +// bits and pieces of the original token tree. +// +// Now, before that: How do we resolve the names used in a builtin macro +// invocation? +// Do we split the two passes of parsing the token tree and then expanding it? +// Can we do that easily? +tl::optional +MacroBuiltin::concat_handler (location_t invoc_locus, + AST::MacroInvocData &invoc) +{ + auto invoc_token_tree = invoc.get_delim_tok_tree (); + MacroInvocLexer lex (invoc_token_tree.to_token_stream ()); + Parser parser (lex); + + auto str = std::string (); + bool has_error = false; + + auto last_token_id = macro_end_token (invoc_token_tree, parser); + + auto start = lex.get_offs (); + /* NOTE: concat! 
could accept no argument, so we don't have any checks here */ + auto expanded_expr = try_expand_many_expr (parser, last_token_id, + invoc.get_expander (), has_error); + auto end = lex.get_offs (); + + auto tokens = lex.get_token_slice (start, end); + + auto pending_invocations = check_for_eager_invocations (expanded_expr); + if (!pending_invocations.empty ()) + return make_eager_builtin_invocation (BuiltinMacro::Concat, invoc_locus, + invoc.get_delim_tok_tree (), + std::move (pending_invocations)); + + for (auto &expr : expanded_expr) + { + if (!expr->is_literal () + && expr->get_ast_kind () != AST::Kind::MACRO_INVOCATION) + { + has_error = true; + rust_error_at (expr->get_locus (), "expected a literal"); + // diagnostics copied from rustc + rust_inform (expr->get_locus (), + "only literals (like %<\"foo\"%>, %<42%> and " + "%<3.14%>) can be passed to %"); + continue; + } + auto *literal = static_cast (expr.get ()); + if (literal->get_lit_type () == AST::Literal::BYTE + || literal->get_lit_type () == AST::Literal::BYTE_STRING) + { + has_error = true; + rust_error_at (expr->get_locus (), + "cannot concatenate a byte string literal"); + continue; + } + str += literal->as_string (); + } + + parser.skip_token (last_token_id); + + if (has_error) + return AST::Fragment::create_error (); + + auto node = AST::SingleASTNode (make_string (invoc_locus, str)); + auto str_tok = make_token (Token::make_string (invoc_locus, std::move (str))); + + return AST::Fragment ({node}, std::move (str_tok)); +} + +/* Expand builtin macro env!(), which inspects an environment variable at + compile time. 
*/ +tl::optional +MacroBuiltin::env_handler (location_t invoc_locus, AST::MacroInvocData &invoc) +{ + auto invoc_token_tree = invoc.get_delim_tok_tree (); + MacroInvocLexer lex (invoc_token_tree.to_token_stream ()); + Parser parser (lex); + + auto last_token_id = macro_end_token (invoc_token_tree, parser); + std::unique_ptr error_expr = nullptr; + std::unique_ptr lit_expr = nullptr; + bool has_error = false; + + auto start = lex.get_offs (); + auto expanded_expr = try_expand_many_expr (parser, last_token_id, + invoc.get_expander (), has_error); + auto end = lex.get_offs (); + + auto tokens = lex.get_token_slice (start, end); + + if (has_error) + return AST::Fragment::create_error (); + + auto pending = check_for_eager_invocations (expanded_expr); + if (!pending.empty ()) + return make_eager_builtin_invocation (BuiltinMacro::Env, invoc_locus, + invoc_token_tree, + std::move (pending)); + + if (expanded_expr.size () < 1 || expanded_expr.size () > 2) + { + rust_error_at (invoc_locus, "env! 
takes 1 or 2 arguments"); + return AST::Fragment::create_error (); + } + if (expanded_expr.size () > 0) + { + if (!(lit_expr + = try_extract_string_literal_from_fragment (invoc_locus, + expanded_expr[0]))) + { + return AST::Fragment::create_error (); + } + } + if (expanded_expr.size () > 1) + { + if (!(error_expr + = try_extract_string_literal_from_fragment (invoc_locus, + expanded_expr[1]))) + { + return AST::Fragment::create_error (); + } + } + + parser.skip_token (last_token_id); + + auto env_value = getenv (lit_expr->as_string ().c_str ()); + + if (env_value == nullptr) + { + if (error_expr == nullptr) + rust_error_at (invoc_locus, "environment variable %qs not defined", + lit_expr->as_string ().c_str ()); + else + rust_error_at (invoc_locus, "%s", error_expr->as_string ().c_str ()); + return AST::Fragment::create_error (); + } + + auto node = AST::SingleASTNode (make_string (invoc_locus, env_value)); + auto tok + = make_token (Token::make_string (invoc_locus, std::move (env_value))); + + return AST::Fragment ({node}, std::move (tok)); +} + +tl::optional +MacroBuiltin::cfg_handler (location_t invoc_locus, AST::MacroInvocData &invoc) +{ + // only parse if not already parsed + if (!invoc.is_parsed ()) + { + std::unique_ptr converted_input ( + invoc.get_delim_tok_tree ().parse_to_meta_item ()); + + if (converted_input == nullptr) + { + rust_debug ("DEBUG: failed to parse macro to meta item"); + // TODO: do something now? is this an actual error? + } + else + { + std::vector> meta_items ( + std::move (converted_input->get_items ())); + invoc.set_meta_item_output (std::move (meta_items)); + } + } + + /* TODO: assuming that cfg! macros can only have one meta item inner, like cfg + * attributes */ + if (invoc.get_meta_items ().size () != 1) + return AST::Fragment::create_error (); + + bool result = invoc.get_meta_items ()[0]->check_cfg_predicate ( + Session::get_instance ()); + auto literal_exp = AST::SingleASTNode (std::unique_ptr ( + new AST::LiteralExpr (result ? 
"true" : "false", AST::Literal::BOOL, + PrimitiveCoreType::CORETYPE_BOOL, {}, invoc_locus))); + auto tok = make_token ( + Token::make (result ? TRUE_LITERAL : FALSE_LITERAL, invoc_locus)); + + return AST::Fragment ({literal_exp}, std::move (tok)); +} + +tl::optional +MacroBuiltin::stringify_handler (location_t invoc_locus, + AST::MacroInvocData &invoc) +{ + std::string content; + auto invoc_token_tree = invoc.get_delim_tok_tree (); + auto tokens = invoc_token_tree.to_token_stream (); + + // Tokens stream includes the first and last delimiter + // which we need to skip. + for (auto token = tokens.cbegin () + 1; token < tokens.cend () - 1; token++) + { + // Rust stringify format has no garantees but the reference compiler + // removes spaces before some tokens depending on the lexer's behavior, + // let's mimick some of those behaviors. + auto token_id = (*token)->get_id (); + if (token_id != RIGHT_PAREN && token_id != EXCLAM + && token != tokens.cbegin () + 1) + { + content.push_back (' '); + } + content += (*token)->as_string (); + } + + auto node = AST::SingleASTNode (make_string (invoc_locus, content)); + auto token + = make_token (Token::make_string (invoc_locus, std::move (content))); + return AST::Fragment ({node}, std::move (token)); +} + +} // namespace Rust \ No newline at end of file diff --git a/gcc/rust/expand/rust-macro-builtins.cc b/gcc/rust/expand/rust-macro-builtins.cc index a33a57752da..bc5bc944e77 100644 --- a/gcc/rust/expand/rust-macro-builtins.cc +++ b/gcc/rust/expand/rust-macro-builtins.cc @@ -40,6 +40,7 @@ #include "rust-fmt.h" #include "rust-token.h" +#include "rust-macro-builtins-helpers.h" namespace Rust { const BiMap MacroBuiltin::builtins = {{ @@ -149,986 +150,6 @@ builtin_macro_from_string (const std::string &identifier) return macro; } -namespace { -std::string -make_macro_path_str (BuiltinMacro kind) -{ - auto str = MacroBuiltin::builtins.lookup (kind); - rust_assert (str.has_value ()); - - return str.value (); -} - -static std::vector> 
-check_for_eager_invocations ( - std::vector> &expressions) -{ - std::vector> pending; - - for (auto &expr : expressions) - if (expr->get_ast_kind () == AST::Kind::MACRO_INVOCATION) - pending.emplace_back (std::unique_ptr ( - static_cast (expr->clone_expr ().release ()))); - - return pending; -} - -/** - * Shorthand function for creating unique_ptr tokens - */ -static std::unique_ptr -make_token (const TokenPtr tok) -{ - return std::unique_ptr (new AST::Token (tok)); -} - -std::unique_ptr -make_string (location_t locus, std::string value) -{ - return std::unique_ptr ( - new AST::LiteralExpr (value, AST::Literal::STRING, - PrimitiveCoreType::CORETYPE_STR, {}, locus)); -} - -// TODO: Is this correct? -static AST::Fragment -make_eager_builtin_invocation ( - BuiltinMacro kind, location_t locus, AST::DelimTokenTree arguments, - std::vector> &&pending_invocations) -{ - auto path_str = make_macro_path_str (kind); - - std::unique_ptr node = AST::MacroInvocation::Builtin ( - kind, - AST::MacroInvocData (AST::SimplePath ( - {AST::SimplePathSegment (path_str, locus)}), - std::move (arguments)), - {}, locus, std::move (pending_invocations)); - - return AST::Fragment ({AST::SingleASTNode (std::move (node))}, - arguments.to_token_stream ()); -} - -/* Match the end token of a macro given the start delimiter of the macro */ -static inline TokenId -macro_end_token (AST::DelimTokenTree &invoc_token_tree, - Parser &parser) -{ - auto last_token_id = TokenId::RIGHT_CURLY; - switch (invoc_token_tree.get_delim_type ()) - { - case AST::DelimType::PARENS: - last_token_id = TokenId::RIGHT_PAREN; - rust_assert (parser.skip_token (LEFT_PAREN)); - break; - - case AST::DelimType::CURLY: - rust_assert (parser.skip_token (LEFT_CURLY)); - break; - - case AST::DelimType::SQUARE: - last_token_id = TokenId::RIGHT_SQUARE; - rust_assert (parser.skip_token (LEFT_SQUARE)); - break; - } - - return last_token_id; -} - -/* Expand and then extract a string literal from the macro */ -static std::unique_ptr 
-try_extract_string_literal_from_fragment (const location_t &parent_locus, - std::unique_ptr &node) -{ - auto maybe_lit = static_cast (node.get ()); - if (!node || !node->is_literal () - || maybe_lit->get_lit_type () != AST::Literal::STRING) - { - rust_error_at (parent_locus, "argument must be a string literal"); - if (node) - rust_inform (node->get_locus (), "expanded from here"); - return nullptr; - } - return std::unique_ptr ( - static_cast (node->clone_expr ().release ())); -} - -static std::vector> -try_expand_many_expr (Parser &parser, - const TokenId last_token_id, MacroExpander *expander, - bool &has_error) -{ - auto restrictions = Rust::ParseRestrictions (); - // stop parsing when encountered a braces/brackets - restrictions.expr_can_be_null = true; - // we can't use std::optional, so... - auto result = std::vector> (); - auto empty_expr = std::vector> (); - - auto first_token = parser.peek_current_token ()->get_id (); - if (first_token == COMMA) - { - rust_error_at (parser.peek_current_token ()->get_locus (), - "expected expression, found %<,%>"); - has_error = true; - return empty_expr; - } - - while (parser.peek_current_token ()->get_id () != last_token_id - && parser.peek_current_token ()->get_id () != END_OF_FILE) - { - auto expr = parser.parse_expr (AST::AttrVec (), restrictions); - // something must be so wrong that the expression could not be parsed - rust_assert (expr); - result.push_back (std::move (expr)); - - auto next_token = parser.peek_current_token (); - if (!parser.skip_token (COMMA) && next_token->get_id () != last_token_id) - { - rust_error_at (next_token->get_locus (), "expected token: %<,%>"); - // TODO: is this recoverable? to avoid crashing the parser in the next - // fragment we have to exit early here - has_error = true; - return empty_expr; - } - } - - return result; -} - -/* Parse a single string literal from the given delimited token tree, - and return the LiteralExpr for it. 
Allow for an optional trailing comma, - but otherwise enforce that these are the only tokens. */ - -// FIXME(Arthur): This function needs a rework - it should not emit errors, it -// should probably be smaller -std::unique_ptr -parse_single_string_literal (BuiltinMacro kind, - AST::DelimTokenTree &invoc_token_tree, - location_t invoc_locus, MacroExpander *expander) -{ - MacroInvocLexer lex (invoc_token_tree.to_token_stream ()); - Parser parser (lex); - - auto last_token_id = macro_end_token (invoc_token_tree, parser); - - std::unique_ptr lit_expr = nullptr; - std::unique_ptr macro_invoc = nullptr; - - if (parser.peek_current_token ()->get_id () == STRING_LITERAL) - { - lit_expr = parser.parse_literal_expr (); - parser.maybe_skip_token (COMMA); - if (parser.peek_current_token ()->get_id () != last_token_id) - { - lit_expr = nullptr; - rust_error_at (invoc_locus, "macro takes 1 argument"); - } - } - else if (parser.peek_current_token ()->get_id () == last_token_id) - rust_error_at (invoc_locus, "macro takes 1 argument"); - else - { - macro_invoc = parser.parse_macro_invocation (AST::AttrVec ()); - - parser.maybe_skip_token (COMMA); - if (parser.peek_current_token ()->get_id () != last_token_id) - { - lit_expr = nullptr; - rust_error_at (invoc_locus, "macro takes 1 argument"); - } - - if (macro_invoc != nullptr) - { - auto path_str = make_macro_path_str (kind); - - auto pending_invocations - = std::vector> (); - - pending_invocations.push_back (std::move (macro_invoc)); - - return AST::MacroInvocation::Builtin ( - kind, - AST::MacroInvocData (AST::SimplePath ({AST::SimplePathSegment ( - path_str, invoc_locus)}), - std::move (invoc_token_tree)), - {}, invoc_locus, std::move (pending_invocations)); - } - else - { - rust_error_at (invoc_locus, "argument must be a string literal or a " - "macro which expands to a string"); - } - } - - parser.skip_token (last_token_id); - - return std::unique_ptr (std::move (lit_expr)); -} - -/* Treat PATH as a path relative to the source 
file currently being - compiled, and return the absolute path for it. */ - -std::string -source_relative_path (std::string path, location_t locus) -{ - std::string compile_fname = LOCATION_FILE (locus); - - auto dir_separator_pos = compile_fname.rfind (file_separator); - - /* If there is no file_separator in the path, use current dir ('.'). */ - std::string dirname; - if (dir_separator_pos == std::string::npos) - dirname = std::string (".") + file_separator; - else - dirname = compile_fname.substr (0, dir_separator_pos) + file_separator; - - return dirname + path; -} - -/* Read the full contents of the file FILENAME and return them in a vector. - FIXME: platform specific. */ - -tl::optional> -load_file_bytes (location_t invoc_locus, const char *filename) -{ - RAIIFile file_wrap (filename); - if (file_wrap.get_raw () == nullptr) - { - rust_error_at (invoc_locus, "cannot open filename %s: %m", filename); - return tl::nullopt; - } - - FILE *f = file_wrap.get_raw (); - fseek (f, 0L, SEEK_END); - long fsize = ftell (f); - fseek (f, 0L, SEEK_SET); - - std::vector buf (fsize); - - if (fsize > 0 && fread (&buf[0], fsize, 1, f) != 1) - { - rust_error_at (invoc_locus, "error reading file %s: %m", filename); - return std::vector (); - } - - return buf; -} -} // namespace - -tl::optional -MacroBuiltin::assert_handler (location_t invoc_locus, - AST::MacroInvocData &invoc) -{ - rust_debug ("assert!() called"); - - return AST::Fragment::create_error (); -} - -tl::optional -MacroBuiltin::file_handler (location_t invoc_locus, AST::MacroInvocData &) -{ - auto current_file = LOCATION_FILE (invoc_locus); - auto file_str = AST::SingleASTNode (make_string (invoc_locus, current_file)); - auto str_token - = make_token (Token::make_string (invoc_locus, std::move (current_file))); - - return AST::Fragment ({file_str}, std::move (str_token)); -} - -tl::optional -MacroBuiltin::column_handler (location_t invoc_locus, AST::MacroInvocData &) -{ - auto current_column = LOCATION_COLUMN 
(invoc_locus); - - auto column_tok = make_token ( - Token::make_int (invoc_locus, std::to_string (current_column))); - auto column_no = AST::SingleASTNode (std::unique_ptr ( - new AST::LiteralExpr (std::to_string (current_column), AST::Literal::INT, - PrimitiveCoreType::CORETYPE_U32, {}, invoc_locus))); - - return AST::Fragment ({column_no}, std::move (column_tok)); -} - -/* Expand builtin macro include_bytes!("filename"), which includes the contents - of the given file as reference to a byte array. Yields an expression of type - &'static [u8; N]. */ - -tl::optional -MacroBuiltin::include_bytes_handler (location_t invoc_locus, - AST::MacroInvocData &invoc) -{ - /* Get target filename from the macro invocation, which is treated as a path - relative to the include!-ing file (currently being compiled). */ - auto lit_expr - = parse_single_string_literal (BuiltinMacro::IncludeBytes, - invoc.get_delim_tok_tree (), invoc_locus, - invoc.get_expander ()); - if (lit_expr == nullptr) - return AST::Fragment::create_error (); - - rust_assert (lit_expr->is_literal ()); - - std::string target_filename - = source_relative_path (lit_expr->as_string (), invoc_locus); - - auto maybe_bytes = load_file_bytes (invoc_locus, target_filename.c_str ()); - - if (!maybe_bytes.has_value ()) - return AST::Fragment::create_error (); - - std::vector bytes = maybe_bytes.value (); - - /* Is there a more efficient way to do this? */ - std::vector> elts; - - // We create the tokens for a borrow expression of a byte array, so - // & [ , , ... 
] - std::vector> toks; - toks.emplace_back (make_token (Token::make (AMP, invoc_locus))); - toks.emplace_back (make_token (Token::make (LEFT_SQUARE, invoc_locus))); - - for (uint8_t b : bytes) - { - elts.emplace_back ( - new AST::LiteralExpr (std::string (1, (char) b), AST::Literal::BYTE, - PrimitiveCoreType::CORETYPE_U8, - {} /* outer_attrs */, invoc_locus)); - toks.emplace_back (make_token (Token::make_byte_char (invoc_locus, b))); - toks.emplace_back (make_token (Token::make (COMMA, invoc_locus))); - } - - toks.emplace_back (make_token (Token::make (RIGHT_SQUARE, invoc_locus))); - - auto elems = std::unique_ptr ( - new AST::ArrayElemsValues (std::move (elts), invoc_locus)); - - auto array = std::unique_ptr ( - new AST::ArrayExpr (std::move (elems), {}, {}, invoc_locus)); - - auto borrow = std::unique_ptr ( - new AST::BorrowExpr (std::move (array), false, false, {}, invoc_locus)); - - auto node = AST::SingleASTNode (std::move (borrow)); - - return AST::Fragment ({node}, std::move (toks)); -} // namespace Rust - -/* Expand builtin macro include_str!("filename"), which includes the contents - of the given file as a string. The file must be UTF-8 encoded. Yields an - expression of type &'static str. */ - -tl::optional -MacroBuiltin::include_str_handler (location_t invoc_locus, - AST::MacroInvocData &invoc) -{ - /* Get target filename from the macro invocation, which is treated as a path - relative to the include!-ing file (currently being compiled). 
*/ - auto lit_expr - = parse_single_string_literal (BuiltinMacro::IncludeStr, - invoc.get_delim_tok_tree (), invoc_locus, - invoc.get_expander ()); - if (lit_expr == nullptr) - return AST::Fragment::create_error (); - - if (!lit_expr->is_literal ()) - { - auto token_tree = invoc.get_delim_tok_tree (); - return AST::Fragment ({AST::SingleASTNode (std::move (lit_expr))}, - token_tree.to_token_stream ()); - } - - std::string target_filename - = source_relative_path (lit_expr->as_string (), invoc_locus); - - auto maybe_bytes = load_file_bytes (invoc_locus, target_filename.c_str ()); - - if (!maybe_bytes.has_value ()) - return AST::Fragment::create_error (); - - std::vector bytes = maybe_bytes.value (); - - /* FIXME: reuse lexer */ - int expect_single = 0; - for (uint8_t b : bytes) - { - if (expect_single) - { - if ((b & 0xC0) != 0x80) - /* character was truncated, exit with expect_single != 0 */ - break; - expect_single--; - } - else if (b & 0x80) - { - if (b >= 0xF8) - { - /* more than 4 leading 1s */ - expect_single = 1; - break; - } - else if (b >= 0xF0) - { - /* 4 leading 1s */ - expect_single = 3; - } - else if (b >= 0xE0) - { - /* 3 leading 1s */ - expect_single = 2; - } - else if (b >= 0xC0) - { - /* 2 leading 1s */ - expect_single = 1; - } - else - { - /* only 1 leading 1 */ - expect_single = 1; - break; - } - } - } - - std::string str; - if (expect_single) - rust_error_at (invoc_locus, "%s was not a valid utf-8 file", - target_filename.c_str ()); - else - str = std::string ((const char *) bytes.data (), bytes.size ()); - - auto node = AST::SingleASTNode (make_string (invoc_locus, str)); - auto str_tok = make_token (Token::make_string (invoc_locus, std::move (str))); - - return AST::Fragment ({node}, std::move (str_tok)); -} - -/* Expand builtin macro compile_error!("error"), which forces a compile error - during the compile time. 
*/ -tl::optional -MacroBuiltin::compile_error_handler (location_t invoc_locus, - AST::MacroInvocData &invoc) -{ - auto lit_expr - = parse_single_string_literal (BuiltinMacro::CompileError, - invoc.get_delim_tok_tree (), invoc_locus, - invoc.get_expander ()); - if (lit_expr == nullptr) - return AST::Fragment::create_error (); - - rust_assert (lit_expr->is_literal ()); - - std::string error_string = lit_expr->as_string (); - rust_error_at (invoc_locus, "%s", error_string.c_str ()); - - return AST::Fragment::create_error (); -} - -/* Expand builtin macro concat!(), which joins all the literal parameters - into a string with no delimiter. */ - -// This is a weird one. We want to do something where, if something cannot be -// expanded yet (i.e. macro invocation?) we return the whole MacroInvocation -// node again but expanded as much as possible. -// Is that possible? How do we do that? -// -// Let's take a few examples: -// -// 1. concat!(1, 2, true); -// 2. concat!(a!(), 2, true); -// 3. concat!(concat!(1, false), 2, true); -// 4. concat!(concat!(1, a!()), 2, true); -// -// 1. We simply want to return the new fragment: "12true" -// 2. We want to return `concat!(a_expanded, 2, true)` as a fragment -// 3. We want to return `concat!(1, false, 2, true)` -// 4. We want to return `concat!(concat!(1, a_expanded), 2, true); -// -// How do we do that? -// -// For each (un)expanded fragment: we check if it is expanded fully -// -// 1. What is expanded fully? -// 2. How to check? -// -// If it is expanded fully and not a literal, then we error out. -// Otherwise we simply emplace it back and keep going. -// -// In the second case, we must mark that this concat invocation still has some -// expansion to do: This allows us to return a `MacroInvocation { ... }` as an -// AST fragment, instead of a completed string. 
-// -// This means that we must change all the `try_expand_many_*` APIs and so on to -// return some sort of index or way to signify that we might want to reuse some -// bits and pieces of the original token tree. -// -// Now, before that: How do we resolve the names used in a builtin macro -// invocation? -// Do we split the two passes of parsing the token tree and then expanding it? -// Can we do that easily? -tl::optional -MacroBuiltin::concat_handler (location_t invoc_locus, - AST::MacroInvocData &invoc) -{ - auto invoc_token_tree = invoc.get_delim_tok_tree (); - MacroInvocLexer lex (invoc_token_tree.to_token_stream ()); - Parser parser (lex); - - auto str = std::string (); - bool has_error = false; - - auto last_token_id = macro_end_token (invoc_token_tree, parser); - - auto start = lex.get_offs (); - /* NOTE: concat! could accept no argument, so we don't have any checks here */ - auto expanded_expr = try_expand_many_expr (parser, last_token_id, - invoc.get_expander (), has_error); - auto end = lex.get_offs (); - - auto tokens = lex.get_token_slice (start, end); - - auto pending_invocations = check_for_eager_invocations (expanded_expr); - if (!pending_invocations.empty ()) - return make_eager_builtin_invocation (BuiltinMacro::Concat, invoc_locus, - invoc.get_delim_tok_tree (), - std::move (pending_invocations)); - - for (auto &expr : expanded_expr) - { - if (!expr->is_literal () - && expr->get_ast_kind () != AST::Kind::MACRO_INVOCATION) - { - has_error = true; - rust_error_at (expr->get_locus (), "expected a literal"); - // diagnostics copied from rustc - rust_inform (expr->get_locus (), - "only literals (like %<\"foo\"%>, %<42%> and " - "%<3.14%>) can be passed to %"); - continue; - } - auto *literal = static_cast (expr.get ()); - if (literal->get_lit_type () == AST::Literal::BYTE - || literal->get_lit_type () == AST::Literal::BYTE_STRING) - { - has_error = true; - rust_error_at (expr->get_locus (), - "cannot concatenate a byte string literal"); - continue; - 
} - str += literal->as_string (); - } - - parser.skip_token (last_token_id); - - if (has_error) - return AST::Fragment::create_error (); - - auto node = AST::SingleASTNode (make_string (invoc_locus, str)); - auto str_tok = make_token (Token::make_string (invoc_locus, std::move (str))); - - return AST::Fragment ({node}, std::move (str_tok)); -} - -/* Expand builtin macro env!(), which inspects an environment variable at - compile time. */ -tl::optional -MacroBuiltin::env_handler (location_t invoc_locus, AST::MacroInvocData &invoc) -{ - auto invoc_token_tree = invoc.get_delim_tok_tree (); - MacroInvocLexer lex (invoc_token_tree.to_token_stream ()); - Parser parser (lex); - - auto last_token_id = macro_end_token (invoc_token_tree, parser); - std::unique_ptr error_expr = nullptr; - std::unique_ptr lit_expr = nullptr; - bool has_error = false; - - auto start = lex.get_offs (); - auto expanded_expr = try_expand_many_expr (parser, last_token_id, - invoc.get_expander (), has_error); - auto end = lex.get_offs (); - - auto tokens = lex.get_token_slice (start, end); - - if (has_error) - return AST::Fragment::create_error (); - - auto pending = check_for_eager_invocations (expanded_expr); - if (!pending.empty ()) - return make_eager_builtin_invocation (BuiltinMacro::Env, invoc_locus, - invoc_token_tree, - std::move (pending)); - - if (expanded_expr.size () < 1 || expanded_expr.size () > 2) - { - rust_error_at (invoc_locus, "env! 
takes 1 or 2 arguments"); - return AST::Fragment::create_error (); - } - if (expanded_expr.size () > 0) - { - if (!(lit_expr - = try_extract_string_literal_from_fragment (invoc_locus, - expanded_expr[0]))) - { - return AST::Fragment::create_error (); - } - } - if (expanded_expr.size () > 1) - { - if (!(error_expr - = try_extract_string_literal_from_fragment (invoc_locus, - expanded_expr[1]))) - { - return AST::Fragment::create_error (); - } - } - - parser.skip_token (last_token_id); - - auto env_value = getenv (lit_expr->as_string ().c_str ()); - - if (env_value == nullptr) - { - if (error_expr == nullptr) - rust_error_at (invoc_locus, "environment variable %qs not defined", - lit_expr->as_string ().c_str ()); - else - rust_error_at (invoc_locus, "%s", error_expr->as_string ().c_str ()); - return AST::Fragment::create_error (); - } - - auto node = AST::SingleASTNode (make_string (invoc_locus, env_value)); - auto tok - = make_token (Token::make_string (invoc_locus, std::move (env_value))); - - return AST::Fragment ({node}, std::move (tok)); -} - -tl::optional -MacroBuiltin::cfg_handler (location_t invoc_locus, AST::MacroInvocData &invoc) -{ - // only parse if not already parsed - if (!invoc.is_parsed ()) - { - std::unique_ptr converted_input ( - invoc.get_delim_tok_tree ().parse_to_meta_item ()); - - if (converted_input == nullptr) - { - rust_debug ("DEBUG: failed to parse macro to meta item"); - // TODO: do something now? is this an actual error? - } - else - { - std::vector> meta_items ( - std::move (converted_input->get_items ())); - invoc.set_meta_item_output (std::move (meta_items)); - } - } - - /* TODO: assuming that cfg! macros can only have one meta item inner, like cfg - * attributes */ - if (invoc.get_meta_items ().size () != 1) - return AST::Fragment::create_error (); - - bool result = invoc.get_meta_items ()[0]->check_cfg_predicate ( - Session::get_instance ()); - auto literal_exp = AST::SingleASTNode (std::unique_ptr ( - new AST::LiteralExpr (result ? 
"true" : "false", AST::Literal::BOOL, - PrimitiveCoreType::CORETYPE_BOOL, {}, invoc_locus))); - auto tok = make_token ( - Token::make (result ? TRUE_LITERAL : FALSE_LITERAL, invoc_locus)); - - return AST::Fragment ({literal_exp}, std::move (tok)); -} - -/* Expand builtin macro include!(), which includes a source file at the current - scope compile time. */ - -tl::optional -MacroBuiltin::include_handler (location_t invoc_locus, - AST::MacroInvocData &invoc) -{ - /* Get target filename from the macro invocation, which is treated as a path - relative to the include!-ing file (currently being compiled). */ - auto lit_expr - = parse_single_string_literal (BuiltinMacro::Include, - invoc.get_delim_tok_tree (), invoc_locus, - invoc.get_expander ()); - if (lit_expr == nullptr) - return AST::Fragment::create_error (); - - rust_assert (lit_expr->is_literal ()); - - std::string filename - = source_relative_path (lit_expr->as_string (), invoc_locus); - auto target_filename - = Rust::Session::get_instance ().include_extra_file (std::move (filename)); - - RAIIFile target_file (target_filename); - Linemap *linemap = Session::get_instance ().linemap; - - if (!target_file.ok ()) - { - rust_error_at (lit_expr->get_locus (), - "cannot open included file %qs: %m", target_filename); - return AST::Fragment::create_error (); - } - - rust_debug ("Attempting to parse included file %s", target_filename); - - Lexer lex (target_filename, std::move (target_file), linemap); - Parser parser (lex); - - auto parsed_items = parser.parse_items (); - bool has_error = !parser.get_errors ().empty (); - - for (const auto &error : parser.get_errors ()) - error.emit (); - - if (has_error) - { - // inform the user that the errors above are from a included file - rust_inform (invoc_locus, "included from here"); - return AST::Fragment::create_error (); - } - - std::vector nodes{}; - for (auto &item : parsed_items) - { - AST::SingleASTNode node (std::move (item)); - nodes.push_back (node); - } - - // FIXME: 
This returns an empty vector of tokens and works fine, but is that - // the expected behavior? `include` macros are a bit harder to reason about - // since they include tokens. Furthermore, our lexer has no easy way to return - // a slice of tokens like the MacroInvocLexer. So it gets even harder to - // extrac tokens from here. For now, let's keep it that way and see if it - // eventually breaks, but I don't expect it to cause many issues since the - // list of tokens is only used when a macro invocation mixes eager - // macro invocations and already expanded tokens. Think - // `concat!(a!(), 15, b!())`. We need to be able to expand a!(), expand b!(), - // and then insert the `15` token in between. In the case of `include!()`, we - // only have one argument. So it's either going to be a macro invocation or a - // string literal. - return AST::Fragment (nodes, std::vector> ()); -} - -tl::optional -MacroBuiltin::line_handler (location_t invoc_locus, AST::MacroInvocData &) -{ - auto current_line = LOCATION_LINE (invoc_locus); - - auto line_no = AST::SingleASTNode (std::unique_ptr ( - new AST::LiteralExpr (std::to_string (current_line), AST::Literal::INT, - PrimitiveCoreType::CORETYPE_U32, {}, invoc_locus))); - auto tok - = make_token (Token::make_int (invoc_locus, std::to_string (current_line))); - - return AST::Fragment ({line_no}, std::move (tok)); -} - -tl::optional -MacroBuiltin::stringify_handler (location_t invoc_locus, - AST::MacroInvocData &invoc) -{ - std::string content; - auto invoc_token_tree = invoc.get_delim_tok_tree (); - auto tokens = invoc_token_tree.to_token_stream (); - - // Tokens stream includes the first and last delimiter - // which we need to skip. - for (auto token = tokens.cbegin () + 1; token < tokens.cend () - 1; token++) - { - // Rust stringify format has no garantees but the reference compiler - // removes spaces before some tokens depending on the lexer's behavior, - // let's mimick some of those behaviors. 
- auto token_id = (*token)->get_id (); - if (token_id != RIGHT_PAREN && token_id != EXCLAM - && token != tokens.cbegin () + 1) - { - content.push_back (' '); - } - content += (*token)->as_string (); - } - - auto node = AST::SingleASTNode (make_string (invoc_locus, content)); - auto token - = make_token (Token::make_string (invoc_locus, std::move (content))); - return AST::Fragment ({node}, std::move (token)); -} - -struct FormatArgsInput -{ - std::string format_str; - AST::FormatArguments args; - // bool is_literal? -}; - -struct FormatArgsParseError -{ - enum class Kind - { - MissingArguments - } kind; -}; - -static tl::expected -format_args_parse_arguments (AST::MacroInvocData &invoc) -{ - MacroInvocLexer lex (invoc.get_delim_tok_tree ().to_token_stream ()); - Parser parser (lex); - - // TODO: check if EOF - return that format_args!() requires at least one - // argument - - auto args = AST::FormatArguments (); - auto last_token_id = macro_end_token (invoc.get_delim_tok_tree (), parser); - std::unique_ptr format_expr = nullptr; - - // TODO: Handle the case where we're not parsing a string literal (macro - // invocation for e.g.) - if (parser.peek_current_token ()->get_id () == STRING_LITERAL) - format_expr = parser.parse_literal_expr (); - - // TODO(Arthur): Clean this up - if we haven't parsed a string literal but a - // macro invocation, what do we do here? return a tl::unexpected? - auto format_str = static_cast (*format_expr) - .get_literal () - .as_string (); - - // TODO: Allow implicit captures ONLY if the the first arg is a string literal - // and not a macro invocation - - // TODO: How to consume all of the arguments until the delimiter? - - // TODO: What we then want to do is as follows: - // for each token, check if it is an identifier - // yes? is the next token an equal sign (=) - // yes? - // -> if that identifier is already present in our map, error - // out - // -> parse an expression, return a FormatArgument::Named - // no? 
- // -> if there have been named arguments before, error out - // (positional after named error) - // -> parse an expression, return a FormatArgument::Normal - while (parser.peek_current_token ()->get_id () != last_token_id) - { - parser.skip_token (COMMA); - - if (parser.peek_current_token ()->get_id () == IDENTIFIER - && parser.peek (1)->get_id () == EQUAL) - { - // FIXME: This is ugly - just add a parser.parse_identifier()? - auto ident_tok = parser.peek_current_token (); - auto ident = Identifier (ident_tok); - - parser.skip_token (IDENTIFIER); - parser.skip_token (EQUAL); - - auto expr = parser.parse_expr (); - - // TODO: Handle graciously - if (!expr) - rust_unreachable (); - - args.push (AST::FormatArgument::named (ident, std::move (expr))); - } - else - { - auto expr = parser.parse_expr (); - - // TODO: Handle graciously - if (!expr) - rust_unreachable (); - - args.push (AST::FormatArgument::normal (std::move (expr))); - } - // we need to skip commas, don't we? - } - - return FormatArgsInput{std::move (format_str), std::move (args)}; -} - -tl::optional -MacroBuiltin::format_args_handler (location_t invoc_locus, - AST::MacroInvocData &invoc, - AST::FormatArgs::Newline nl) -{ - auto input = format_args_parse_arguments (invoc); - - if (!input) - { - rust_error_at (invoc_locus, - "could not parse arguments to %"); - return tl::nullopt; - } - - // TODO(Arthur): We need to handle this - // // if it is not a literal, it's an eager macro invocation - return it - // if (!fmt_expr->is_literal ()) - // { - // auto token_tree = invoc.get_delim_tok_tree (); - // return AST::Fragment ({AST::SingleASTNode (std::move (fmt_expr))}, - // token_tree.to_token_stream ()); - // } - - // TODO(Arthur): Handle this as well - raw strings are special for the - // format_args parser auto fmt_str = static_cast - // (*fmt_arg.get ()); Switch on the format string to know if the string is raw - // or cooked switch (fmt_str.get_lit_type ()) - // { - // // case AST::Literal::RAW_STRING: - 
// case AST::Literal::STRING: - // break; - // case AST::Literal::CHAR: - // case AST::Literal::BYTE: - // case AST::Literal::BYTE_STRING: - // case AST::Literal::INT: - // case AST::Literal::FLOAT: - // case AST::Literal::BOOL: - // case AST::Literal::ERROR: - // rust_unreachable (); - // } - - bool append_newline = nl == AST::FormatArgs::Newline::Yes; - - auto fmt_str = std::move (input->format_str); - if (append_newline) - fmt_str += '\n'; - - auto pieces = Fmt::Pieces::collect (fmt_str, append_newline); - - // TODO: - // do the transformation into an AST::FormatArgs node - // return that - // expand it during lowering - - // TODO: we now need to take care of creating `unfinished_literal`? this is - // for creating the `template` - - auto fmt_args_node = AST::FormatArgs (invoc_locus, std::move (pieces), - std::move (input->args)); - - auto expanded - = Fmt::expand_format_args (fmt_args_node, - invoc.get_delim_tok_tree ().to_token_stream ()); - - if (!expanded.has_value ()) - return AST::Fragment::create_error (); - - return *expanded; - - // auto node = std::unique_ptr (fmt_args_node); - // auto single_node = AST::SingleASTNode (std::move (node)); - - // return AST::Fragment ({std::move (single_node)}, - // invoc.get_delim_tok_tree ().to_token_stream ()); -} - tl::optional MacroBuiltin::sorry (location_t invoc_locus, AST::MacroInvocData &invoc) { diff --git a/gcc/rust/expand/rust-macro-builtins.h b/gcc/rust/expand/rust-macro-builtins.h index 62961561716..ea1919ba619 100644 --- a/gcc/rust/expand/rust-macro-builtins.h +++ b/gcc/rust/expand/rust-macro-builtins.h @@ -31,9 +31,9 @@ namespace Rust { // transcriber and extra info if necessary // then make a global map -/** - * All builtin macros possible - */ +// +// All builtin macros possible +// enum class BuiltinMacro { Assert, @@ -79,42 +79,43 @@ enum class BuiltinMacro tl::optional builtin_macro_from_string (const std::string &identifier); -/** - * This class provides a list of builtin macros implemented by the 
compiler.
- * The functions defined are called "builtin transcribers" in that they replace
- * the transcribing part of a macro definition.
- *
- * Like regular macro transcribers, they are responsible for building and
- * returning an AST fragment: basically a vector of AST nodes put together.
- *
- * Unlike regular declarative macros where each match arm has its own associated
- * transcriber, builtin transcribers are responsible for handling all match arms
- * of the macro. This means that you should take extra care when implementing a
- * builtin containing multiple match arms: You will probably need to do some
- * lookahead in order to determine which match arm the user intended to use.
- *
- * An example of this is the `assert!()` macro:
- *
- * ```
- * macro_rules! assert {
- * ($cond:expr $(,)?) => {{ ... }};
- * ($cond : expr, $ ($arg : tt) +) = > {{ ... }};
- * }
- * ```
- *
- * If more tokens exist beyond the optional comma, they need to be handled as
- * a token-tree for a custom panic message.
- *
- * These builtin macros with empty transcribers are defined in the standard
- * library. They are marked with a special attribute, `#[rustc_builtin_macro]`.
- * When this attribute is present on a macro definition, the compiler should
- * look for an associated transcriber in the mappings. Meaning that you must
- * remember to insert your transcriber in the `builtin_macros` map of the
- *`Mappings`.
- *
- * This map is built as a static variable in the `insert_macro_def()` method
- * of the `Mappings` class.
- */
+//
+// This class provides a list of builtin macros implemented by the compiler.
+// The functions defined are called "builtin transcribers" in that they
+// replace the transcribing part of a macro definition.
+//
+// Like regular macro transcribers, they are responsible for building and
+// returning an AST fragment: basically a vector of AST nodes put together.
+//
+// Unlike regular declarative macros where each match arm has its own
+// associated transcriber, builtin transcribers are responsible for handling
+// all match arms of the macro. This means that you should take extra care
+// when implementing a builtin containing multiple match arms: You will
+// probably need to do some lookahead in order to determine which match arm
+// the user intended to use.
+//
+// An example of this is the `assert!()` macro:
+//
+// ```
+// macro_rules! assert {
+// ($cond:expr $(,)?) => {{ ... }};
+// ($cond : expr, $ ($arg : tt) +) = > {{ ... }};
+// }
+// ```
+//
+// If more tokens exist beyond the optional comma, they need to be handled as
+// a token-tree for a custom panic message.
+//
+// These builtin macros with empty transcribers are defined in the standard
+// library. They are marked with a special attribute,
+// `#[rustc_builtin_macro]`. When this attribute is present on a macro
+// definition, the compiler should look for an associated transcriber in the
+// mappings. Meaning that you must remember to insert your transcriber in the
+// `builtin_macros` map of the `Mappings`.
+//
+// This map is built as a static variable in the `insert_macro_def()` method
+// of the `Mappings` class.
+
 class MacroBuiltin
 {
 public: