From patchwork Thu Aug 1 14:56:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arthur Cohen
X-Patchwork-Id: 1967727
From: Arthur Cohen
To: gcc-patches@gcc.gnu.org
Cc: gcc-rust@gcc.gnu.org, Arthur Cohen
Subject: [PATCH 011/125] gccrs: libformat_parser: Add FFI safe interface
Date: Thu, 1 Aug 2024 16:56:07 +0200
Message-ID: <20240801145809.366388-13-arthur.cohen@embecosm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240801145809.366388-2-arthur.cohen@embecosm.com>
References: <20240801145809.366388-2-arthur.cohen@embecosm.com>

libgrust/ChangeLog:

	* libformat_parser/generic_format_parser/src/lib.rs: Add generic library.
	* libformat_parser/src/lib.rs: Add base for FFI interface.
---
 .../generic_format_parser/src/lib.rs          |   2 +-
 libgrust/libformat_parser/src/lib.rs          | 301 +++++++++++++++++-
 2 files changed, 298 insertions(+), 5 deletions(-)

diff --git a/libgrust/libformat_parser/generic_format_parser/src/lib.rs b/libgrust/libformat_parser/generic_format_parser/src/lib.rs
index f42c9d8dffb..87a20dc18c5 100644
--- a/libgrust/libformat_parser/generic_format_parser/src/lib.rs
+++ b/libgrust/libformat_parser/generic_format_parser/src/lib.rs
@@ -1099,4 +1099,4 @@ fn unescape_string(string: &str) -> Option {
 // rustc_index::static_assert_size!(Piece<'_>, 16);
 
 // #[cfg(test)]
-// mod tests;
\ No newline at end of file
+// mod tests;
diff --git a/libgrust/libformat_parser/src/lib.rs b/libgrust/libformat_parser/src/lib.rs
index e6dc16eeb49..49821e7cd2f 100644
--- a/libgrust/libformat_parser/src/lib.rs
+++ b/libgrust/libformat_parser/src/lib.rs
@@ -5,8 +5,298 @@
 
 use std::ffi::CStr;
 
-// TODO: Use rustc's version here #3
-use generic_format_parser::Piece;
+mod ffi {
+    use std::ops::Deref;
+
+    // Note: copied from rustc_span
+    /// Range inside of a `Span` used for diagnostics when we only have access to relative positions.
+    #[derive(Copy, Clone, PartialEq, Eq, Debug)]
+    #[repr(C)]
+    pub struct InnerSpan {
+        pub start: usize,
+        pub end: usize,
+    }
+
+    // impl InnerSpan {
+    //     pub fn new(start: usize, end: usize) -> InnerSpan {
+    //         InnerSpan { start, end }
+    //     }
+    // }
+
+    /// The location and before/after width of a character whose width has changed from its source code
+    /// representation
+    #[derive(Copy, Clone, PartialEq, Eq)]
+    #[repr(C)]
+    pub struct InnerWidthMapping {
+        /// Index of the character in the source
+        pub position: usize,
+        /// The inner width in characters
+        pub before: usize,
+        /// The transformed width in characters
+        pub after: usize,
+    }
+
+    // impl InnerWidthMapping {
+    //     pub fn new(position: usize, before: usize, after: usize) -> InnerWidthMapping {
+    //         InnerWidthMapping {
+    //             position,
+    //             before,
+    //             after,
+    //         }
+    //     }
+    // }
+
+    /// Whether the input string is a literal. If yes, it contains the inner width mappings.
+    #[derive(Clone, PartialEq, Eq)]
+    #[repr(C)]
+    enum InputStringKind {
+        NotALiteral,
+        Literal {
+            width_mappings: Vec<InnerWidthMapping>,
+        },
+    }
+
+    /// The type of format string that we are parsing.
+    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
+    #[repr(C)]
+    pub enum ParseMode {
+        /// A normal format string as per `format_args!`.
+        Format,
+        /// An inline assembly template string for `asm!`.
+        InlineAsm,
+    }
+
+    #[derive(Copy, Clone)]
+    #[repr(C)]
+    struct InnerOffset(usize);
+
+    /// A piece is a portion of the format string which represents the next part
+    /// to emit. These are emitted as a stream by the `Parser` class.
+    #[derive(Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Piece<'a> {
+        /// A literal string which should directly be emitted
+        String(&'a str),
+        /// This describes that formatting should process the next argument (as
+        /// specified inside) for emission.
+        NextArgument(Box<Argument<'a>>),
+    }
+
+    /// Representation of an argument specification.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub struct Argument<'a> {
+        /// Where to find this argument
+        pub position: Position<'a>,
+        /// The span of the position indicator. Includes any whitespace in implicit
+        /// positions (`{  }`).
+        pub position_span: InnerSpan,
+        /// How to format the argument
+        pub format: FormatSpec<'a>,
+    }
+
+    /// Specification for the formatting of an argument in the format string.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub struct FormatSpec<'a> {
+        /// Optionally specified character to fill alignment with.
+        pub fill: Option<char>,
+        /// Span of the optionally specified fill character.
+        pub fill_span: Option<InnerSpan>,
+        /// Optionally specified alignment.
+        pub align: Alignment,
+        /// The `+` or `-` flag.
+        pub sign: Option<Sign>,
+        /// The `#` flag.
+        pub alternate: bool,
+        /// The `0` flag.
+        pub zero_pad: bool,
+        /// The `x` or `X` flag. (Only for `Debug`.)
+        pub debug_hex: Option<DebugHex>,
+        /// The integer precision to use.
+        pub precision: Count<'a>,
+        /// The span of the precision formatting flag (for diagnostics).
+        pub precision_span: Option<InnerSpan>,
+        /// The string width requested for the resulting format.
+        pub width: Count<'a>,
+        /// The span of the width formatting flag (for diagnostics).
+        pub width_span: Option<InnerSpan>,
+        /// The descriptor string representing the name of the format desired for
+        /// this argument, this can be empty or any number of characters, although
+        /// it is required to be one word.
+        pub ty: &'a str,
+        /// The span of the descriptor string (for diagnostics).
+        pub ty_span: Option<InnerSpan>,
+    }
+
+    /// Enum describing where an argument for a format can be located.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Position<'a> {
+        /// The argument is implied to be located at an index
+        ArgumentImplicitlyIs(usize),
+        /// The argument is located at a specific index given in the format,
+        ArgumentIs(usize),
+        /// The argument has a name.
+        ArgumentNamed(&'a str),
+    }
+
+    /// Enum of alignments which are supported.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Alignment {
+        /// The value will be aligned to the left.
+        AlignLeft,
+        /// The value will be aligned to the right.
+        AlignRight,
+        /// The value will be aligned in the center.
+        AlignCenter,
+        /// The value will take on a default alignment.
+        AlignUnknown,
+    }
+
+    /// Enum for the sign flags.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Sign {
+        /// The `+` flag.
+        Plus,
+        /// The `-` flag.
+        Minus,
+    }
+
+    /// Enum for the debug hex flags.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum DebugHex {
+        /// The `x` flag in `{:x?}`.
+        Lower,
+        /// The `X` flag in `{:X?}`.
+        Upper,
+    }
+
+    /// A count is used for the precision and width parameters of an integer, and
+    /// can reference either an argument or a literal integer.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Count<'a> {
+        /// The count is specified explicitly.
+        CountIs(usize),
+        /// The count is specified by the argument with the given name.
+        CountIsName(&'a str, InnerSpan),
+        /// The count is specified by the argument at the given index.
+        CountIsParam(usize),
+        /// The count is specified by a star (like in `{:.*}`) that refers to the argument at the given index.
+        CountIsStar(usize),
+        /// The count is implied and cannot be explicitly specified.
+        CountImplied,
+    }
+
+    impl<'a> From<generic_format_parser::Piece<'a>> for Piece<'a> {
+        fn from(old: generic_format_parser::Piece<'a>) -> Self {
+            match old {
+                generic_format_parser::Piece::String(x) => Piece::String(x),
+                generic_format_parser::Piece::NextArgument(x) => {
+                    Piece::NextArgument(Box::new(Into::<Argument>::into(*x)))
+                }
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::Argument<'a>> for Argument<'a> {
+        fn from(old: generic_format_parser::Argument<'a>) -> Self {
+            Argument {
+                position: old.position.into(),
+                position_span: old.position_span.into(),
+                format: old.format.into(),
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::Position<'a>> for Position<'a> {
+        fn from(old: generic_format_parser::Position<'a>) -> Self {
+            match old {
+                generic_format_parser::Position::ArgumentImplicitlyIs(x) => {
+                    Position::ArgumentImplicitlyIs(x.into())
+                }
+                generic_format_parser::Position::ArgumentIs(x) => Position::ArgumentIs(x.into()),
+                generic_format_parser::Position::ArgumentNamed(x) => {
+                    Position::ArgumentNamed(x.into())
+                }
+            }
+        }
+    }
+
+    impl From<generic_format_parser::InnerSpan> for InnerSpan {
+        fn from(old: generic_format_parser::InnerSpan) -> Self {
+            InnerSpan {
+                start: old.start,
+                end: old.end,
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::FormatSpec<'a>> for FormatSpec<'a> {
+        fn from(old: generic_format_parser::FormatSpec<'a>) -> Self {
+            FormatSpec {
+                fill: old.fill,
+                fill_span: old.fill_span.map(Into::into),
+                align: old.align.into(),
+                sign: old.sign.map(Into::into),
+                alternate: old.alternate,
+                zero_pad: old.zero_pad,
+                debug_hex: old.debug_hex.map(Into::into),
+                precision: old.precision.into(),
+                precision_span: old.precision_span.map(Into::into),
+                width: old.width.into(),
+                width_span: old.width_span.map(Into::into),
+                ty: old.ty,
+                ty_span: old.ty_span.map(Into::into),
+            }
+        }
+    }
+
+    impl From<generic_format_parser::DebugHex> for DebugHex {
+        fn from(old: generic_format_parser::DebugHex) -> Self {
+            match old {
+                generic_format_parser::DebugHex::Lower => DebugHex::Lower,
+                generic_format_parser::DebugHex::Upper => DebugHex::Upper,
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::Count<'a>> for Count<'a> {
+        fn from(old: generic_format_parser::Count<'a>) -> Self {
+            match old {
+                generic_format_parser::Count::CountIs(x) => Count::CountIs(x),
+                generic_format_parser::Count::CountIsName(x, y) => Count::CountIsName(x, y.into()),
+                generic_format_parser::Count::CountIsParam(x) => Count::CountIsParam(x),
+                generic_format_parser::Count::CountIsStar(x) => Count::CountIsStar(x),
+                generic_format_parser::Count::CountImplied => Count::CountImplied,
+            }
+        }
+    }
+
+    impl From<generic_format_parser::Sign> for Sign {
+        fn from(old: generic_format_parser::Sign) -> Self {
+            match old {
+                generic_format_parser::Sign::Plus => Sign::Plus,
+                generic_format_parser::Sign::Minus => Sign::Minus,
+            }
+        }
+    }
+
+    impl From<generic_format_parser::Alignment> for Alignment {
+        fn from(old: generic_format_parser::Alignment) -> Self {
+            match old {
+                generic_format_parser::Alignment::AlignLeft => Alignment::AlignLeft,
+                generic_format_parser::Alignment::AlignRight => Alignment::AlignRight,
+                generic_format_parser::Alignment::AlignCenter => Alignment::AlignCenter,
+                generic_format_parser::Alignment::AlignUnknown => Alignment::AlignUnknown,
+            }
+        }
+    }
+}
 
 // FIXME: Rename?
 pub mod rust {
@@ -22,7 +312,7 @@ pub mod rust {
 
 #[repr(C)]
 pub struct PieceSlice {
-    base_ptr: *const Piece<'static /* FIXME: That's wrong */>,
+    base_ptr: *const ffi::Piece<'static /* FIXME: That's wrong */>,
     len: usize,
 }
 
@@ -32,7 +322,10 @@ pub extern "C" fn collect_pieces(input: *const libc::c_char) -> PieceSlice {
     let str = unsafe { CStr::from_ptr(input) };
 
     // FIXME: No unwrap
-    let pieces = rust::collect_pieces(str.to_str().unwrap());
+    let pieces: Vec<ffi::Piece<'_>> = rust::collect_pieces(str.to_str().unwrap())
+        .into_iter()
+        .map(Into::into)
+        .collect();
 
     PieceSlice {
         base_ptr: pieces.as_ptr(),