From patchwork Thu Aug  1 14:56:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arthur Cohen <arthur.cohen@embecosm.com>
X-Patchwork-Id: 1967727
From: Arthur Cohen <arthur.cohen@embecosm.com>
To: gcc-patches@gcc.gnu.org
Cc: gcc-rust@gcc.gnu.org,
	Arthur Cohen <arthur.cohen@embecosm.com>
Subject: [PATCH 011/125] gccrs: libformat_parser: Add FFI safe interface
Date: Thu,  1 Aug 2024 16:56:07 +0200
Message-ID: <20240801145809.366388-13-arthur.cohen@embecosm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240801145809.366388-2-arthur.cohen@embecosm.com>
References: <20240801145809.366388-2-arthur.cohen@embecosm.com>
MIME-Version: 1.0

libgrust/ChangeLog:

	* libformat_parser/generic_format_parser/src/lib.rs: Add missing
	newline at end of file.
	* libformat_parser/src/lib.rs: Add base for FFI interface.
---
 .../generic_format_parser/src/lib.rs          |   2 +-
 libgrust/libformat_parser/src/lib.rs          | 301 +++++++++++++++++-
 2 files changed, 298 insertions(+), 5 deletions(-)
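
Note on the pointer handed out by collect_pieces: the hunk below fills
PieceSlice from `pieces.as_ptr()` on a local Vec. If that Vec is dropped
when the function returns, the C++ side would be left with a dangling
pointer. What follows is only a minimal sketch of one way to keep the
buffer alive and free it later, not what this patch does. Every name in
it (ParserAlign, FfiAlign, AlignSlice, into_ffi_slice, free_ffi_slice)
is hypothetical, and the extra `cap` field is an assumption that
PieceSlice does not have.

```rust
use std::mem::ManuallyDrop;

// Hypothetical stand-in for a parser-side type (not part of the patch).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ParserAlign {
    Left,
    Right,
}

// FFI-facing mirror of the type above, with a C-compatible layout.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FfiAlign {
    Left,
    Right,
}

impl From<ParserAlign> for FfiAlign {
    fn from(old: ParserAlign) -> Self {
        match old {
            ParserAlign::Left => FfiAlign::Left,
            ParserAlign::Right => FfiAlign::Right,
        }
    }
}

// Pointer, length and capacity: enough to rebuild the Vec and free it.
#[repr(C)]
pub struct AlignSlice {
    pub base_ptr: *const FfiAlign,
    pub len: usize,
    pub cap: usize,
}

pub fn into_ffi_slice(items: Vec<ParserAlign>) -> AlignSlice {
    let converted: Vec<FfiAlign> = items.into_iter().map(Into::into).collect();
    // ManuallyDrop stops the Vec destructor from running at the end of this
    // function, so the heap buffer outlives the call.
    let mut converted = ManuallyDrop::new(converted);
    AlignSlice {
        base_ptr: converted.as_mut_ptr() as *const FfiAlign,
        len: converted.len(),
        cap: converted.capacity(),
    }
}

/// # Safety
/// `slice` must have been returned by `into_ffi_slice`, and neither it nor
/// any copy of its pointer may be used after this call.
pub unsafe fn free_ffi_slice(slice: AlignSlice) {
    // Rebuild the Vec from its raw parts so that dropping it releases the buffer.
    let rebuilt =
        unsafe { Vec::from_raw_parts(slice.base_ptr as *mut FfiAlign, slice.len, slice.cap) };
    drop(rebuilt);
}
```

With this shape the caller owns the buffer until it passes the slice
back to free_ffi_slice. Leaking the buffer on purpose is a simpler
alternative if the pieces are meant to live for the whole compilation.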

diff --git a/libgrust/libformat_parser/generic_format_parser/src/lib.rs b/libgrust/libformat_parser/generic_format_parser/src/lib.rs
index f42c9d8dffb..87a20dc18c5 100644
--- a/libgrust/libformat_parser/generic_format_parser/src/lib.rs
+++ b/libgrust/libformat_parser/generic_format_parser/src/lib.rs
@@ -1099,4 +1099,4 @@ fn unescape_string(string: &str) -> Option<string::String> {
 // rustc_index::static_assert_size!(Piece<'_>, 16);
 
 // #[cfg(test)]
-// mod tests;
\ No newline at end of file
+// mod tests;
diff --git a/libgrust/libformat_parser/src/lib.rs b/libgrust/libformat_parser/src/lib.rs
index e6dc16eeb49..49821e7cd2f 100644
--- a/libgrust/libformat_parser/src/lib.rs
+++ b/libgrust/libformat_parser/src/lib.rs
@@ -5,8 +5,298 @@
 
 use std::ffi::CStr;
 
-// TODO: Use rustc's version here #3
-use generic_format_parser::Piece;
+mod ffi {
+    use std::ops::Deref;
+
+    // Note: copied from rustc_span
+    /// Range inside of a `Span` used for diagnostics when we only have access to relative positions.
+    #[derive(Copy, Clone, PartialEq, Eq, Debug)]
+    #[repr(C)]
+    pub struct InnerSpan {
+        pub start: usize,
+        pub end: usize,
+    }
+
+    // impl InnerSpan {
+    //     pub fn new(start: usize, end: usize) -> InnerSpan {
+    //         InnerSpan { start, end }
+    //     }
+    // }
+
+    /// The location and before/after width of a character whose width has changed from its source code
+    /// representation.
+    #[derive(Copy, Clone, PartialEq, Eq)]
+    #[repr(C)]
+    pub struct InnerWidthMapping {
+        /// Index of the character in the source
+        pub position: usize,
+        /// The inner width in characters
+        pub before: usize,
+        /// The transformed width in characters
+        pub after: usize,
+    }
+
+    // impl InnerWidthMapping {
+    //     pub fn new(position: usize, before: usize, after: usize) -> InnerWidthMapping {
+    //         InnerWidthMapping {
+    //             position,
+    //             before,
+    //             after,
+    //         }
+    //     }
+    // }
+
+    /// Whether the input string is a literal. If yes, it contains the inner width mappings.
+    #[derive(Clone, PartialEq, Eq)]
+    #[repr(C)]
+    enum InputStringKind {
+        NotALiteral,
+        Literal {
+            width_mappings: Vec<InnerWidthMapping>,
+        },
+    }
+
+    /// The type of format string that we are parsing.
+    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
+    #[repr(C)]
+    pub enum ParseMode {
+        /// A normal format string as per `format_args!`.
+        Format,
+        /// An inline assembly template string for `asm!`.
+        InlineAsm,
+    }
+
+    #[derive(Copy, Clone)]
+    #[repr(C)]
+    struct InnerOffset(usize);
+
+    /// A piece is a portion of the format string which represents the next part
+    /// to emit. These are emitted as a stream by the `Parser` class.
+    #[derive(Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Piece<'a> {
+        /// A literal string which should directly be emitted
+        String(&'a str),
+        /// This describes that formatting should process the next argument (as
+        /// specified inside) for emission.
+        NextArgument(Box<Argument<'a>>),
+    }
+
+    /// Representation of an argument specification.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub struct Argument<'a> {
+        /// Where to find this argument
+        pub position: Position<'a>,
+        /// The span of the position indicator. Includes any whitespace in implicit
+        /// positions (`{  }`).
+        pub position_span: InnerSpan,
+        /// How to format the argument
+        pub format: FormatSpec<'a>,
+    }
+
+    /// Specification for the formatting of an argument in the format string.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub struct FormatSpec<'a> {
+        /// Optionally specified character to fill alignment with.
+        pub fill: Option<char>,
+        /// Span of the optionally specified fill character.
+        pub fill_span: Option<InnerSpan>,
+        /// Optionally specified alignment.
+        pub align: Alignment,
+        /// The `+` or `-` flag.
+        pub sign: Option<Sign>,
+        /// The `#` flag.
+        pub alternate: bool,
+        /// The `0` flag.
+        pub zero_pad: bool,
+        /// The `x` or `X` flag. (Only for `Debug`.)
+        pub debug_hex: Option<DebugHex>,
+        /// The integer precision to use.
+        pub precision: Count<'a>,
+        /// The span of the precision formatting flag (for diagnostics).
+        pub precision_span: Option<InnerSpan>,
+        /// The string width requested for the resulting format.
+        pub width: Count<'a>,
+        /// The span of the width formatting flag (for diagnostics).
+        pub width_span: Option<InnerSpan>,
+        /// The descriptor string representing the name of the format desired for
+        /// this argument. This can be empty or any number of characters, although
+        /// it is required to be one word.
+        pub ty: &'a str,
+        /// The span of the descriptor string (for diagnostics).
+        pub ty_span: Option<InnerSpan>,
+    }
+
+    /// Enum describing where an argument for a format can be located.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Position<'a> {
+        /// The argument is implied to be located at an index
+        ArgumentImplicitlyIs(usize),
+        /// The argument is located at a specific index given in the format.
+        ArgumentIs(usize),
+        /// The argument has a name.
+        ArgumentNamed(&'a str),
+    }
+
+    /// Enum of alignments which are supported.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Alignment {
+        /// The value will be aligned to the left.
+        AlignLeft,
+        /// The value will be aligned to the right.
+        AlignRight,
+        /// The value will be aligned in the center.
+        AlignCenter,
+        /// The value will take on a default alignment.
+        AlignUnknown,
+    }
+
+    /// Enum for the sign flags.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Sign {
+        /// The `+` flag.
+        Plus,
+        /// The `-` flag.
+        Minus,
+    }
+
+    /// Enum for the debug hex flags.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum DebugHex {
+        /// The `x` flag in `{:x?}`.
+        Lower,
+        /// The `X` flag in `{:X?}`.
+        Upper,
+    }
+
+    /// A count is used for the precision and width parameters of an integer, and
+    /// can reference either an argument or a literal integer.
+    #[derive(Copy, Clone, Debug, PartialEq)]
+    #[repr(C)]
+    pub enum Count<'a> {
+        /// The count is specified explicitly.
+        CountIs(usize),
+        /// The count is specified by the argument with the given name.
+        CountIsName(&'a str, InnerSpan),
+        /// The count is specified by the argument at the given index.
+        CountIsParam(usize),
+        /// The count is specified by a star (like in `{:.*}`) that refers to the argument at the given index.
+        CountIsStar(usize),
+        /// The count is implied and cannot be explicitly specified.
+        CountImplied,
+    }
+
+    impl<'a> From<generic_format_parser::Piece<'a>> for Piece<'a> {
+        fn from(old: generic_format_parser::Piece<'a>) -> Self {
+            match old {
+                generic_format_parser::Piece::String(x) => Piece::String(x),
+                generic_format_parser::Piece::NextArgument(x) => {
+                    Piece::NextArgument(Box::new(Into::<Argument>::into(*x)))
+                }
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::Argument<'a>> for Argument<'a> {
+        fn from(old: generic_format_parser::Argument<'a>) -> Self {
+            Argument {
+                position: old.position.into(),
+                position_span: old.position_span.into(),
+                format: old.format.into(),
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::Position<'a>> for Position<'a> {
+        fn from(old: generic_format_parser::Position<'a>) -> Self {
+            match old {
+                generic_format_parser::Position::ArgumentImplicitlyIs(x) => {
+                    Position::ArgumentImplicitlyIs(x.into())
+                }
+                generic_format_parser::Position::ArgumentIs(x) => Position::ArgumentIs(x.into()),
+                generic_format_parser::Position::ArgumentNamed(x) => {
+                    Position::ArgumentNamed(x.into())
+                }
+            }
+        }
+    }
+
+    impl From<generic_format_parser::InnerSpan> for InnerSpan {
+        fn from(old: generic_format_parser::InnerSpan) -> Self {
+            InnerSpan {
+                start: old.start,
+                end: old.end,
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::FormatSpec<'a>> for FormatSpec<'a> {
+        fn from(old: generic_format_parser::FormatSpec<'a>) -> Self {
+            FormatSpec {
+                fill: old.fill,
+                fill_span: old.fill_span.map(Into::into),
+                align: old.align.into(),
+                sign: old.sign.map(Into::into),
+                alternate: old.alternate,
+                zero_pad: old.zero_pad,
+                debug_hex: old.debug_hex.map(Into::into),
+                precision: old.precision.into(),
+                precision_span: old.precision_span.map(Into::into),
+                width: old.width.into(),
+                width_span: old.width_span.map(Into::into),
+                ty: old.ty,
+                ty_span: old.ty_span.map(Into::into),
+            }
+        }
+    }
+
+    impl From<generic_format_parser::DebugHex> for DebugHex {
+        fn from(old: generic_format_parser::DebugHex) -> Self {
+            match old {
+                generic_format_parser::DebugHex::Lower => DebugHex::Lower,
+                generic_format_parser::DebugHex::Upper => DebugHex::Upper,
+            }
+        }
+    }
+
+    impl<'a> From<generic_format_parser::Count<'a>> for Count<'a> {
+        fn from(old: generic_format_parser::Count<'a>) -> Self {
+            match old {
+                generic_format_parser::Count::CountIs(x) => Count::CountIs(x),
+                generic_format_parser::Count::CountIsName(x, y) => Count::CountIsName(x, y.into()),
+                generic_format_parser::Count::CountIsParam(x) => Count::CountIsParam(x),
+                generic_format_parser::Count::CountIsStar(x) => Count::CountIsStar(x),
+                generic_format_parser::Count::CountImplied => Count::CountImplied,
+            }
+        }
+    }
+
+    impl From<generic_format_parser::Sign> for Sign {
+        fn from(old: generic_format_parser::Sign) -> Self {
+            match old {
+                generic_format_parser::Sign::Plus => Sign::Plus,
+                generic_format_parser::Sign::Minus => Sign::Minus,
+            }
+        }
+    }
+
+    impl From<generic_format_parser::Alignment> for Alignment {
+        fn from(old: generic_format_parser::Alignment) -> Self {
+            match old {
+                generic_format_parser::Alignment::AlignLeft => Alignment::AlignLeft,
+                generic_format_parser::Alignment::AlignRight => Alignment::AlignRight,
+                generic_format_parser::Alignment::AlignCenter => Alignment::AlignCenter,
+                generic_format_parser::Alignment::AlignUnknown => Alignment::AlignUnknown,
+            }
+        }
+    }
+}
 
 // FIXME: Rename?
 pub mod rust {
@@ -22,7 +312,7 @@ pub mod rust {
 
 #[repr(C)]
 pub struct PieceSlice {
-    base_ptr: *const Piece<'static /* FIXME: That's wrong */>,
+    base_ptr: *const ffi::Piece<'static /* FIXME: That's wrong */>,
     len: usize,
 }
 
@@ -32,7 +322,10 @@ pub extern "C" fn collect_pieces(input: *const libc::c_char) -> PieceSlice {
     let str = unsafe { CStr::from_ptr(input) };
 
     // FIXME: No unwrap
-    let pieces = rust::collect_pieces(str.to_str().unwrap());
+    let pieces: Vec<ffi::Piece<'_>> = rust::collect_pieces(str.to_str().unwrap())
+        .into_iter()
+        .map(Into::into)
+        .collect();
 
     PieceSlice {
         base_ptr: pieces.as_ptr(),