From patchwork Thu Aug 1 14:56:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arthur Cohen X-Patchwork-Id: 1967780 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=embecosm.com header.i=@embecosm.com header.a=rsa-sha256 header.s=google header.b=TAyhFh9N; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZXkf6CPnz1ybX for ; Fri, 2 Aug 2024 01:22:02 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 167453860763 for ; Thu, 1 Aug 2024 15:22:01 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lf1-x131.google.com (mail-lf1-x131.google.com [IPv6:2a00:1450:4864:20::131]) by sourceware.org (Postfix) with ESMTPS id B64003860C39 for ; Thu, 1 Aug 2024 14:59:11 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B64003860C39 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=embecosm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=embecosm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B64003860C39 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::131 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722524380; cv=none; b=o66TdOiQrH+ifvvJ/zKS0rSzEeZn5o1tkMpqsCp1BF/++F+ZrJktSRtdKTRA16ueBqwrwjIfF+LGWbuppcPuwcSScX9Kn7IYhEcL2vEFkb9IrRp9cGbSB8QYtxuYZHD9iteSi7IIdQvTpxYgGvuLiGaHOtlcB+/Xbo+JweBJO88= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722524380; c=relaxed/simple; bh=Cw5Weu5r1eHAOv352HBM0A/aXdPmRWqYRznbeAEb9gE=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=IzhCWZ5nUq5Rd3H7tmtCFdBwMMz4Fz/SX3icsvQg0nsNv0Ez2T22NZRNCDIkfgH0UxEbmCZZzMI8+/pPo0DFAOjFCCxXjV5OTXpRGGKhIvUO+0Y1aGausHj8Xr+N6sveusRX+z9CUnEEgI1bn3Yy+NGblG2096M3oWeeX+lhbZU= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x131.google.com with SMTP id 2adb3069b0e04-52f00ad303aso10944006e87.2 for ; Thu, 01 Aug 2024 07:59:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=embecosm.com; s=google; t=1722524349; x=1723129149; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2vfUjCg2Ha6/nTod4LN4vRVMWkPVpSihuCwZQh8ZcDg=; b=TAyhFh9NjHvhztjIel6WeOwMqKccVcAG1GziElf14VVRiAhfElRvmzvY7MzAvy7KJr PGvBl+8jbqASecBGFrcLnIeFTOqCDw2xulTICXImWFdfmmUKHHzG3aTWcWpSpEx58La2 hrbgwvaAiRiS9SI2vj7VOQDuKaALLVE2G3K5cKdwHKZrmspW1MBZOWM+wTp+qNS4Wlm7 bAmT02V0yJ1/4twgA9V+RFkgT4CWJHs6aA+kizmuR+3KdSD+AU4OGGLZEgTshtWswLW0 vO7iaxQh5S6899Jnn9HBrV6aJTHxQIx97Lt/lcWTvGpDefbWt8eBklMothZwp54Q/2VY CNpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722524349; x=1723129149; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2vfUjCg2Ha6/nTod4LN4vRVMWkPVpSihuCwZQh8ZcDg=; b=PCJrHGZv0Qv3SbxcaFgjl2CCd9oMLa1VCbavmsWY8SJGTPcy7EmqsbkOumTg276wQn zdNr4P2ZEnNbF6c5jo2HAbx9EksXme9njngyBcDlUB7RevbS6qlVj96Cj6Kg1r+g3GVL fbDUMTnlxbRpfGE/wAlwl7nPWIk+manqmwv3P/+mceJtl/MEHUhk7NfbH+QVb4EML0TV nCGmEWV3DEzur1CbSqXQZNfHBCqdZtkF2X+noHp+LsIS66iSsx5iPOPoSI3At2EIk/gY gBPqTZY3frElFH8iTkvxTG6FoxyiUoVLCpVejflRI5Wfjqb4NJjYF8h9BllrKSUESQQT cFCA== X-Gm-Message-State: AOJu0Ywq+xH7Tu+AaTcBEKshK8qqPizWX7k0sv3b2VQtCUpF+m8AEYTm NGGjdHe+aEKW9Q+qfM6HbnEMElqo2qrN4yWrAdvcJMc32e60yvawSzTx5fPJt1NtHWNk/voXwEt dd+cM X-Google-Smtp-Source: AGHT+IGpHD5hqaPsoFd2M9BEMiZ/8llcaSpvPH7TDEgBbrMyz04Ny+mi2qAa10u8wg7dDvotxNPdXw== X-Received: by 2002:a05:6512:ac3:b0:52e:999b:7c01 with SMTP id 2adb3069b0e04-530bb3d54damr82439e87.48.1722524349238; Thu, 01 Aug 2024 07:59:09 -0700 (PDT) Received: from platypus.lan ([2a04:cec2:9:dc84:3622:6733:ff49:ee91]) by smtp.gmail.com with ESMTPSA id 4fb4d7f45d1cf-5ac63590592sm10252456a12.25.2024.08.01.07.59.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Aug 2024 07:59:08 -0700 (PDT) From: Arthur Cohen To: gcc-patches@gcc.gnu.org Cc: gcc-rust@gcc.gnu.org, Arthur Cohen Subject: [PATCH 048/125] gccrs: format-args: Start storing string in Rust memory Date: Thu, 1 Aug 2024 16:56:44 +0200 Message-ID: <20240801145809.366388-50-arthur.cohen@embecosm.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240801145809.366388-2-arthur.cohen@embecosm.com> References: <20240801145809.366388-2-arthur.cohen@embecosm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-14.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org gcc/rust/ChangeLog: * ast/rust-fmt.cc (ffi::RustHamster::to_string): New. (Pieces::collect): Adapt to use new handle API. (Pieces::~Pieces): Likewise. (Pieces::Pieces): Likewise. (Pieces::operator=): Likewise. * ast/rust-fmt.h (struct RustString): Add members. (struct FormatArgsHandle): New. (clone_pieces): Adapt for new FFI API. (destroy_pieces): Likewise. (struct Pieces): Store new FormatArgsHandle type. * expand/rust-expand-format-args.cc (expand_format_args): Use proper namespace. * resolve/rust-ast-resolve-base.cc (ResolverBase::visit): FormatArgs nodes are already resolved, so do nothing. libgrust/ChangeLog: * libformat_parser/src/lib.rs: Use new Handle struct and expose it. --- gcc/rust/ast/rust-fmt.cc | 32 ++++--- gcc/rust/ast/rust-fmt.h | 46 ++++++--- gcc/rust/expand/rust-expand-format-args.cc | 4 +- gcc/rust/resolve/rust-ast-resolve-base.cc | 5 +- libgrust/libformat_parser/src/lib.rs | 105 ++++++++++++++++----- 5 files changed, 133 insertions(+), 59 deletions(-) diff --git a/gcc/rust/ast/rust-fmt.cc b/gcc/rust/ast/rust-fmt.cc index b82e089fc41..fda02e530fd 100644 --- a/gcc/rust/ast/rust-fmt.cc +++ b/gcc/rust/ast/rust-fmt.cc @@ -22,44 +22,48 @@ namespace Rust { namespace Fmt { +std::string +ffi::RustHamster::to_string () const +{ + return std::string (ptr, len); +} + Pieces -Pieces::collect (std::string &&to_parse, bool append_newline) +Pieces::collect (const std::string &to_parse, bool append_newline) { - auto piece_slice = collect_pieces (to_parse.c_str (), append_newline); + auto handle = ffi::collect_pieces (to_parse.c_str (), append_newline); // this performs multiple copies, can we avoid them maybe? // TODO: Instead of just creating a vec of, basically, `ffi::Piece`s, we // should transform them into the proper C++ type which we can work with. so // transform all the strings into C++ strings? all the Option into // tl::optional? - auto pieces = std::vector (piece_slice.base_ptr, - piece_slice.base_ptr + piece_slice.len); + auto pieces_vector = std::vector (handle.piece_slice.base_ptr, + handle.piece_slice.base_ptr + + handle.piece_slice.len); - return Pieces (std::move (pieces), piece_slice, std::move (to_parse)); + return Pieces (handle, std::move (pieces_vector)); } -Pieces::~Pieces () { destroy_pieces (slice); } +Pieces::~Pieces () { ffi::destroy_pieces (handle); } -Pieces::Pieces (const Pieces &other) - : pieces_vector (other.pieces_vector), to_parse (other.to_parse) +Pieces::Pieces (const Pieces &other) : pieces_vector (other.pieces_vector) { - slice = clone_pieces (other.slice.base_ptr, other.slice.len, other.slice.cap); + handle = ffi::clone_pieces (other.handle); } Pieces & Pieces::operator= (const Pieces &other) { - slice = clone_pieces (other.slice.base_ptr, other.slice.len, other.slice.cap); - to_parse = other.to_parse; + handle = ffi::clone_pieces (other.handle); + pieces_vector = other.pieces_vector; return *this; } Pieces::Pieces (Pieces &&other) : pieces_vector (std::move (other.pieces_vector)), - slice ( - clone_pieces (other.slice.base_ptr, other.slice.len, other.slice.cap)), - to_parse (std::move (other.to_parse)) + handle (clone_pieces (other.handle)) {} } // namespace Fmt diff --git a/gcc/rust/ast/rust-fmt.h b/gcc/rust/ast/rust-fmt.h index ba412f9958c..6a0c116b841 100644 --- a/gcc/rust/ast/rust-fmt.h +++ b/gcc/rust/ast/rust-fmt.h @@ -20,15 +20,21 @@ #define RUST_FMT_H #include "rust-system.h" +#include // FIXME: How to encode Option? namespace Rust { namespace Fmt { +namespace ffi { + struct RustHamster { - // hehe + const char *ptr; + size_t len; + + std::string to_string () const; }; /// Enum of alignments which are supported. @@ -240,21 +246,36 @@ struct PieceSlice size_t cap; }; +struct RustString +{ + const unsigned char *ptr; + size_t len; + size_t cap; +}; + +struct FormatArgsHandle +{ + PieceSlice piece_slice; + RustString rust_string; +}; + extern "C" { -PieceSlice +FormatArgsHandle collect_pieces (const char *input, bool append_newline); -PieceSlice -clone_pieces (const Piece *base_ptr, size_t len, size_t cap); +FormatArgsHandle +clone_pieces (const FormatArgsHandle &); -void destroy_pieces (PieceSlice); +void destroy_pieces (FormatArgsHandle); } // extern "C" +} // namespace ffi + struct Pieces { - static Pieces collect (std::string &&to_parse, bool append_newline); + static Pieces collect (const std::string &to_parse, bool append_newline); ~Pieces (); Pieces (const Pieces &other); @@ -262,7 +283,7 @@ struct Pieces Pieces (Pieces &&other); - const std::vector &get_pieces () const { return pieces_vector; } + const std::vector &get_pieces () const { return pieces_vector; } // { // slice = clone_pieces (&other.slice); @@ -272,19 +293,16 @@ struct Pieces // } private: - Pieces (std::vector &&pieces_vector, PieceSlice slice, - std::string &&to_parse) - : pieces_vector (std::move (pieces_vector)), slice (slice), - to_parse (std::move (to_parse)) + Pieces (ffi::FormatArgsHandle handle, std::vector &&pieces_vector) + : pieces_vector (std::move (pieces_vector)), handle (handle) {} - std::vector pieces_vector; + std::vector pieces_vector; // this memory is held for FFI reasons - it needs to be released and cloned // precisely, so try to not access it/modify it if possible. you should // instead work with `pieces_vector` - PieceSlice slice; - std::string to_parse; + ffi::FormatArgsHandle handle; }; } // namespace Fmt diff --git a/gcc/rust/expand/rust-expand-format-args.cc b/gcc/rust/expand/rust-expand-format-args.cc index 52249cb17e3..276ffd58c50 100644 --- a/gcc/rust/expand/rust-expand-format-args.cc +++ b/gcc/rust/expand/rust-expand-format-args.cc @@ -28,10 +28,10 @@ expand_format_args (AST::FormatArgs &fmt) { switch (node.tag) { - case Fmt::Piece::Tag::String: + case Fmt::ffi::Piece::Tag::String: // rust_debug ("[ARTHUR]: %s", node.string._0.c_str ()); - case Fmt::Piece::Tag::NextArgument: + case Fmt::ffi::Piece::Tag::NextArgument: rust_debug ("[ARTHUR]: NextArgument"); break; } diff --git a/gcc/rust/resolve/rust-ast-resolve-base.cc b/gcc/rust/resolve/rust-ast-resolve-base.cc index 1939a2056e4..fa1be197941 100644 --- a/gcc/rust/resolve/rust-ast-resolve-base.cc +++ b/gcc/rust/resolve/rust-ast-resolve-base.cc @@ -648,10 +648,7 @@ ResolverBase::visit (AST::FunctionParam &) void ResolverBase::visit (AST::FormatArgs &fmt) -{ - rust_sorry_at (fmt.get_locus (), "%s:%u: unimplemented FormatArgs visitor", - __FILE__, __LINE__); -} +{} } // namespace Resolver } // namespace Rust diff --git a/libgrust/libformat_parser/src/lib.rs b/libgrust/libformat_parser/src/lib.rs index 6bf78c4f2a8..84fac38e224 100644 --- a/libgrust/libformat_parser/src/lib.rs +++ b/libgrust/libformat_parser/src/lib.rs @@ -27,6 +27,23 @@ where mod ffi { use super::IntoFFI; + // FIXME: We need to ensure we deal with memory properly - whether it's owned by the C++ side or the Rust side + #[derive(Copy, Clone, PartialEq, Eq, Debug)] + #[repr(C)] + pub struct RustHamster { + ptr: *const u8, + len: usize, + } + + impl<'a> From<&'a str> for RustHamster { + fn from(s: &'a str) -> RustHamster { + RustHamster { + ptr: s.as_ptr(), + len: s.len(), + } + } + } + // Note: copied from rustc_span /// Range inside of a `Span` used for diagnostics when we only have access to relative positions. #[derive(Copy, Clone, PartialEq, Eq, Debug)] @@ -81,7 +98,7 @@ mod ffi { #[repr(C)] pub enum Piece<'a> { /// A literal string which should directly be emitted - String(&'a str), + String(RustHamster), /// This describes that formatting should process the next argument (as /// specified inside) for emission. // do we need a pointer here? we're doing big cloning anyway @@ -201,7 +218,7 @@ mod ffi { impl<'a> From> for Piece<'a> { fn from(old: generic_format_parser::Piece<'a>) -> Self { match old { - generic_format_parser::Piece::String(x) => Piece::String(x), + generic_format_parser::Piece::String(x) => Piece::String(x.into()), generic_format_parser::Piece::NextArgument(x) => { // FIXME: This is problematic - if we do this, then we probably run into the issue that the Box // is freed at the end of the call to collect_pieces. if we just .leak() it, then we have @@ -336,53 +353,91 @@ pub struct PieceSlice { cap: usize, } -#[no_mangle] -pub extern "C" fn collect_pieces(input: *const libc::c_char, append_newline: bool) -> PieceSlice { - dbg!(input); +#[repr(C)] +// FIXME: we should probably use FFIString here +pub struct RustString { + ptr: *const u8, + len: usize, + cap: usize, +} + +#[repr(C)] +pub struct FormatArgsHandle(PieceSlice, RustString); +#[no_mangle] +pub extern "C" fn collect_pieces( + input: *const libc::c_char, + append_newline: bool, +) -> FormatArgsHandle { // FIXME: Add comment let str = unsafe { CStr::from_ptr(input) }; + let str = str.to_str().unwrap().to_owned(); - // FIXME: No unwrap - let pieces: Vec> = - rust::collect_pieces(str.to_str().unwrap(), None, None, append_newline) - .into_iter() - .map(Into::into) - .collect(); + // we are never going to free this string here (we leak it later on), so we can extend its lifetime + // to send it across an FFI boundary. + // FIXME: Is that correct? + let s = &str; + let s = unsafe { std::mem::transmute::<&'_ str, &'static str>(s) }; - println!("[ARTHUR]: debug: {:?}, {:?}", pieces.as_ptr(), pieces.len()); + // FIXME: No unwrap + let pieces: Vec> = rust::collect_pieces(s, None, None, append_newline) + .into_iter() + .map(Into::into) + .collect(); - PieceSlice { + let piece_slice = PieceSlice { len: pieces.len(), cap: pieces.capacity(), base_ptr: pieces.leak().as_mut_ptr(), - } + }; + let rust_string = RustString { + len: str.len(), + cap: str.capacity(), + ptr: str.leak().as_ptr(), + }; + + FormatArgsHandle(piece_slice, rust_string) } #[no_mangle] -pub unsafe extern "C" fn destroy_pieces(PieceSlice { base_ptr, len, cap }: PieceSlice) { - eprintln!("[ARTHUR] destroying pieces: {base_ptr:?} {len} {cap}"); +pub unsafe extern "C" fn destroy_pieces(FormatArgsHandle(piece_slice, s): FormatArgsHandle) { + let PieceSlice { base_ptr, len, cap } = piece_slice; drop(Vec::from_raw_parts(base_ptr, len, cap)); + + let RustString { ptr, len, cap } = s; + drop(String::from_raw_parts(ptr as *mut u8, len, cap)); } #[no_mangle] pub extern "C" fn clone_pieces( - base_ptr: *mut ffi::Piece<'static>, - len: usize, - cap: usize, -) -> PieceSlice { - eprintln!("[ARTHUR] cloning pieces: {base_ptr:?} {len} {cap}"); + FormatArgsHandle(piece_slice, s): &FormatArgsHandle, +) -> FormatArgsHandle { + let PieceSlice { base_ptr, len, cap } = *piece_slice; let v = unsafe { Vec::from_raw_parts(base_ptr, len, cap) }; - let cloned_v = v.clone(); // FIXME: Add documentation v.leak(); - PieceSlice { + let piece_slice = PieceSlice { len: cloned_v.len(), cap: cloned_v.capacity(), - base_ptr: dbg!(cloned_v.leak().as_mut_ptr()), - } + base_ptr: cloned_v.leak().as_mut_ptr(), + }; + + let RustString { ptr, len, cap } = *s; + let s = unsafe { String::from_raw_parts(ptr as *mut u8, len, cap) }; + let cloned_s = s.clone(); + + // FIXME: Documentation + s.leak(); + + let rust_string = RustString { + len: cloned_s.len(), + cap: cloned_s.capacity(), + ptr: cloned_s.leak().as_ptr(), + }; + + FormatArgsHandle(piece_slice, rust_string) }