From patchwork Thu Aug 8 15:56:12 2024
X-Patchwork-Submitter: Florian Weimer
X-Patchwork-Id: 1970609
From: Florian Weimer
To: libc-alpha@sourceware.org
Cc: Jonathon Anderson, Ben Woodard, John Mellor-Crummey
Subject: [PATCH v2] manual: Describe struct link_map, support link maps with dlinfo
Date: Thu, 08 Aug 2024 17:56:12 +0200
Message-ID: <87jzgrm2jn.fsf@oldenburg.str.redhat.com>

This does not describe how to use RTLD_DI_ORIGIN and l_name to
reconstruct a full path for an object.  The reason is that I think we
should not recommend further use of RTLD_DI_ORIGIN due to its buffer
overflow potential.  This should be covered by another dlinfo
extension.  It would also make the dladdr approach for obtaining the
file name of the main executable unnecessary.

Obtaining the lowest address from load segments in program headers is
quite clumsy and should be provided directly via dlinfo.
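To illustrate the clumsiness mentioned above, here is a minimal sketch of the program-header walk on Linux.  It is not part of the patch.  The helper name main_program_lowest_address is hypothetical, and the sketch assumes that the main program has a PT_PHDR entry from which the load bias can be derived (if there is none, a bias of zero is assumed, which is only right for position-dependent executables).

```c
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>

/* Hypothetical helper: return the lowest run-time address covered by a
   PT_LOAD segment of the main program, or 0 if it cannot be determined.  */
uintptr_t
main_program_lowest_address (void)
{
  /* Program header array and entry count of the main program, as
     recorded in the auxiliary vector.  */
  const ElfW(Phdr) *phdr = (const ElfW(Phdr) *) getauxval (AT_PHDR);
  size_t phnum = getauxval (AT_PHNUM);
  if (phdr == NULL || phnum == 0)
    return 0;

  /* Assumption: PT_PHDR records the unrelocated address of the program
     header array, so the difference to its run-time address is the load
     bias.  Without PT_PHDR, the bias is assumed to be zero.  */
  uintptr_t bias = 0;
  for (size_t i = 0; i < phnum; ++i)
    if (phdr[i].p_type == PT_PHDR)
      {
        bias = (uintptr_t) phdr - phdr[i].p_vaddr;
        break;
      }

  /* Find the smallest relocated start address among the load segments.  */
  uintptr_t lowest = UINTPTR_MAX;
  for (size_t i = 0; i < phnum; ++i)
    if (phdr[i].p_type == PT_LOAD && bias + phdr[i].p_vaddr < lowest)
      lowest = bias + phdr[i].p_vaddr;

  return lowest == UINTPTR_MAX ? 0 : lowest;
}
```

A caller would simply invoke main_program_lowest_address () and treat a zero result as "unknown".  A direct dlinfo request would replace both loops with a single call.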
---
 manual/dynlink.texi | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 105 insertions(+), 2 deletions(-)

base-commit: 9446351dac4cb995828488b59a1e0292bdd50c5d

diff --git a/manual/dynlink.texi b/manual/dynlink.texi
index 1500a53de6..f978395220 100644
--- a/manual/dynlink.texi
+++ b/manual/dynlink.texi
@@ -352,16 +352,119 @@ support the XGETBV instruction.
 @node Dynamic Linker Introspection
 @section Dynamic Linker Introspection
 
-@Theglibc{} provides various functions for querying information from the
+@Theglibc{} provides various facilities for querying information from the
 dynamic linker.
 
+@deftp {Data Type} {struct link_map}
+
+@cindex link map
+A @dfn{link map} is associated with the main executable and each shared
+object.  Some fields of the link map are accessible to applications and
+exposed through the @code{struct link_map}.  Applications must not modify
+the link map directly.
+
+Pointers to link maps can be obtained from the @code{_r_debug} variable,
+from the @code{RTLD_DI_LINKMAP} request for @code{dlinfo}, and from the
+@code{_dl_find_object} function.  See below for details.
+
+@table @code
+@item l_addr
+@cindex load address
+This field contains the @dfn{load address} of the object.  This is the
+offset that needs to be applied to unrelocated addresses in the object
+image (as stored on disk) to form an address that can be used at run
+time for accessing data or running code.  For position-dependent
+executables, the load address is typically zero, and no adjustment is
+required.  For position-independent objects, the @code{l_addr} field
+usually contains the address of the object's ELF header in the process
+image.  However, this correspondence is not guaranteed because the ELF
+header might not be mapped at all, and the ELF file as stored on disk
+might use zero as the lowest virtual address.  Due to the latter
+possibility, values of the @code{l_addr} field do not necessarily uniquely
+identify a shared object.
+
+On Linux, to obtain the lowest loaded address of the main program, use
+@code{getauxval} to obtain the @code{AT_PHDR} and @code{AT_PHNUM} values
+for the current process.  Alternatively, call
+@samp{dlinfo (_r_debug.r_map, RTLD_DI_PHDR, &@var{phdr})}
+to obtain the number of program headers, and the address of the program
+header array will be stored in @var{phdr}
+(of type @code{const ElfW(Phdr) *}, as explained below).
+These values allow processing the array of program headers and the
+address information in the @code{PT_LOAD} entries among them.
+This works even when the program was started with an explicit loader
+invocation.
+
+@item l_name
+For a shared object, this field contains the file name that the
+@theglibc{} dynamic loader used when opening the object.  This can be a
+relative path (relative to the current directory at process start, or when
+the object was loaded later, via @code{dlopen} or @code{dlmopen}).
+Symbolic links are not necessarily resolved.
+
+For the main executable, @code{l_name} is @samp{""} (the empty string).
+(The main executable is not loaded by @theglibc{}, so its file name is
+not available.)  On Linux, the main executable is available as
+@file{/proc/self/exe} (unless an explicit loader invocation was used to
+start the program).  The file name @file{/proc/self/exe} continues to
+resolve to the same file even if it is moved within or deleted from the
+file system.  Its current location can be read using @code{readlink}.
+@xref{Symbolic Links}.  (However, @file{/proc/self/exe} is not actually
+a symbolic link; it is only presented as one.)  Note that @file{/proc} may
+not be mounted, in which case @file{/proc/self/exe} is not available.
+
+If an explicit loader invocation is used (such as @samp{ld.so
+/usr/bin/emacs}), the @file{/proc/self/exe} approach does not work
+because the file name refers to the dynamic linker @code{ld.so}, and not
+the @code{/usr/bin/emacs} program.  An approximation to the executable
+path is still available in the @code{@var{info}.dli_fname} member after
+calling @samp{dladdr (_r_debug.r_map->l_ld, &@var{info})}.  Note that
+this could be a relative path, and it is supplied by the process that
+created the current process, not the kernel, so it could be inaccurate.
+
+@item l_ld
+This is a pointer to the ELF dynamic segment, an array of tag/value
+pairs that provide various pieces of information that the dynamic
+linking process uses.  On most architectures, addresses in the dynamic
+segment are relocated at run time, but on some architectures and in some
+run-time configurations, it is necessary to add the @code{l_addr} field
+value to obtain a proper address.
+
+@item l_prev
+@itemx l_next
+These fields are used to maintain a doubly linked list of all link
+maps within one @code{dlmopen} namespace.  Note that there is currently
+no thread-safe way to iterate over this list.  The callback-based
+@code{dl_iterate_phdr} interface can be used instead.
+@end table
+@end deftp
+
+@strong{Portability note:} It is not possible to create a @code{struct
+link_map} object and pass a pointer to a function that expects a
+@code{struct link_map *} argument.  Only link map pointers initially
+supplied by @theglibc{} are permitted as arguments.  In current versions
+of @theglibc{}, handles returned by @code{dlopen} and @code{dlmopen} are
+pointers to link maps.  However, this is not a portable assumption, and
+may even change in future versions of @theglibc{}.  To obtain the link
+map associated with a handle, see @code{dlinfo} and
+@code{RTLD_DI_LINKMAP} below.  If a function accepts both
+@code{dlopen}/@code{dlmopen} handles and @code{struct link_map} pointers
+in its @code{void *} argument, that is documented explicitly.
+
+@subsection Querying information for loaded objects
+
+The @code{dlinfo} function provides access to internal information
+associated with @code{dlopen}/@code{dlmopen} handles and link maps.
+
 @deftypefun {int} dlinfo (void *@var{handle}, int @var{request}, void *@var{arg})
 @safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acunsafe{@acucorrupt{}}}
 @standards{GNU, dlfcn.h}
 This function returns information about @var{handle} in the memory
 location @var{arg}, based on @var{request}.  The @var{handle} argument
 must be a pointer returned by @code{dlopen} or @code{dlmopen}; it must
-not have been closed by @code{dlclose}.
+not have been closed by @code{dlclose}.  Alternatively, @var{handle}
+can be a @code{struct link_map *} value for a link map of an object
+that has not been closed.
 
 On success, @code{dlinfo} returns 0 for most request types; exceptions
 are noted below.  If there is an error, the function returns @math{-1},