[RFC,v3,01/10] docs: Add specification for native library calls

Message ID 20230625212707.1078951-2-fufuyqqqqqq@gmail.com
State New
Series Native Library Calls

Commit Message

Yeqi Fu June 25, 2023, 9:26 p.m. UTC
Signed-off-by: Yeqi Fu <fufuyqqqqqq@gmail.com>
---
 docs/native_calls.txt | 70 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 docs/native_calls.txt

Comments

Alex Bennée July 3, 2023, 3:04 p.m. UTC | #1
Yeqi Fu <fufuyqqqqqq@gmail.com> writes:

> Signed-off-by: Yeqi Fu <fufuyqqqqqq@gmail.com>
> ---
>  docs/native_calls.txt | 70 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 70 insertions(+)
>  create mode 100644 docs/native_calls.txt
>
> diff --git a/docs/native_calls.txt b/docs/native_calls.txt
> new file mode 100644
> index 0000000000..8906566b13
> --- /dev/null
> +++ b/docs/native_calls.txt

We try not to add plain text files to docs now. The preferred format is
rst so we can render the docs:

  https://qemu.readthedocs.io/en/latest/

I would suggest putting this in docs/user/native_calls.rst and adding a
reference to it in docs/user/index.rst.

> @@ -0,0 +1,70 @@
> +Native Library Calls Optimization for QEMU Linux-User
> +====================
> +
> +Description
> +===========

Add a newline here. Also, rst needs a different underline to make
Description a sub-heading (e.g. -----------).

> +When running under the linux-user mode in QEMU, the entire program,
> +including all library calls, is translated. Many well-understood
> +library functions are usually optimized for the processor they run
> +on.

Maybe instead of "Many..."

  Because many library functions are tuned specifically for the guest
  architecture, the result of translating them will not be as efficient
  as it could be.

  When the semantics of a library function are well defined we can take
  advantage of that fact by replacing the translated version with a call
  to the native equivalent function. As the runtime of library functions....


> For example, the semantics of memcpy are well-defined and
> +optimized. Instead of translating these library functions, we can
> +call their native versions, as the runtime of library functions
> +is generally biased towards a few core functions. Thus, only a
> +small subset of functions (such as mem* and str*) would need to
> +be hooked to be useful.
> +
> +
> +Implementation
> +==============

Add a newline here.

> +This feature introduces a set of specialized instructions for native
> +calls and provides helpers to translate these instructions to
> +corresponding native functions. A shared library is also implemented,
> +where native functions are rewritten as specialized instructions.
> +At runtime, user programs load the shared library, and specialized
> +instructions are executed when native functions are called.

I would switch it around. Describe the LD_PRELOAD library first and then
describe how the library uses a special sequence of instructions to
encode the ABI data in a way the translator can pick it up during
execution.
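
To make the second half of that ordering concrete, here is a minimal
sketch of the host side: a helper that maps an identifier recovered from
the special instruction sequence to the native function. The `native_fn`
IDs, the `native_dispatch` name and the three-argument layout are
assumptions for illustration only, not what this series implements. The
sketch also treats the arguments as host pointers, whereas real
linux-user code would first have to translate guest addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical function IDs; the real series may encode calls differently. */
enum native_fn {
    NATIVE_MEMCPY = 1,
    NATIVE_STRLEN = 2,
};

/*
 * Hypothetical host-side helper: run the native version of the function
 * selected by 'id'. Arguments are assumed to already be host pointers.
 * *handled is set to 0 for unknown IDs so the caller can fall back to
 * the translated implementation.
 */
static uintptr_t native_dispatch(unsigned id, uintptr_t a0, uintptr_t a1,
                                 uintptr_t a2, int *handled)
{
    assert(handled != NULL);
    *handled = 1;

    switch (id) {
    case NATIVE_MEMCPY:
        return (uintptr_t)memcpy((void *)a0, (const void *)a1, (size_t)a2);
    case NATIVE_STRLEN:
        return (uintptr_t)strlen((const char *)a0);
    default:
        *handled = 0;
        return 0;
    }
}
```

Reporting unknown IDs through `handled` rather than aborting keeps the
fallback to translated code possible, which seems the safer default for
an optional optimization.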

> +
> +The specialized instructions are implemented using architecture-
> +specific macros. These macros utilize unused or invalid opcodes or
> +instruction fields to embed the necessary information for native
> +function calls. This approach ensures that the specialized
> +instructions do not conflict with existing instructions.
> +
> +For x86 and x86_64, the implementation uses an unused opcode.
> +For arm and aarch64, the HLT instruction is used, as it is invalid in
> +userspace and has 16 bits of spare immediate data.
> +For mips and mips64, the implementation takes advantage of unused
> +bytes in the syscall instruction.

I think this paragraph can be split into each architecture and listed as
a sub-heading for each. This can replace the "Supported Architectures" heading.
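
A per-architecture section could also show what "16 bits of spare
immediate data" means in practice. As a rough sketch for aarch64 only:
the A64 HLT encoding places imm16 in bits [20:5] of the base opcode
0xD4400000, so a call identifier could be packed and recovered as below.
The helper names are hypothetical, and using the immediate as a function
ID is an assumption about how the series uses the field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define A64_HLT_BASE 0xD4400000u  /* HLT #0 */
#define A64_HLT_MASK 0xFFE0001Fu  /* fixed bits [31:21] and [4:0] */

/* Build an A64 "HLT #imm16" instruction word. */
static inline uint32_t a64_hlt_encode(uint16_t imm16)
{
    return A64_HLT_BASE | ((uint32_t)imm16 << 5);
}

/* True if 'insn' is an A64 HLT with any immediate. */
static inline bool a64_is_hlt(uint32_t insn)
{
    return (insn & A64_HLT_MASK) == A64_HLT_BASE;
}

/* Recover the 16-bit immediate from an HLT instruction word. */
static inline uint16_t a64_hlt_imm16(uint32_t insn)
{
    assert(a64_is_hlt(insn));
    return (uint16_t)((insn >> 5) & 0xFFFFu);
}
```

The 32-bit arm HLT encodings are laid out differently, so arm would need
its own pair of helpers.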

> +
> +Supported Architectures
> +=======================
> +This feature is applicable to user programs with the following
> +architectures now:
> +- x86
> +- x86_64
> +- arm
> +- aarch64
> +- mips
> +- mips64
> +
> +
> +Usage
> +=====
> +1. Install Cross-Compilation Tools
> +Cross-compilation tools are required to build the shared libraries
> +that can hook the necessary library functions. For example, a viable
> +command on Ubuntu is:
> +```
> +apt install libc6:i386 gcc-arm-linux-gnueabihf \
> +gcc-aarch64-linux-gnu gcc-mips-linux-gnu gcc-mips64-linux-gnuabi64
> +```

I guess libc6:i386 is there because the default x86 host compiler
already supports building 32-bit code, so we just need the library bits?

> +2. Locate the Compiled libnative.so
> +After compilation, the libnative.so file can be found in the
> +`./build/common-user/native/<target>-linux-user` directory.
> +
> +3. Run the Program with the `--native-bypass` Option
> +To run your program with native library bypass, use the
> +`--native-bypass` option to import libnative.so:
> +```
> +./build/qemu-<target> --native-bypass \

You can drop the ./build here as there should be a qemu-<target> in the
top level build directory.

> +./build/common-user/native/<target>-linux-user/libnative.so ./program
> +```

Patch

diff --git a/docs/native_calls.txt b/docs/native_calls.txt
new file mode 100644
index 0000000000..8906566b13
--- /dev/null
+++ b/docs/native_calls.txt
@@ -0,0 +1,70 @@ 
+Native Library Calls Optimization for QEMU Linux-User
+====================
+
+Description
+===========
+When running under the linux-user mode in QEMU, the entire program,
+including all library calls, is translated. Many well-understood
+library functions are usually optimized for the processor they run
+on. For example, the semantics of memcpy are well-defined and
+optimized. Instead of translating these library functions, we can
+call their native versions, as the runtime of library functions
+is generally biased towards a few core functions. Thus, only a
+small subset of functions (such as mem* and str*) would need to
+be hooked to be useful.
+
+
+Implementation
+==============
+This feature introduces a set of specialized instructions for native
+calls and provides helpers to translate these instructions to
+corresponding native functions. A shared library is also implemented,
+where native functions are rewritten as specialized instructions.
+At runtime, user programs load the shared library, and specialized
+instructions are executed when native functions are called.
+
+The specialized instructions are implemented using architecture-
+specific macros. These macros utilize unused or invalid opcodes or
+instruction fields to embed the necessary information for native
+function calls. This approach ensures that the specialized
+instructions do not conflict with existing instructions.
+
+For x86 and x86_64, the implementation uses an unused opcode.
+For arm and aarch64, the HLT instruction is used, as it is invalid in
+userspace and has 16 bits of spare immediate data.
+For mips and mips64, the implementation takes advantage of unused
+bytes in the syscall instruction.
+
+Supported Architectures
+=======================
+This feature is applicable to user programs with the following
+architectures now:
+- x86
+- x86_64
+- arm
+- aarch64
+- mips
+- mips64
+
+
+Usage
+=====
+1. Install Cross-Compilation Tools
+Cross-compilation tools are required to build the shared libraries
+that can hook the necessary library functions. For example, a viable
+command on Ubuntu is:
+```
+apt install libc6:i386 gcc-arm-linux-gnueabihf \
+gcc-aarch64-linux-gnu gcc-mips-linux-gnu gcc-mips64-linux-gnuabi64
+```
+2. Locate the Compiled libnative.so
+After compilation, the libnative.so file can be found in the
+`./build/common-user/native/<target>-linux-user` directory.
+
+3. Run the Program with the `--native-bypass` Option
+To run your program with native library bypass, use the
+`--native-bypass` option to import libnative.so:
+```
+./build/qemu-<target> --native-bypass \
+./build/common-user/native/<target>-linux-user/libnative.so ./program
+```