diff mbox series

nptl: add RSEQ_SIG for RISC-V

Message ID 20240914142652.8970-1-mjeanson@efficios.com
State New
Series nptl: add RSEQ_SIG for RISC-V

Commit Message

Michael Jeanson Sept. 14, 2024, 2:26 p.m. UTC
Enable RSEQ for RISC-V; kernel support was added in Linux 5.18.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
---
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Darius Rad <darius@bluespec.com>
---
 sysdeps/unix/sysv/linux/riscv/bits/rseq.h | 44 +++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/rseq.h

Comments

Florian Weimer Oct. 2, 2024, 8:38 a.m. UTC | #1
* Michael Jeanson:

> Enable RSEQ for RISC-V, support was added in Linux 5.18.
>
> Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
> ---
> Cc: Florian Weimer <fweimer@redhat.com>
> Cc: Palmer Dabbelt <palmer@rivosinc.com>
> Cc: Darius Rad <darius@bluespec.com>
> ---
>  sysdeps/unix/sysv/linux/riscv/bits/rseq.h | 44 +++++++++++++++++++++++
>  1 file changed, 44 insertions(+)
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/rseq.h
>
> diff --git a/sysdeps/unix/sysv/linux/riscv/bits/rseq.h b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
> new file mode 100644
> index 0000000000..dfc1fc9315
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
> @@ -0,0 +1,44 @@
> +/* Restartable Sequences Linux riscv architecture header.
> +   Copyright (C) 2021-2024 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <bits/endian.h>
> +
> +#ifndef _SYS_RSEQ_H
> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
> +#endif
> +
> +/* RSEQ_SIG is a signature required before each abort handler code.
> +
> +   It is a 32-bit value that maps to actual architecture code compiled
> +   into applications and libraries.  It needs to be defined for each
> +   architecture.  When choosing this value, it needs to be taken into
> +   account that generating invalid instructions may have ill effects on
> +   tools like objdump, and may also have impact on the CPU speculative
> +   execution efficiency in some cases.
> +
> +   Select the instruction "csrw mhartid, x0" as the RSEQ_SIG. Unlike
> +   other architectures, the ebreak instruction has no immediate field for
> +   distinguishing purposes. Hence, ebreak is not suitable as RSEQ_SIG.
> +   "csrw mhartid, x0" can also satisfy the RSEQ requirement because it
> +   is an uncommon instruction and will raise an illegal instruction
> +   exception when executed in all modes.  */
> +
> +#if __BYTE_ORDER == __LITTLE_ENDIAN
> +#define RSEQ_SIG	0xf1401073
> +#else
> +/* RSEQ is currently only supported on Little-Endian.  */
> +#endif

Jeff, Kito,

would you be able to verify that the choice of instruction is
appropriate for this purpose? It should be something that never appears
among compiler-generated instructions (or anything else in the .text
segment).  It does not necessarily have to trap because it is never
executed.

Thanks,
Florian

Palmer Dabbelt Oct. 2, 2024, 11:11 a.m. UTC | #2
On Wed, 02 Oct 2024 01:38:20 PDT (-0700), fweimer@redhat.com wrote:
> * Michael Jeanson:
>
>> Enable RSEQ for RISC-V, support was added in Linux 5.18.
>>
>> Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
>> ---
>> Cc: Florian Weimer <fweimer@redhat.com>
>> Cc: Palmer Dabbelt <palmer@rivosinc.com>
>> Cc: Darius Rad <darius@bluespec.com>
>> ---
>>  sysdeps/unix/sysv/linux/riscv/bits/rseq.h | 44 +++++++++++++++++++++++
>>  1 file changed, 44 insertions(+)
>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/rseq.h
>>
>> diff --git a/sysdeps/unix/sysv/linux/riscv/bits/rseq.h b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
>> new file mode 100644
>> index 0000000000..dfc1fc9315
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
>> @@ -0,0 +1,44 @@
>> +/* Restartable Sequences Linux riscv architecture header.
>> +   Copyright (C) 2021-2024 Free Software Foundation, Inc.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <bits/endian.h>
>> +
>> +#ifndef _SYS_RSEQ_H
>> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
>> +#endif
>> +
>> +/* RSEQ_SIG is a signature required before each abort handler code.
>> +
>> +   It is a 32-bit value that maps to actual architecture code compiled
>> +   into applications and libraries.  It needs to be defined for each
>> +   architecture.  When choosing this value, it needs to be taken into
>> +   account that generating invalid instructions may have ill effects on
>> +   tools like objdump, and may also have impact on the CPU speculative
>> +   execution efficiency in some cases.
>> +
>> +   Select the instruction "csrw mhartid, x0" as the RSEQ_SIG. Unlike
>> +   other architectures, the ebreak instruction has no immediate field for
>> +   distinguishing purposes. Hence, ebreak is not suitable as RSEQ_SIG.
>> +   "csrw mhartid, x0" can also satisfy the RSEQ requirement because it
>> +   is an uncommon instruction and will raise an illegal instruction
>> +   exception when executed in all modes.  */
>> +
>> +#if __BYTE_ORDER == __LITTLE_ENDIAN
>> +#define RSEQ_SIG	0xf1401073
>> +#else
>> +/* RSEQ is currently only supported on Little-Endian.  */
>> +#endif
>
> Jeff, Kito,
>
> would you be able to verify that the choice of instruction is
> appropriate for this purpose? It should be something that never appears
> among compiler-generated instructions (or anything else in the .text
> segment).  It does not necessarily have to trap because it is never
> executed.

The compiler won't generate these.  There are no rules there or
anything, it's just not a useful instruction for compilers to generate
and thus there's no reason to do so.  If it's going to end up as a
de facto ABI we should write that down somewhere, and I guess we could
go scrub through a bunch of binaries to make sure that's true.

There are also no guarantees this traps, but it's pretty likely to do
so.  There's a bunch of possible edge cases, like NOMMU Linux entering
userspace in M-mode (IIRC it doesn't right now, but there's no strict
reason for that) or vendors aliasing the standard M-mode instructions
in U-mode.  I'd also bet that these are emulated in a bunch of systems;
I wouldn't be surprised if someone gets the emulation wrong and this
works in U-mode.

For semihosting we avoided those issues by using some of the NOP space 
to wrap the ebreak with magic values.  That results in a 3-instruction 
sequence, but it's made up of much more common instructions and thus 
less likely to trip up these sorts of edge cases.  Is something like 
that possible for rseq?

> Thanks,
> Florian
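
For readers unfamiliar with the semihosting trick mentioned above, here is a rough sketch of what a three-instruction marker looks like. The encodings are an assumption based on the RISC-V semihosting convention (`slli x0, x0, 0x1f`; `ebreak`; `srai x0, x0, 7`) and are not stated in this thread; `is_semihost_call` is a hypothetical helper name. The shifts write to x0, so they behave as NOPs and only serve to tag the ebreak between them.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings of the semihosting marker sequence:
   slli x0, x0, 0x1f ; ebreak ; srai x0, x0, 7.  */
static const uint32_t semihost_seq[3] = {
  0x01f01013u, /* slli x0, x0, 0x1f (a NOP, used as the entry tag) */
  0x00100073u, /* ebreak */
  0x40705013u  /* srai x0, x0, 7 (a NOP, used as the exit tag) */
};

/* Hypothetical helper: return 1 if the instruction at EBREAK_IDX is an
   ebreak wrapped by the two tagging NOPs, 0 otherwise.  */
static int
is_semihost_call (const uint32_t *insns, size_t n, size_t ebreak_idx)
{
  /* One instruction is needed before the ebreak and one after it.  */
  if (ebreak_idx < 1 || ebreak_idx + 1 >= n)
    return 0;
  return insns[ebreak_idx - 1] == semihost_seq[0]
         && insns[ebreak_idx] == semihost_seq[1]
         && insns[ebreak_idx + 1] == semihost_seq[2];
}
```

Note that recognizing this marker requires inspecting 12 bytes around the ebreak, whereas RSEQ_SIG is a single 32-bit value.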

Florian Weimer Oct. 2, 2024, 11:58 a.m. UTC | #3
* Palmer Dabbelt:

> On Wed, 02 Oct 2024 01:38:20 PDT (-0700), fweimer@redhat.com wrote:
>> * Michael Jeanson:
>>
>>> Enable RSEQ for RISC-V, support was added in Linux 5.18.
>>>
>>> Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
>>> ---
>>> Cc: Florian Weimer <fweimer@redhat.com>
>>> Cc: Palmer Dabbelt <palmer@rivosinc.com>
>>> Cc: Darius Rad <darius@bluespec.com>
>>> ---
>>>  sysdeps/unix/sysv/linux/riscv/bits/rseq.h | 44 +++++++++++++++++++++++
>>>  1 file changed, 44 insertions(+)
>>>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/rseq.h
>>>
>>> diff --git a/sysdeps/unix/sysv/linux/riscv/bits/rseq.h b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
>>> new file mode 100644
>>> index 0000000000..dfc1fc9315
>>> --- /dev/null
>>> +++ b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
>>> @@ -0,0 +1,44 @@
>>> +/* Restartable Sequences Linux riscv architecture header.
>>> +   Copyright (C) 2021-2024 Free Software Foundation, Inc.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <https://www.gnu.org/licenses/>.  */
>>> +
>>> +#include <bits/endian.h>
>>> +
>>> +#ifndef _SYS_RSEQ_H
>>> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
>>> +#endif
>>> +
>>> +/* RSEQ_SIG is a signature required before each abort handler code.
>>> +
>>> +   It is a 32-bit value that maps to actual architecture code compiled
>>> +   into applications and libraries.  It needs to be defined for each
>>> +   architecture.  When choosing this value, it needs to be taken into
>>> +   account that generating invalid instructions may have ill effects on
>>> +   tools like objdump, and may also have impact on the CPU speculative
>>> +   execution efficiency in some cases.
>>> +
>>> +   Select the instruction "csrw mhartid, x0" as the RSEQ_SIG. Unlike
>>> +   other architectures, the ebreak instruction has no immediate field for
>>> +   distinguishing purposes. Hence, ebreak is not suitable as RSEQ_SIG.
>>> +   "csrw mhartid, x0" can also satisfy the RSEQ requirement because it
>>> +   is an uncommon instruction and will raise an illegal instruction
>>> +   exception when executed in all modes.  */
>>> +
>>> +#if __BYTE_ORDER == __LITTLE_ENDIAN
>>> +#define RSEQ_SIG	0xf1401073
>>> +#else
>>> +/* RSEQ is currently only supported on Little-Endian.  */
>>> +#endif
>>
>> Jeff, Kito,
>>
>> would you be able to verify that the choice of instruction is
>> appropriate for this purpose? It should be something that never appears
>> among compiler-generated instructions (or anything else in the .text
>> segment).  It does not necessarily have to trap because it is never
>> executed.
>
> The compiler won't generate these.  There are no rules there or
> anything, it's just not a useful instruction for compilers to generate
> and thus there's no reason to do so.  If it's going to end up as a
> de facto ABI we should write that down somewhere, and I guess we could
> go scrub through a bunch of binaries to make sure that's true.

It's a glibc-specific ABI.  We tell the kernel which value to use.
Applications targeting glibc need to use RSEQ_SIG from <sys/rseq.h>.  In
this regard, it's no different from the size of pthread_key_t, or the
value of the RTLD_NODELETE constant.  For those, applications need to be
consistent with glibc, too.

Scanning binaries might be a reasonable thing to do, but I currently do
not have a good way to do that.

> For semihosting we avoided those issues by using some of the NOP space
> to wrap the ebreak with magic values.  That results in a 3-instruction
> sequence, but it's made up of much more common instructions and thus
> less likely to trip up these sorts of edge cases.  Is something like
> that possible for rseq?

The kernel only checks the preceding four bytes.  Trapping is not
required.

Thanks,
Florian
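
The four-byte check described above can be modelled in portable C. This is only an illustrative sketch, not the kernel's code: `put_le32`, `get_le32`, `sig_precedes_abort`, and the `RISCV_RSEQ_SIG` macro name are hypothetical, and the "text" buffer stands in for the instruction bytes around an abort handler. It assumes the signature is stored in little-endian byte order, matching the only case the patch defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Value proposed by the patch for little-endian RISC-V.  */
#define RISCV_RSEQ_SIG 0xf1401073u

/* Hypothetical helper: store V at P in little-endian byte order.  */
static void
put_le32 (uint8_t *p, uint32_t v)
{
  p[0] = (uint8_t) (v & 0xff);
  p[1] = (uint8_t) ((v >> 8) & 0xff);
  p[2] = (uint8_t) ((v >> 16) & 0xff);
  p[3] = (uint8_t) ((v >> 24) & 0xff);
}

/* Hypothetical helper: load a little-endian 32-bit value from P.  */
static uint32_t
get_le32 (const uint8_t *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* Model of the check: return 1 if the four bytes immediately before
   offset ABORT_OFF in TEXT equal SIG, 0 otherwise.  Nothing is
   executed; the bytes are only compared.  */
static int
sig_precedes_abort (const uint8_t *text, size_t abort_off, uint32_t sig)
{
  if (abort_off < 4)
    return 0;
  return get_le32 (text + abort_off - 4) == sig;
}
```

With the signature written by `put_le32` at offset 4 of a buffer, `sig_precedes_abort (text, 8, RISCV_RSEQ_SIG)` succeeds, and the bytes in memory are 73 10 40 f1. Because the comparison is on raw bytes, whether the instruction traps has no bearing on the check.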

Andrew Waterman Oct. 2, 2024, 10:44 p.m. UTC | #4
On Wed, Oct 2, 2024 at 4:11 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> On Wed, 02 Oct 2024 01:38:20 PDT (-0700), fweimer@redhat.com wrote:
> > * Michael Jeanson:
> >
> >> Enable RSEQ for RISC-V, support was added in Linux 5.18.
> >>
> >> Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
> >> ---
> >> Cc: Florian Weimer <fweimer@redhat.com>
> >> Cc: Palmer Dabbelt <palmer@rivosinc.com>
> >> Cc: Darius Rad <darius@bluespec.com>
> >> ---
> >>  sysdeps/unix/sysv/linux/riscv/bits/rseq.h | 44 +++++++++++++++++++++++
> >>  1 file changed, 44 insertions(+)
> >>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/rseq.h
> >>
> >> diff --git a/sysdeps/unix/sysv/linux/riscv/bits/rseq.h b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
> >> new file mode 100644
> >> index 0000000000..dfc1fc9315
> >> --- /dev/null
> >> +++ b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
> >> @@ -0,0 +1,44 @@
> >> +/* Restartable Sequences Linux riscv architecture header.
> >> +   Copyright (C) 2021-2024 Free Software Foundation, Inc.
> >> +
> >> +   The GNU C Library is free software; you can redistribute it and/or
> >> +   modify it under the terms of the GNU Lesser General Public
> >> +   License as published by the Free Software Foundation; either
> >> +   version 2.1 of the License, or (at your option) any later version.
> >> +
> >> +   The GNU C Library is distributed in the hope that it will be useful,
> >> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >> +   Lesser General Public License for more details.
> >> +
> >> +   You should have received a copy of the GNU Lesser General Public
> >> +   License along with the GNU C Library; if not, see
> >> +   <https://www.gnu.org/licenses/>.  */
> >> +
> >> +#include <bits/endian.h>
> >> +
> >> +#ifndef _SYS_RSEQ_H
> >> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
> >> +#endif
> >> +
> >> +/* RSEQ_SIG is a signature required before each abort handler code.
> >> +
> >> +   It is a 32-bit value that maps to actual architecture code compiled
> >> +   into applications and libraries.  It needs to be defined for each
> >> +   architecture.  When choosing this value, it needs to be taken into
> >> +   account that generating invalid instructions may have ill effects on
> >> +   tools like objdump, and may also have impact on the CPU speculative
> >> +   execution efficiency in some cases.
> >> +
> >> +   Select the instruction "csrw mhartid, x0" as the RSEQ_SIG. Unlike
> >> +   other architectures, the ebreak instruction has no immediate field for
> >> +   distinguishing purposes. Hence, ebreak is not suitable as RSEQ_SIG.
> >> +   "csrw mhartid, x0" can also satisfy the RSEQ requirement because it
> >> +   is an uncommon instruction and will raise an illegal instruction
> >> +   exception when executed in all modes.  */
> >> +
> >> +#if __BYTE_ORDER == __LITTLE_ENDIAN
> >> +#define RSEQ_SIG    0xf1401073
> >> +#else
> >> +/* RSEQ is currently only supported on Little-Endian.  */
> >> +#endif
> >
> > Jeff, Kito,
> >
> > would you be able to verify that the choice of instruction is
> > appropriate for this purpose? It should be something that never appears
> > among compiler-generated instructions (or anything else in the .text
> > segment).  It does not necessarily have to trap because it is never
> > executed.
>
> The compiler won't generate these.  There are no rules there or
> anything, it's just not a useful instruction for compilers to generate
> and thus there's no reason to do so.  If it's going to end up as a
> de facto ABI we should write that down somewhere, and I guess we could
> go scrub through a bunch of binaries to make sure that's true.
>
> There are also no guarantees this traps, but it's pretty likely to do so.

Since it's a write to a read-only CSR, it's guaranteed to raise an
illegal-instruction exception (regardless of privilege mode).
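
To make the read-only claim concrete, the sketch below decodes 0xf1401073 into its instruction fields. The helper names (`decode_csr_insn`, `csr_is_read_only`, `csr_min_privilege`) and the `RISCV_RSEQ_SIG` macro are hypothetical; the field layout and the meaning of CSR address bits [11:10] (0b11 = read-only) and [9:8] (lowest privilege level) follow my reading of the RISC-V privileged specification.

```c
#include <stdint.h>

/* Value proposed by the patch: "csrw mhartid, x0".  */
#define RISCV_RSEQ_SIG 0xf1401073u

/* Fields of a CSR instruction (I-type layout).  */
struct csr_insn
{
  unsigned opcode; /* bits 6:0, 0x73 = SYSTEM */
  unsigned rd;     /* bits 11:7 */
  unsigned funct3; /* bits 14:12, 1 = CSRRW */
  unsigned rs1;    /* bits 19:15 */
  unsigned csr;    /* bits 31:20, CSR address */
};

/* Hypothetical helper: split a 32-bit instruction into its fields.  */
static struct csr_insn
decode_csr_insn (uint32_t insn)
{
  struct csr_insn d;
  d.opcode = insn & 0x7f;
  d.rd = (insn >> 7) & 0x1f;
  d.funct3 = (insn >> 12) & 0x7;
  d.rs1 = (insn >> 15) & 0x1f;
  d.csr = insn >> 20;
  return d;
}

/* CSR address bits [11:10] == 0b11 mark the register as read-only.  */
static int
csr_is_read_only (unsigned csr)
{
  return ((csr >> 10) & 0x3) == 0x3;
}

/* CSR address bits [9:8] give the lowest privilege level allowed to
   access the register (3 = machine).  */
static unsigned
csr_min_privilege (unsigned csr)
{
  return (csr >> 8) & 0x3;
}
```

Decoding `RISCV_RSEQ_SIG` gives opcode 0x73, funct3 1 (CSRRW), rd and rs1 both x0, and CSR address 0xf14 (mhartid), which falls in the read-only machine-level range. CSRRW always performs a write, which is why the instruction is illegal even though the source register is x0.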

> There's a bunch of possible edge cases, like NOMMU Linux entering
> userspace in M-mode (IIRC it doesn't right now, but there's no strict
> reason for that) or vendors aliasing the standard M-mode instructions
> in U-mode.  I'd also bet that these are emulated in a bunch of systems;
> I wouldn't be surprised if someone gets the emulation wrong and this
> works in U-mode.
>
> For semihosting we avoided those issues by using some of the NOP space
> to wrap the ebreak with magic values.  That results in a 3-instruction
> sequence, but it's made up of much more common instructions and thus
> less likely to trip up these sorts of edge cases.  Is something like
> that possible for rseq?
>
> > Thanks,
> > Florian
Patch

diff --git a/sysdeps/unix/sysv/linux/riscv/bits/rseq.h b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
new file mode 100644
index 0000000000..dfc1fc9315
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/bits/rseq.h
@@ -0,0 +1,44 @@ 
+/* Restartable Sequences Linux riscv architecture header.
+   Copyright (C) 2021-2024 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <bits/endian.h>
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries.  It needs to be defined for each
+   architecture.  When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   Select the instruction "csrw mhartid, x0" as the RSEQ_SIG. Unlike
+   other architectures, the ebreak instruction has no immediate field for
+   distinguishing purposes. Hence, ebreak is not suitable as RSEQ_SIG.
+   "csrw mhartid, x0" can also satisfy the RSEQ requirement because it
+   is an uncommon instruction and will raise an illegal instruction
+   exception when executed in all modes.  */
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define RSEQ_SIG	0xf1401073
+#else
+/* RSEQ is currently only supported on Little-Endian.  */
+#endif