diff mbox series

aio: Wire up compat_sys_io_pgetevents_time64 for x86

Message ID 20210921130127.24131-1-rpalethorpe@suse.com
State Not Applicable
Headers show
Series aio: Wire up compat_sys_io_pgetevents_time64 for x86 | expand

Commit Message

Richard Palethorpe Sept. 21, 2021, 1:01 p.m. UTC
The LTP test io_pgetevents02 fails in 32bit compat mode because an
nr_max of -1 appears to be treated as a large positive integer. This
causes pgetevents_time64 to return an event. The test expects the call
to fail and errno to be set to EINVAL.

Using the compat syscall fixes the issue.

Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec")
Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 arch/x86/entry/syscalls/syscall_32.tbl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Arnd Bergmann Sept. 21, 2021, 1:32 p.m. UTC | #1
On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe <rpalethorpe@suse.com> wrote:
>
> The LTP test io_pgetevents02 fails in 32bit compat mode because an
> nr_max of -1 appears to be treated as a large positive integer. This
> causes pgetevents_time64 to return an event. The test expects the call
> to fail and errno to be set to EINVAL.
>
> Using the compat syscall fixes the issue.
>
> Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec")
> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>

Thanks a lot for finding this, indeed there is definitely a mistake that
this function is defined and not used, but I don't yet see how it would
get to the specific failure you report.

Between the two implementations, I can see a difference in the
handling of the signal mask, but that should only affect architectures
with incompatible compat_sigset_t, i.e. big-endian or
_COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is
never true for currently supported architectures. On x86, there is
no difference in the sigset at all.

The negative 'nr' and 'min_nr' arguments that you list as causing
the problem /should/ be converted by the magic
SYSCALL_DEFINE6() definition. If this is currently broken, I would
expect other syscalls to be affected as well.

Have you tried reproducing this on non-x86 architectures? If I
misremembered how the compat conversion in SYSCALL_DEFINE6()
works, then all architectures that support CONFIG_COMPAT have
to be fixed.

         Arnd
Petr Vorel Sept. 21, 2021, 1:34 p.m. UTC | #2
Hi Richie,

Reviewed-by: Petr Vorel <pvorel@suse.cz>

Thanks!

Kind regards,
Petr
Petr Vorel Sept. 21, 2021, 1:40 p.m. UTC | #3
Hi,

I also wondered, if it should be added for other archs like the other compat
functions.

Kind regards,
Petr
Richard Palethorpe Sept. 21, 2021, 2:05 p.m. UTC | #4
Hello Arnd,

Arnd Bergmann <arnd@arndb.de> writes:

> On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe <rpalethorpe@suse.com> wrote:
>>
>> The LTP test io_pgetevents02 fails in 32bit compat mode because an
>> nr_max of -1 appears to be treated as a large positive integer. This
>> causes pgetevents_time64 to return an event. The test expects the call
>> to fail and errno to be set to EINVAL.
>>
>> Using the compat syscall fixes the issue.
>>
>> Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec")
>> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
>
> Thanks a lot for finding this, indeed there is definitely a mistake that
> this function is defined and not used, but I don't yet see how it would
> get to the specific failure you report.
>
> Between the two implementations, I can see a difference in the
> handling of the signal mask, but that should only affect architectures
> with incompatible compat_sigset_t, i.e. big-endian or
> _COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is
> never true for currently supported architectures. On x86, there is
> no difference in the sigset at all.
>
> The negative 'nr' and 'min_nr' arguments that you list as causing
> the problem /should/ be converted by the magic
> SYSCALL_DEFINE6() definition. If this is currently broken, I would
> expect other syscalls to be affected as well.

That is what I thought, but I couldn't think of another explanation for
it.

>
> Have you tried reproducing this on non-x86 architectures? If I
> misremembered how the compat conversion in SYSCALL_DEFINE6()
> works, then all architectures that support CONFIG_COMPAT have
> to be fixed.
>
>          Arnd

No, but I suppose I can try it on ARM or PowerPC. I suppose printing the
arguments would be a good idea too.
Richard Palethorpe Sept. 21, 2021, 3:44 p.m. UTC | #5
Richard Palethorpe <rpalethorpe@suse.de> writes:

> Hello Arnd,
>
> Arnd Bergmann <arnd@arndb.de> writes:
>
>> On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe <rpalethorpe@suse.com> wrote:
>>>
>>> The LTP test io_pgetevents02 fails in 32bit compat mode because an
>>> nr_max of -1 appears to be treated as a large positive integer. This
>>> causes pgetevents_time64 to return an event. The test expects the call
>>> to fail and errno to be set to EINVAL.
>>>
>>> Using the compat syscall fixes the issue.
>>>
>>> Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec")
>>> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
>>
>> Thanks a lot for finding this, indeed there is definitely a mistake that
>> this function is defined and not used, but I don't yet see how it would
>> get to the specific failure you report.
>>
>> Between the two implementations, I can see a difference in the
>> handling of the signal mask, but that should only affect architectures
>> with incompatible compat_sigset_t, i.e. big-endian or
>> _COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is
>> never true for currently supported architectures. On x86, there is
>> no difference in the sigset at all.
>>
>> The negative 'nr' and 'min_nr' arguments that you list as causing
>> the problem /should/ be converted by the magic
>> SYSCALL_DEFINE6() definition. If this is currently broken, I would
>> expect other syscalls to be affected as well.
>
> That is what I thought, but I couldn't think of another explanation for
> it.
>
>>
>> Have you tried reproducing this on non-x86 architectures? If I
>> misremembered how the compat conversion in SYSCALL_DEFINE6()
>> works, then all architectures that support CONFIG_COMPAT have
>> to be fixed.
>>
>>          Arnd
>
> No, but I suppose I can try it on ARM or PowerPC. I suppose printing the
> arguments would be a good idea too.

It appears it really is failing to sign extend the s32 to s64. I added
the following printks

modified   fs/aio.c
@@ -2054,6 +2054,7 @@ static long do_io_getevents(aio_context_t ctx_id,
 	long ret = -EINVAL;
 
 	if (likely(ioctx)) {
+		printk("comparing %ld <= %ld\n", min_nr, nr);
 		if (likely(min_nr <= nr && min_nr >= 0))
 			ret = read_events(ioctx, min_nr, nr, events, until);
 		percpu_ref_put(&ioctx->users);
@@ -2114,6 +2115,8 @@ SYSCALL_DEFINE6(io_pgetevents,
 	bool interrupted;
 	int ret;
 
+	printk("io_pgetevents(%lx, %ld, %ld, ...)\n", ctx_id, min_nr, nr);
+
 	if (timeout && unlikely(get_timespec64(&ts, timeout)))
 		return -EFAULT;

Then the output is:

[   11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...)
[   11.252401] comparing 4294967295 <= 1
io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22)
[   11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...)
[   11.252748] comparing 1 <= 4294967295
io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly
Richard Palethorpe Sept. 22, 2021, 8:45 a.m. UTC | #6
Richard Palethorpe <rpalethorpe@suse.de> writes:

> Richard Palethorpe <rpalethorpe@suse.de> writes:
>
>> Hello Arnd,
>>
>> Arnd Bergmann <arnd@arndb.de> writes:
>>
>>> On Tue, Sep 21, 2021 at 3:01 PM Richard Palethorpe <rpalethorpe@suse.com> wrote:
>>>>
>>>> The LTP test io_pgetevents02 fails in 32bit compat mode because an
>>>> nr_max of -1 appears to be treated as a large positive integer. This
>>>> causes pgetevents_time64 to return an event. The test expects the call
>>>> to fail and errno to be set to EINVAL.
>>>>
>>>> Using the compat syscall fixes the issue.
>>>>
>>>> Fixes: 7a35397f8c06 ("io_pgetevents: use __kernel_timespec")
>>>> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
>>>
>>> Thanks a lot for finding this, indeed there is definitely a mistake that
>>> this function is defined and not used, but I don't yet see how it would
>>> get to the specific failure you report.
>>>
>>> Between the two implementations, I can see a difference in the
>>> handling of the signal mask, but that should only affect architectures
>>> with incompatible compat_sigset_t, i.e. big-endian or
>>> _COMPAT_NSIG_WORDS!=_NSIG_WORDS, and the latter is
>>> never true for currently supported architectures. On x86, there is
>>> no difference in the sigset at all.
>>>
>>> The negative 'nr' and 'min_nr' arguments that you list as causing
>>> the problem /should/ be converted by the magic
>>> SYSCALL_DEFINE6() definition. If this is currently broken, I would
>>> expect other syscalls to be affected as well.
>>
>> That is what I thought, but I couldn't think of another explanation for
>> it.
>>
>>>
>>> Have you tried reproducing this on non-x86 architectures? If I
>>> misremembered how the compat conversion in SYSCALL_DEFINE6()
>>> works, then all architectures that support CONFIG_COMPAT have
>>> to be fixed.
>>>
>>>          Arnd
>>
>> No, but I suppose I can try it on ARM or PowerPC. I suppose printing the
>> arguments would be a good idea too.
>
> It appears it really is failing to sign extend the s32 to s64. I added
> the following printks
>
> modified   fs/aio.c
> @@ -2054,6 +2054,7 @@ static long do_io_getevents(aio_context_t ctx_id,
>  	long ret = -EINVAL;
>  
>  	if (likely(ioctx)) {
> +		printk("comparing %ld <= %ld\n", min_nr, nr);
>  		if (likely(min_nr <= nr && min_nr >= 0))
>  			ret = read_events(ioctx, min_nr, nr, events, until);
>  		percpu_ref_put(&ioctx->users);
> @@ -2114,6 +2115,8 @@ SYSCALL_DEFINE6(io_pgetevents,
>  	bool interrupted;
>  	int ret;
>  
> +	printk("io_pgetevents(%lx, %ld, %ld, ...)\n", ctx_id, min_nr, nr);
> +
>  	if (timeout && unlikely(get_timespec64(&ts, timeout)))
>  		return -EFAULT;
>
> Then the output is:
>
> [   11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...)
> [   11.252401] comparing 4294967295 <= 1
> io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22)
> [   11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...)
> [   11.252748] comparing 1 <= 4294967295
> io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly

and below is the macro expansion for the automatically generated 32bit to
64bit io_pgetevents. I believe it is casting u32 to s64, which appears
to mean there is no sign extension. I don't know if this is the expected
behaviour?

For the manually written compat version we cast back to s32 which is
what fixes the issue.

long __ia32_sys_io_pgetevents(const struct pt_regs *regs) {
  return __se_sys_io_pgetevents((unsigned int)regs->bx, (unsigned int)regs->cx,
                                (unsigned int)regs->dx, (unsigned int)regs->si,
                                (unsigned int)regs->di, (unsigned int)regs->bp);
}
static long __se_sys_io_pgetevents(
    __typeof(__builtin_choose_expr(
        (__builtin_types_compatible_p(typeof((aio_context_t)0), typeof(0LL)) ||
         __builtin_types_compatible_p(typeof((aio_context_t)0), typeof(0ULL))),
        0LL, 0L)) ctx_id,
    __typeof(__builtin_choose_expr(
        (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) ||
         __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))),
        0LL, 0L)) min_nr,
    __typeof(__builtin_choose_expr(
        (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) ||
         __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))),
        0LL, 0L)) nr,
    __typeof(__builtin_choose_expr(
        (__builtin_types_compatible_p(typeof((struct io_event *)0),
                                      typeof(0LL)) ||
         __builtin_types_compatible_p(typeof((struct io_event *)0),
                                      typeof(0ULL))),
        0LL, 0L)) events,
    __typeof(__builtin_choose_expr(
        (__builtin_types_compatible_p(typeof((struct __kernel_timespec *)0),
                                      typeof(0LL)) ||
         __builtin_types_compatible_p(typeof((struct __kernel_timespec *)0),
                                      typeof(0ULL))),
        0LL, 0L)) timeout,
    __typeof(__builtin_choose_expr(
        (__builtin_types_compatible_p(typeof((const struct __aio_sigset *)0),
                                      typeof(0LL)) ||
         __builtin_types_compatible_p(typeof((const struct __aio_sigset *)0),
                                      typeof(0ULL))),
        0LL, 0L)) usig)
{
  long ret = __do_sys_io_pgetevents(
      (aio_context_t)ctx_id, (long)min_nr, (long)nr, (struct io_event *)events,
      (struct __kernel_timespec *)timeout, (const struct __aio_sigset
  *)usig);

  ...
}
Arnd Bergmann Sept. 22, 2021, 9:35 a.m. UTC | #7
On Wed, Sep 22, 2021 at 10:46 AM Richard Palethorpe <rpalethorpe@suse.de> wrote:
> Richard Palethorpe <rpalethorpe@suse.de> writes:

> >
> > Then the output is:
> >
> > [   11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...)
> > [   11.252401] comparing 4294967295 <= 1
> > io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22)
> > [   11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...)
> > [   11.252748] comparing 1 <= 4294967295
> > io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly
>
> and below is the macro expansion for the automatically generated 32bit to
> 64bit io_pgetevents. I believe it is casting u32 to s64, which appears
> to mean there is no sign extension. I don't know if this is the expected
> behaviour?

Thank you for digging through this, I meant to already reply once more yesterday
but didn't get around to that.

>     __typeof(__builtin_choose_expr(
>         (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) ||
>          __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))),
>         0LL, 0L)) min_nr,
>     __typeof(__builtin_choose_expr(
>         (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) ||
>          __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))),
>         0LL, 0L)) nr,

The part that I remembered is in arch/s390/include/asm/syscall_wrapper.h,
which uses this version instead:

#define __SC_COMPAT_CAST(t, a)                                          \
({                                                                      \
        long __ReS = a;                                                 \
                                                                        \
        BUILD_BUG_ON((sizeof(t) > 4) && !__TYPE_IS_L(t) &&              \
                     !__TYPE_IS_UL(t) && !__TYPE_IS_PTR(t) &&           \
                     !__TYPE_IS_LL(t));                                 \
        if (__TYPE_IS_L(t))                                             \
                __ReS = (s32)a;                                         \
        if (__TYPE_IS_UL(t))                                            \
                __ReS = (u32)a;                                         \
        if (__TYPE_IS_PTR(t))                                           \
                __ReS = a & 0x7fffffff;                                 \
        if (__TYPE_IS_LL(t))                                            \
                return -ENOSYS;                                         \
        (t)__ReS;                                                       \
})

This also takes care of s390-specific pointer conversion, which is the
reason for needing an architecture-specific wrapper, but I suppose the
handling of signed arguments as done in s390 should also be done
everywhere else.

I also noticed that only x86 and s390 even have separate entry
points for normal syscalls when called in compat mode, while
the others all just zero the upper halves of the registers in the
low-level entry code and then call the native entry point.

        Arnd
Richard Palethorpe Sept. 23, 2021, 10:01 a.m. UTC | #8
Hello Arnd,

Arnd Bergmann <arnd@arndb.de> writes:

> On Wed, Sep 22, 2021 at 10:46 AM Richard Palethorpe <rpalethorpe@suse.de> wrote:
>> Richard Palethorpe <rpalethorpe@suse.de> writes:
>
>> >
>> > Then the output is:
>> >
>> > [   11.252268] io_pgetevents(f7f19000, 4294967295, 1, ...)
>> > [   11.252401] comparing 4294967295 <= 1
>> > io_pgetevents02.c:114: TPASS: invalid min_nr: io_pgetevents() failed as expected: EINVAL (22)
>> > [   11.252610] io_pgetevents(f7f19000, 1, 4294967295, ...)
>> > [   11.252748] comparing 1 <= 4294967295
>> > io_pgetevents02.c:103: TFAIL: invalid max_nr: io_pgetevents() passed unexpectedly
>>
>> and below is the macro expansion for the automatically generated 32bit to
>> 64bit io_pgetevents. I believe it is casting u32 to s64, which appears
>> to mean there is no sign extension. I don't know if this is the expected
>> behaviour?
>
> Thank you for digging through this, I meant to already reply once more yesterday
> but didn't get around to that.

Thanks, no problem. I suppose this will effect other systemcalls as
well. Which if nothing else is a pain for testing.

>
>>     __typeof(__builtin_choose_expr(
>>         (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) ||
>>          __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))),
>>         0LL, 0L)) min_nr,
>>     __typeof(__builtin_choose_expr(
>>         (__builtin_types_compatible_p(typeof((long)0), typeof(0LL)) ||
>>          __builtin_types_compatible_p(typeof((long)0), typeof(0ULL))),
>>         0LL, 0L)) nr,
>
> The part that I remembered is in arch/s390/include/asm/syscall_wrapper.h,
> which uses this version instead:
>
> #define __SC_COMPAT_CAST(t, a)                                          \
> ({                                                                      \
>         long __ReS = a;                                                 \
>                                                                         \
>         BUILD_BUG_ON((sizeof(t) > 4) && !__TYPE_IS_L(t) &&              \
>                      !__TYPE_IS_UL(t) && !__TYPE_IS_PTR(t) &&           \
>                      !__TYPE_IS_LL(t));                                 \
>         if (__TYPE_IS_L(t))                                             \
>                 __ReS = (s32)a;                                         \
>         if (__TYPE_IS_UL(t))                                            \
>                 __ReS = (u32)a;                                         \
>         if (__TYPE_IS_PTR(t))                                           \
>                 __ReS = a & 0x7fffffff;                                 \
>         if (__TYPE_IS_LL(t))                                            \
>                 return -ENOSYS;                                         \
>         (t)__ReS;                                                       \
> })
>
> This also takes care of s390-specific pointer conversion, which is the
> reason for needing an architecture-specific wrapper, but I suppose the
> handling of signed arguments as done in s390 should also be done
> everywhere else.
>
> I also noticed that only x86 and s390 even have separate entry
> points for normal syscalls when called in compat mode, while
> the others all just zero the upper halves of the registers in the
> low-level entry code and then call the native entry point.
>
>         Arnd

It looks to me like aarch64 also has something similar? At any rate, I
can try to fix it for x86 and investigate what else might be effected.
Arnd Bergmann Sept. 23, 2021, 10:25 a.m. UTC | #9
On Thu, Sep 23, 2021 at 12:01 PM Richard Palethorpe <rpalethorpe@suse.de> wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
> > On Wed, Sep 22, 2021 at 10:46 AM Richard Palethorpe <rpalethorpe@suse.de> wrote:
> >> Richard Palethorpe <rpalethorpe@suse.de> writes:
> >
> > I also noticed that only x86 and s390 even have separate entry
> > points for normal syscalls when called in compat mode, while
> > the others all just zero the upper halves of the registers in the
> > low-level entry code and then call the native entry point.
>
> It looks to me like aarch64 also has something similar? At any rate, I
> can try to fix it for x86 and investigate what else might be effected.

arm64 also has a custom asm/syscall_wrapper.h, but it only does
this for accessing pt_regs (as x86 does), not for doing any
argument conversion. x86 does the 32-to-64 widening in the
wrapper, arm64 relies on the pt_regs already having the upper
halves zeroed.

        Arnd
diff mbox series

Patch

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 960a021d543e..0985d8333368 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -420,7 +420,7 @@ 
 412	i386	utimensat_time64	sys_utimensat
 413	i386	pselect6_time64		sys_pselect6			compat_sys_pselect6_time64
 414	i386	ppoll_time64		sys_ppoll			compat_sys_ppoll_time64
-416	i386	io_pgetevents_time64	sys_io_pgetevents
+416	i386	io_pgetevents_time64	sys_io_pgetevents		compat_sys_io_pgetevents_time64
 417	i386	recvmmsg_time64		sys_recvmmsg			compat_sys_recvmmsg_time64
 418	i386	mq_timedsend_time64	sys_mq_timedsend
 419	i386	mq_timedreceive_time64	sys_mq_timedreceive