diff mbox series

Linux: Inhibit tail calls in cancellable system calls

Message ID 874izmtu4w.fsf@oldenburg.str.redhat.com
State New
Headers show
Series Linux: Inhibit tail calls in cancellable system calls | expand

Commit Message

Florian Weimer March 21, 2025, 1:01 p.m. UTC
Without this change, the system call wrapper function is not visible
on the stack at the time of the system call, which causes problems
for interception tools such as valgrind.

Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
Race conditions in pthread cancellation [BZ#12683]").

Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
(We're still discussing if valgrind needs this, but if it does, here's a
patch.)

---
 sysdeps/unix/sysdep.h | 57 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 21 deletions(-)


base-commit: 3e2be87832781a29ed67f38f87c1ce3dd4c1b866

Comments

Carlos O'Donell March 24, 2025, 1:37 p.m. UTC | #1
On 3/21/25 9:01 AM, Florian Weimer wrote:
> Without this change, the system call wrapper function is not visible
> on the stack at the time of the system call, which causes problems
> for interception tools such as valgrind.
> 
> Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> Race conditions in pthread cancellation [BZ#12683]").
> 
> Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> (We're still discussing if valgrind needs this, but if it does, here's a
> patch.)

We discussed this on the weekly patch queue review.

Glad to see the Valgrind discussion is ongoing.

This has cross-architecture implications, and can add extra function calls.

I'm marking the patch in patchwork as changes requested since we should review
and discuss Adhemerval's suggestion.
Mark Wielaard March 28, 2025, 6:02 p.m. UTC | #2
Hi Florian,

On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
> Without this change, the system call wrapper function is not visible
> on the stack at the time of the system call, which causes problems
> for interception tools such as valgrind.
> 
> Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> Race conditions in pthread cancellation [BZ#12683]").
> 
> Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> (We're still discussing if valgrind needs this, but if it does, here's a
> patch.)

I implemented the valgrind part of skipping the syscall_cancel frames
here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
And there is a valgrind package build for fedora rawhide:
https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393

For ppc64le, s390x and x86_64 that patch seems enough.

For i686 and aarch64 there does seem to be an issue with missing the
glibc calling function because of a tail call.

Also on i686 there is another extra frame on top __libc_do_syscall.

I haven't yet tried on a system with a glibc with this patch applied. 

Cheers,

Mark

> ---
>  sysdeps/unix/sysdep.h | 57 ++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 36 insertions(+), 21 deletions(-)
> 
> diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
> index 2cc98725c3..8f1e4fc4f9 100644
> --- a/sysdeps/unix/sysdep.h
> +++ b/sysdeps/unix/sysdep.h
> @@ -166,30 +166,45 @@ long int __syscall_cancel (__syscall_arg_t arg1, __syscall_arg_t arg2,
>  			   __SYSCALL_CANCEL7_ARG_DEF
>  			   __syscall_arg_t nr) attribute_hidden;
>  
> +/* Inhibit tail call optimization, so that the stack frame of the
> +   system-call-implementing function is visible at the time of the
> +   system call.  */
> +#define __syscall_cancel_barrier(...)					\
> +  ({									\
> +     long int __syscall_cancel_result = __syscall_cancel (__VA_ARGS__); \
> +     asm volatile ("" ::: "memory");					\
> +     __syscall_cancel_result;						\
> +  })
> +
>  #define __SYSCALL_CANCEL0(name)						\
> -  __syscall_cancel (0, 0, 0, 0, 0, 0, __SYSCALL_CANCEL7_ARG __NR_##name)
> +  __syscall_cancel_barrier (0, 0, 0, 0, 0, 0,				\
> +			    __SYSCALL_CANCEL7_ARG __NR_##name)
>  #define __SYSCALL_CANCEL1(name, a1)					\
> -  __syscall_cancel (__SSC (a1), 0, 0, 0, 0, 0,				\
> -		    __SYSCALL_CANCEL7_ARG __NR_##name)
> -#define __SYSCALL_CANCEL2(name, a1, a2) \
> -  __syscall_cancel (__SSC (a1), __SSC (a2), 0, 0, 0, 0,			\
> -		    __SYSCALL_CANCEL7_ARG __NR_##name)
> -#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
> -  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), 0, 0, 0,	\
> -		    __SYSCALL_CANCEL7_ARG __NR_##name)
> -#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
> -  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3),			\
> -		    __SSC(a4), 0, 0, __SYSCALL_CANCEL7_ARG __NR_##name)
> -#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
> -  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC(a4),	\
> -		    __SSC (a5), 0, __SYSCALL_CANCEL7_ARG __NR_##name)
> -#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
> -  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC (a4),	\
> -		    __SSC (a5), __SSC (a6), __SYSCALL_CANCEL7_ARG	\
> -		    __NR_##name)
> +  __syscall_cancel_barrier (__SSC (a1), 0, 0, 0, 0, 0,			\
> +			    __SYSCALL_CANCEL7_ARG __NR_##name)
> +#define __SYSCALL_CANCEL2(name, a1, a2)					\
> +  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), 0, 0, 0, 0,		\
> +			    __SYSCALL_CANCEL7_ARG __NR_##name)
> +#define __SYSCALL_CANCEL3(name, a1, a2, a3)				\
> +  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
> +			    0, 0, 0, __SYSCALL_CANCEL7_ARG __NR_##name)
> +#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4)				\
> +  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
> +			    __SSC(a4), 0, 0,				\
> +			    __SYSCALL_CANCEL7_ARG __NR_##name)
> +#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5)			\
> +  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
> +			    __SSC(a4), __SSC (a5), 0,			\
> +			    __SYSCALL_CANCEL7_ARG __NR_##name)
> +#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6)			\
> +  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
> +			    __SSC (a4), __SSC (a5), __SSC (a6),		\
> +			    __SYSCALL_CANCEL7_ARG			\
> +			    __NR_##name)
>  #define __SYSCALL_CANCEL7(name, a1, a2, a3, a4, a5, a6, a7)		\
> -  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC (a4),	\
> -		    __SSC (a5), __SSC (a6), __SSC (a7), __NR_##name)
> +  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
> +			    __SSC (a4), __SSC (a5), __SSC (a6),		\
> +			    __SSC (a7), __NR_##name)
>  
>  #define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
>  #define __SYSCALL_CANCEL_NARGS(...) \
> 
> base-commit: 3e2be87832781a29ed67f38f87c1ce3dd4c1b866
>
Mark Wielaard March 31, 2025, 9:29 a.m. UTC | #3
Hi,

On Fri, Mar 28, 2025 at 07:02:28PM +0100, Mark Wielaard wrote:
> On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
> > Without this change, the system call wrapper function is not visible
> > on the stack at the time of the system call, which causes problems
> > for interception tools such as valgrind.
> > 
> > Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> > Race conditions in pthread cancellation [BZ#12683]").
> > 
> > Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> > (We're still discussing if valgrind needs this, but if it does, here's a
> > patch.)
> 
> I implemented the valgrind part of skipping the syscall_cancel frames
> here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
> And there is a valgrind package build for fedora rawhide:
> https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393
> 
> For ppc64le, s390x and x86_64 that patch seems enough.
> 
> For i686 and aarch64 there does seem to be an issue with missing the
> glibc calling function because of a tail call.
> 
> Also on i686 there is another extra frame on top __libc_do_syscall.

I extended the patch to cover some extra sycall wrapper function
symbols on i386 and armhf and pushed it to valgrind trunk and
VALGRIND_3_24_BRANCH. There are builds for fedora rawhide and
f42. This does seem to show that only on arm64 the tail calls
obscure observing the full call stack.

Cheers,

Mark
Mark Wielaard April 6, 2025, 1:23 p.m. UTC | #4
Hi,

On Mon, Mar 31, 2025 at 11:29:41AM +0200, Mark Wielaard wrote:
> On Fri, Mar 28, 2025 at 07:02:28PM +0100, Mark Wielaard wrote:
> > On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
> > > Without this change, the system call wrapper function is not visible
> > > on the stack at the time of the system call, which causes problems
> > > for interception tools such as valgrind.
> > > 
> > > Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> > > Race conditions in pthread cancellation [BZ#12683]").
> > > 
> > > Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> > > (We're still discussing if valgrind needs this, but if it does, here's a
> > > patch.)
> > 
> > I implemented the valgrind part of skipping the syscall_cancel frames
> > here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
> > And there is a valgrind package build for fedora rawhide:
> > https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393
> > 
> > For ppc64le, s390x and x86_64 that patch seems enough.
> > 
> > For i686 and aarch64 there does seem to be an issue with missing the
> > glibc calling function because of a tail call.
> > 
> > Also on i686 there is another extra frame on top __libc_do_syscall.
> 
> I extended the patch to cover some extra sycall wrapper function
> symbols on i386 and armhf and pushed it to valgrind trunk and
> VALGRIND_3_24_BRANCH. There are builds for fedora rawhide and
> f42. This does seem to show that only on arm64 the tail calls
> obscure observing the full call stack.

This has now landed in fedora rawhide and f42. Test results look good,
except for some if the arm64 tests where the tail calls obscure
observing the full call stack. Please let me know if you need any more
input from us to get this fix in glibc.

Cheers,

Mark
Mark Wielaard April 14, 2025, 12:06 p.m. UTC | #5
Hi,

On Sun, Apr 06, 2025 at 03:23:54PM +0200, Mark Wielaard wrote:
> On Mon, Mar 31, 2025 at 11:29:41AM +0200, Mark Wielaard wrote:
> > On Fri, Mar 28, 2025 at 07:02:28PM +0100, Mark Wielaard wrote:
> > > On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
> > > > Without this change, the system call wrapper function is not visible
> > > > on the stack at the time of the system call, which causes problems
> > > > for interception tools such as valgrind.
> > > > 
> > > > Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> > > > Race conditions in pthread cancellation [BZ#12683]").
> > > > 
> > > > Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> > > > (We're still discussing if valgrind needs this, but if it does, here's a
> > > > patch.)
> > > 
> > > I implemented the valgrind part of skipping the syscall_cancel frames
> > > here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
> > > And there is a valgrind package build for fedora rawhide:
> > > https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393
> > > 
> > > For ppc64le, s390x and x86_64 that patch seems enough.
> > > 
> > > For i686 and aarch64 there does seem to be an issue with missing the
> > > glibc calling function because of a tail call.
> > > 
> > > Also on i686 there is another extra frame on top __libc_do_syscall.
> > 
> > I extended the patch to cover some extra sycall wrapper function
> > symbols on i386 and armhf and pushed it to valgrind trunk and
> > VALGRIND_3_24_BRANCH. There are builds for fedora rawhide and
> > f42. This does seem to show that only on arm64 the tail calls
> > obscure observing the full call stack.
> 
> This has now landed in fedora rawhide and f42. Test results look good,
> except for some if the arm64 tests where the tail calls obscure
> observing the full call stack. Please let me know if you need any more
> input from us to get this fix in glibc.

Please let me know. Valgrind test results for syscall backtraces on
anything except arm64 look good. We are working on valgrind 3.25.0
now, to be released around April 24.

Thanks,

Mark
Mark Wielaard April 18, 2025, 3:46 p.m. UTC | #6
Hi all,

On Mon, 2025-04-14 at 14:06 +0200, Mark Wielaard wrote:
> On Sun, Apr 06, 2025 at 03:23:54PM +0200, Mark Wielaard wrote:
> > On Mon, Mar 31, 2025 at 11:29:41AM +0200, Mark Wielaard wrote:
> > > On Fri, Mar 28, 2025 at 07:02:28PM +0100, Mark Wielaard wrote:
> > > > On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
> > > > > Without this change, the system call wrapper function is not visible
> > > > > on the stack at the time of the system call, which causes problems
> > > > > for interception tools such as valgrind.
> > > > > 
> > > > > Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> > > > > Race conditions in pthread cancellation [BZ#12683]").
> > > > > 
> > > > > Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> > > > > (We're still discussing if valgrind needs this, but if it does, here's a
> > > > > patch.)
> > > > 
> > > > I implemented the valgrind part of skipping the syscall_cancel frames
> > > > here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
> > > > And there is a valgrind package build for fedora rawhide:
> > > > https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393
> > > > 
> > > > For ppc64le, s390x and x86_64 that patch seems enough.
> > > > 
> > > > For i686 and aarch64 there does seem to be an issue with missing the
> > > > glibc calling function because of a tail call.
> > > > 
> > > > Also on i686 there is another extra frame on top __libc_do_syscall.
> > > 
> > > I extended the patch to cover some extra sycall wrapper function
> > > symbols on i386 and armhf and pushed it to valgrind trunk and
> > > VALGRIND_3_24_BRANCH. There are builds for fedora rawhide and
> > > f42. This does seem to show that only on arm64 the tail calls
> > > obscure observing the full call stack.
> > 
> > This has now landed in fedora rawhide and f42. Test results look good,
> > except for some if the arm64 tests where the tail calls obscure
> > observing the full call stack. Please let me know if you need any more
> > input from us to get this fix in glibc.
> 
> Please let me know. Valgrind test results for syscall backtraces on
> anything except arm64 look good. We are working on valgrind 3.25.0
> now, to be released around April 24.

valgrind 3.25.0-RC1 has been released and test results look good on
most arches. arm64 does show the issue described above where the tail
calls obscure observing the full call stack when doing system calls.

Let me know what would be needed to get the above patch reviewed.

Thanks,

Mark
Mark Wielaard April 25, 2025, 10:06 p.m. UTC | #7
Hi all,

On Fri, Apr 18, 2025 at 05:46:54PM +0200, Mark Wielaard wrote:
> On Mon, 2025-04-14 at 14:06 +0200, Mark Wielaard wrote:
> > On Sun, Apr 06, 2025 at 03:23:54PM +0200, Mark Wielaard wrote:
> > > On Mon, Mar 31, 2025 at 11:29:41AM +0200, Mark Wielaard wrote:
> > > > On Fri, Mar 28, 2025 at 07:02:28PM +0100, Mark Wielaard wrote:
> > > > > On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
> > > > > > Without this change, the system call wrapper function is not visible
> > > > > > on the stack at the time of the system call, which causes problems
> > > > > > for interception tools such as valgrind.
> > > > > > 
> > > > > > Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
> > > > > > Race conditions in pthread cancellation [BZ#12683]").
> > > > > > 
> > > > > > Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
> > > > > > (We're still discussing if valgrind needs this, but if it does, here's a
> > > > > > patch.)
> > > > > 
> > > > > I implemented the valgrind part of skipping the syscall_cancel frames
> > > > > here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
> > > > > And there is a valgrind package build for fedora rawhide:
> > > > > https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393
> > > > > 
> > > > > For ppc64le, s390x and x86_64 that patch seems enough.
> > > > > 
> > > > > For i686 and aarch64 there does seem to be an issue with missing the
> > > > > glibc calling function because of a tail call.
> > > > > 
> > > > > Also on i686 there is another extra frame on top __libc_do_syscall.
> > > > 
> > > > I extended the patch to cover some extra sycall wrapper function
> > > > symbols on i386 and armhf and pushed it to valgrind trunk and
> > > > VALGRIND_3_24_BRANCH. There are builds for fedora rawhide and
> > > > f42. This does seem to show that only on arm64 the tail calls
> > > > obscure observing the full call stack.
> > > 
> > > This has now landed in fedora rawhide and f42. Test results look good,
> > > except for some if the arm64 tests where the tail calls obscure
> > > observing the full call stack. Please let me know if you need any more
> > > input from us to get this fix in glibc.
> > 
> > Please let me know. Valgrind test results for syscall backtraces on
> > anything except arm64 look good. We are working on valgrind 3.25.0
> > now, to be released around April 24.
> 
> valgrind 3.25.0-RC1 has been released and test results look good on
> most arches. arm64 does show the issue described above where the tail
> calls obscure observing the full call stack when doing system calls.

valgrind 3.25.0 have been released and is now in Fedora rawhide and
Fedora 42 with the new glibc syscall_cancel frames. The tail calls on
aarch64 still seem to be a problem for observability of the syscall
call stack.

> Let me know what would be needed to get the above patch reviewed.

Thanks,

Mark
Adhemerval Zanella Netto May 9, 2025, 2:08 p.m. UTC | #8
On 25/04/25 19:06, Mark Wielaard wrote:
> Hi all,
> 
> On Fri, Apr 18, 2025 at 05:46:54PM +0200, Mark Wielaard wrote:
>> On Mon, 2025-04-14 at 14:06 +0200, Mark Wielaard wrote:
>>> On Sun, Apr 06, 2025 at 03:23:54PM +0200, Mark Wielaard wrote:
>>>> On Mon, Mar 31, 2025 at 11:29:41AM +0200, Mark Wielaard wrote:
>>>>> On Fri, Mar 28, 2025 at 07:02:28PM +0100, Mark Wielaard wrote:
>>>>>> On Fri, 2025-03-21 at 14:01 +0100, Florian Weimer wrote:
>>>>>>> Without this change, the system call wrapper function is not visible
>>>>>>> on the stack at the time of the system call, which causes problems
>>>>>>> for interception tools such as valgrind.
>>>>>>>
>>>>>>> Enhances commit 89b53077d2a58f00e7debdfe58afabe953dac60d ("nptl: Fix
>>>>>>> Race conditions in pthread cancellation [BZ#12683]").
>>>>>>>
>>>>>>> Tested on i686-linux-gnu, powerpc64le-linux-gnu, x86_64-linux-gnu.
>>>>>>> (We're still discussing if valgrind needs this, but if it does, here's a
>>>>>>> patch.)
>>>>>>
>>>>>> I implemented the valgrind part of skipping the syscall_cancel frames
>>>>>> here: https://bugs.kde.org/show_bug.cgi?id=502126#c2
>>>>>> And there is a valgrind package build for fedora rawhide:
>>>>>> https://koji.fedoraproject.org/koji/buildinfo?buildID=2687393
>>>>>>
>>>>>> For ppc64le, s390x and x86_64 that patch seems enough.
>>>>>>
>>>>>> For i686 and aarch64 there does seem to be an issue with missing the
>>>>>> glibc calling function because of a tail call.
>>>>>>
>>>>>> Also on i686 there is another extra frame on top __libc_do_syscall.
>>>>>
>>>>> I extended the patch to cover some extra sycall wrapper function
>>>>> symbols on i386 and armhf and pushed it to valgrind trunk and
>>>>> VALGRIND_3_24_BRANCH. There are builds for fedora rawhide and
>>>>> f42. This does seem to show that only on arm64 the tail calls
>>>>> obscure observing the full call stack.
>>>>
>>>> This has now landed in fedora rawhide and f42. Test results look good,
>>>> except for some if the arm64 tests where the tail calls obscure
>>>> observing the full call stack. Please let me know if you need any more
>>>> input from us to get this fix in glibc.
>>>
>>> Please let me know. Valgrind test results for syscall backtraces on
>>> anything except arm64 look good. We are working on valgrind 3.25.0
>>> now, to be released around April 24.
>>
>> valgrind 3.25.0-RC1 has been released and test results look good on
>> most arches. arm64 does show the issue described above where the tail
>> calls obscure observing the full call stack when doing system calls.
> 
> valgrind 3.25.0 have been released and is now in Fedora rawhide and
> Fedora 42 with the new glibc syscall_cancel frames. The tail calls on
> aarch64 still seem to be a problem for observability of the syscall
> call stack.
> 

I am trying to check if patch to inline the cancellation wrappers [1]
would help it, but I am not sure how exactly would handle stacktraces
that should be artificial and only represented for debug information.

With the patch applied, both x86_64 and aarch64 should inline the
syscall_cancel and internal_syscall_cancel call, only required an
extra __syscall_cancel_arch call for the case when the process it
multithread.

On x86_64 it does shows as expected:

valgrind-git (x86_64)$ ./coregrind/valgrind memcheck/tests/sendmsg
[...]
==2131145== Syscall param sendmsg(msg) points to uninitialised byte(s)
==2131145==    at 0x4972EE0: syscall_cancel (sysdep-cancel.h:83)
==2131145==    by 0x4972EE0: sendmsg (sendmsg.c:28)
==2131145==    by 0x4001332: main (sendmsg.c:46)
==2131145==  Address 0x1ffefff850 is on thread 1's stack
==2131145==  in frame #1, created by main (sendmsg.c:13)
[...]

But on aarch64 it shows internal_syscall_cancel, which is indeed
inlined:

valgrind-git (aarch64)$ ./coregrind/valgrind memcheck/tests/sendmsg
[...]
==483437== Syscall param sendmsg(msg) points to uninitialised byte(s)
==483437==    at 0x49D250C: internal_syscall_cancel (sysdep-cancel.h:44)
==483437==    by 0x49D250C: syscall_cancel (sysdep-cancel.h:79)
==483437==    by 0x49D250C: sendmsg (sendmsg.c:28)
==483437==    by 0x4000B4B: main (sendmsg.c:46)
==483437==  Address 0x1ffefffaf8 is on thread 1's stack
==483437==  in frame #1, created by main (sendmsg.c:13)

I am not sure if valgrind consider this an error, nor if it should be
valgrind or compiler to handle this correctly.  I am not aware of any
attribute if can properly used to 'hide' internal_syscall_cancel in
this case, or even if it makes sense.

[1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/cancel-wrappers-inline
Mark Wielaard May 9, 2025, 2:24 p.m. UTC | #9
Hi Adhemerval,

On Fri, 2025-05-09 at 11:08 -0300, Adhemerval Zanella Netto wrote:
> On 25/04/25 19:06, Mark Wielaard wrote:
> > valgrind 3.25.0 have been released and is now in Fedora rawhide and
> > Fedora 42 with the new glibc syscall_cancel frames. The tail calls on
> > aarch64 still seem to be a problem for observability of the syscall
> > call stack.
> 
> I am trying to check if patch to inline the cancellation wrappers [1]
> would help it, but I am not sure how exactly would handle stacktraces
> that should be artificial and only represented for debug information.
> 
> With the patch applied, both x86_64 and aarch64 should inline the
> syscall_cancel and internal_syscall_cancel call, only required an
> extra __syscall_cancel_arch call for the case when the process it
> multithread.
> 
> On x86_64 it does shows as expected:
> 
> valgrind-git (x86_64)$ ./coregrind/valgrind memcheck/tests/sendmsg
> [...]
> ==2131145== Syscall param sendmsg(msg) points to uninitialised byte(s)
> ==2131145==    at 0x4972EE0: syscall_cancel (sysdep-cancel.h:83)
> ==2131145==    by 0x4972EE0: sendmsg (sendmsg.c:28)
> ==2131145==    by 0x4001332: main (sendmsg.c:46)
> ==2131145==  Address 0x1ffefff850 is on thread 1's stack
> ==2131145==  in frame #1, created by main (sendmsg.c:13)
> [...]
> 
> But on aarch64 it shows internal_syscall_cancel, which is indeed
> inlined:
> 
> valgrind-git (aarch64)$ ./coregrind/valgrind memcheck/tests/sendmsg
> [...]
> ==483437== Syscall param sendmsg(msg) points to uninitialised byte(s)
> ==483437==    at 0x49D250C: internal_syscall_cancel (sysdep-cancel.h:44)
> ==483437==    by 0x49D250C: syscall_cancel (sysdep-cancel.h:79)
> ==483437==    by 0x49D250C: sendmsg (sendmsg.c:28)
> ==483437==    by 0x4000B4B: main (sendmsg.c:46)
> ==483437==  Address 0x1ffefffaf8 is on thread 1's stack
> ==483437==  in frame #1, created by main (sendmsg.c:13)
> 
> I am not sure if valgrind consider this an error, nor if it should be
> valgrind or compiler to handle this correctly.  I am not aware of any
> attribute if can properly used to 'hide' internal_syscall_cancel in
> this case, or even if it makes sense.

Interesting. I would have assumed valgrind would have stripped out the
syscall_cancel calls. But it looks like we assume they always have a __
prefix (__syscall_cancel_arch, __internal_syscall_cancel and
__syscall_cancel). Apparently with glibc debuginfo installed it picks
up the inlined DWARF name. But we can filter those out too.

I think the above would work without glibc debuginfo installed because
then valgrind won't even know/see the inlined names.

Just to be sure, you are using the current git version of valgrind?
And you haven't explicitly installed that version? If so, could you try
the above with:

 ./vg-in-place memcheck/tests/sendmsg

vg-in-place makes sure the just compiled tools are used.

Thanks,

Mark
Adhemerval Zanella Netto May 9, 2025, 6:02 p.m. UTC | #10
On 09/05/25 11:24, Mark Wielaard wrote:
> Hi Adhemerval,
> 
> On Fri, 2025-05-09 at 11:08 -0300, Adhemerval Zanella Netto wrote:
>> On 25/04/25 19:06, Mark Wielaard wrote:
>>> valgrind 3.25.0 have been released and is now in Fedora rawhide and
>>> Fedora 42 with the new glibc syscall_cancel frames. The tail calls on
>>> aarch64 still seem to be a problem for observability of the syscall
>>> call stack.
>>
>> I am trying to check if patch to inline the cancellation wrappers [1]
>> would help it, but I am not sure how exactly would handle stacktraces
>> that should be artificial and only represented for debug information.
>>
>> With the patch applied, both x86_64 and aarch64 should inline the
>> syscall_cancel and internal_syscall_cancel call, only required an
>> extra __syscall_cancel_arch call for the case when the process it
>> multithread.
>>
>> On x86_64 it does shows as expected:
>>
>> valgrind-git (x86_64)$ ./coregrind/valgrind memcheck/tests/sendmsg
>> [...]
>> ==2131145== Syscall param sendmsg(msg) points to uninitialised byte(s)
>> ==2131145==    at 0x4972EE0: syscall_cancel (sysdep-cancel.h:83)
>> ==2131145==    by 0x4972EE0: sendmsg (sendmsg.c:28)
>> ==2131145==    by 0x4001332: main (sendmsg.c:46)
>> ==2131145==  Address 0x1ffefff850 is on thread 1's stack
>> ==2131145==  in frame #1, created by main (sendmsg.c:13)
>> [...]
>>
>> But on aarch64 it shows internal_syscall_cancel, which is indeed
>> inlined:
>>
>> valgrind-git (aarch64)$ ./coregrind/valgrind memcheck/tests/sendmsg
>> [...]
>> ==483437== Syscall param sendmsg(msg) points to uninitialised byte(s)
>> ==483437==    at 0x49D250C: internal_syscall_cancel (sysdep-cancel.h:44)
>> ==483437==    by 0x49D250C: syscall_cancel (sysdep-cancel.h:79)
>> ==483437==    by 0x49D250C: sendmsg (sendmsg.c:28)
>> ==483437==    by 0x4000B4B: main (sendmsg.c:46)
>> ==483437==  Address 0x1ffefffaf8 is on thread 1's stack
>> ==483437==  in frame #1, created by main (sendmsg.c:13)
>>
>> I am not sure if valgrind consider this an error, nor if it should be
>> valgrind or compiler to handle this correctly.  I am not aware of any
>> attribute if can properly used to 'hide' internal_syscall_cancel in
>> this case, or even if it makes sense.
> 
> Interesting. I would have assumed valgrind would have stripped out the
> syscall_cancel calls. But it looks like we assume they always have a __
> prefix (__syscall_cancel_arch, __internal_syscall_cancel and
> __syscall_cancel). Apparently with glibc debuginfo installed it picks
> up the inlined DWARF name. But we can filter those out too.
> 
> I think the above would work without glibc debuginfo installed because
> then valgrind won't even know/see the inlined names.

I have tested only with a unstriped glibc build, so I am not sure how 
it would play with debuginfo.

> 
> Just to be sure, you are using the current git version of valgrind?
> And you haven't explicitly installed that version? If so, could you try
> the above with:
> 
>  ./vg-in-place memcheck/tests/sendmsg
> 
> vg-in-place makes sure the just compiled tools are used.

I took care to export VALGRIND_LIB to the proper location, but the results
with vg-in-place are the same.

> 
> Thanks,
> 
> Mark
Mark Wielaard May 12, 2025, 11:12 a.m. UTC | #11
Hi Adhemerval,

On Fri, May 09, 2025 at 11:08:42AM -0300, Adhemerval Zanella Netto wrote:
> On 25/04/25 19:06, Mark Wielaard wrote:
> > valgrind 3.25.0 have been released and is now in Fedora rawhide and
> > Fedora 42 with the new glibc syscall_cancel frames. The tail calls on
> > aarch64 still seem to be a problem for observability of the syscall
> > call stack.
> 
> I am trying to check if patch to inline the cancellation wrappers [1]
> would help it, but I am not sure how exactly would handle stacktraces
> that should be artificial and only represented for debug information.

I got that patch [1] working so I could test it myself (only on x86_64
for now). How do you properly test a glibc installed in a non-default
location though? Currently I am doing something like:

export LD_LIBRARY_PATH=/usr/local/glibc/lib
/usr/local/glibc/lib/ld-linux-x86-64.so.2 memcheck/tests/sendmsg

This can also work with valgrind in between because it doesn't use
glibc dynamically itself. But it is not the easiest way to test
things.

> With the patch applied, both x86_64 and aarch64 should inline the
> syscall_cancel and internal_syscall_cancel call, only required an
> extra __syscall_cancel_arch call for the case when the process it
> multithread.
> 
> On x86_64 it does shows as expected:
> 
> valgrind-git (x86_64)$ ./coregrind/valgrind memcheck/tests/sendmsg
> [...]
> ==2131145== Syscall param sendmsg(msg) points to uninitialised byte(s)
> ==2131145==    at 0x4972EE0: syscall_cancel (sysdep-cancel.h:83)
> ==2131145==    by 0x4972EE0: sendmsg (sendmsg.c:28)
> ==2131145==    by 0x4001332: main (sendmsg.c:46)
> ==2131145==  Address 0x1ffefff850 is on thread 1's stack
> ==2131145==  in frame #1, created by main (sendmsg.c:13)
> [...]
> 
> But on aarch64 it shows internal_syscall_cancel, which is indeed
> inlined:
> 
> valgrind-git (aarch64)$ ./coregrind/valgrind memcheck/tests/sendmsg
> [...]
> ==483437== Syscall param sendmsg(msg) points to uninitialised byte(s)
> ==483437==    at 0x49D250C: internal_syscall_cancel (sysdep-cancel.h:44)
> ==483437==    by 0x49D250C: syscall_cancel (sysdep-cancel.h:79)
> ==483437==    by 0x49D250C: sendmsg (sendmsg.c:28)
> ==483437==    by 0x4000B4B: main (sendmsg.c:46)
> ==483437==  Address 0x1ffefffaf8 is on thread 1's stack
> ==483437==  in frame #1, created by main (sendmsg.c:13)
> 
> I am not sure if valgrind consider this an error, nor if it should be
> valgrind or compiler to handle this correctly.  I am not aware of any
> attribute if can properly used to 'hide' internal_syscall_cancel in
> this case, or even if it makes sense.

It isn't an "error" but any extra frames (either "real" or inlined) on
top of the function that does the actual system call causes existing
suppressions and test wrappers to no longer work. So valgrind is
responsible for filtering them out. For actual extra frames we now do
(since 3.25.0). If the extra syscall frame wrappers are inlined they
don't show up if no debuginfo/DWARF is available for glibc. But if it
is available we'll have to filter them out and/or don't look them up
for the top-level syscalls.

Selectively removing these inlined calls is a little fragile at the
moment in valgrind, since it is done at a later time than capturing
the backtrace addresses. So either we have to pass through that we are
handling a syscall at the moment and so don't want to "expand" the
top-level frame with any inlines. And/Or we match on the "magic"
inlined syscall cancel frames (everywhere).

You could use __attribute__(__artificial__)) which should mark the
function with DW_AT_artificial. Valgrind doesn't know about this
attribute yet, but we could probably with some extra work.

Thanks,

Mark

> [1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/cancel-wrappers-inline
Adhemerval Zanella Netto May 13, 2025, 8:22 a.m. UTC | #12
On 12/05/25 12:12, Mark Wielaard wrote:
> Hi Adhemerval,
> 
> On Fri, May 09, 2025 at 11:08:42AM -0300, Adhemerval Zanella Netto wrote:
>> On 25/04/25 19:06, Mark Wielaard wrote:
>>> valgrind 3.25.0 have been released and is now in Fedora rawhide and
>>> Fedora 42 with the new glibc syscall_cancel frames. The tail calls on
>>> aarch64 still seem to be a problem for observability of the syscall
>>> call stack.
>>
>> I am trying to check if patch to inline the cancellation wrappers [1]
>> would help it, but I am not sure how exactly would handle stacktraces
>> that should be artificial and only represented for debug information.
> 
> I got that patch [1] working so I could test it myself (only on x86_64
> for now). How do you properly test a glibc installed in a non-default
> location though? Currently I am doing something like:
> 
> export LD_LIBRARY_PATH=/usr/local/glibc/lib
> /usr/local/glibc/lib/ld-linux-x86-64.so.2 memcheck/tests/sendmsg
> 
> This can also work with valgrind in between because it doesn't use
> glibc dynamically itself. But it is not the easiest way to test
> things.

I used -Wl,-dynamic-linker and -Wl,-rpath instead at configure:

  --enable-only64bit \
  LDFLAGS="-Wl,--dynamic-linker,/path/to/glibc-build/testroot.pristine/lib/ld-linux-aarch64.so.1 \
           -Wl,--rpath,/path/to/glibc-build/testroot.pristine/lib64"

It won't use glibc startup object (*ct*.o), neither the headers; but
using a recent systems it should not be a problem (the difference
should be minimal).


> 
>> With the patch applied, both x86_64 and aarch64 should inline the
>> syscall_cancel and internal_syscall_cancel call, only required an
>> extra __syscall_cancel_arch call for the case when the process it
>> multithread.
>>
>> On x86_64 it does shows as expected:
>>
>> valgrind-git (x86_64)$ ./coregrind/valgrind memcheck/tests/sendmsg
>> [...]
>> ==2131145== Syscall param sendmsg(msg) points to uninitialised byte(s)
>> ==2131145==    at 0x4972EE0: syscall_cancel (sysdep-cancel.h:83)
>> ==2131145==    by 0x4972EE0: sendmsg (sendmsg.c:28)
>> ==2131145==    by 0x4001332: main (sendmsg.c:46)
>> ==2131145==  Address 0x1ffefff850 is on thread 1's stack
>> ==2131145==  in frame #1, created by main (sendmsg.c:13)
>> [...]
>>
>> But on aarch64 it shows internal_syscall_cancel, which is indeed
>> inlined:
>>
>> valgrind-git (aarch64)$ ./coregrind/valgrind memcheck/tests/sendmsg
>> [...]
>> ==483437== Syscall param sendmsg(msg) points to uninitialised byte(s)
>> ==483437==    at 0x49D250C: internal_syscall_cancel (sysdep-cancel.h:44)
>> ==483437==    by 0x49D250C: syscall_cancel (sysdep-cancel.h:79)
>> ==483437==    by 0x49D250C: sendmsg (sendmsg.c:28)
>> ==483437==    by 0x4000B4B: main (sendmsg.c:46)
>> ==483437==  Address 0x1ffefffaf8 is on thread 1's stack
>> ==483437==  in frame #1, created by main (sendmsg.c:13)
>>
>> I am not sure if valgrind consider this an error, nor if it should be
>> valgrind or compiler to handle this correctly.  I am not aware of any
>> attribute if can properly used to 'hide' internal_syscall_cancel in
>> this case, or even if it makes sense.
> 
> It isn't an "error" but any extra frames (either "real" or inlined) on
> top of the function that does the actual system call causes existing
> suppressions and test wrappers to no longer work. So valgrind is
> responsible for filtering them out. For actual extra frames we now do
> (since 3.25.0). If the extra syscall frame wrappers are inlined they
> don't show up if no debuginfo/DWARF is available for glibc. But if it
> is available we'll have to filter them out and/or don't look them up
> for the top-level syscalls.
> 
> Selectively removing these inlined calls is a little fragile at the
> moment in valgrind, since it is done at a later time than capturing
> the backtrace addresses. So either we have to pass through that we are
> handling a syscall at the moment and so don't want to "expand" the
> top-level frame with any inlines. And/Or we match on the "magic"
> inlined syscall cancel frames (everywhere).

If I understood correctly these seems to be an issue only for valgrind
regression tests, right? If so, could this be handled solely on test
validation?

Or does valgrind aims to hide such inlined calls on stacktrace reports
as well?

> 
> You could use __attribute__(__artificial__)) which should mark the
> function with DW_AT_artificial. Valgrind doesn't know about this
> attribute yet, but we could probably with some extra work.

Yeah, this is the first thing I tried to see if valgrind could handle
it.  Doing this on glibc should be easy enough if you think it would
be way to handle it on valgrind.

> 
> Thanks,
> 
> Mark
> 
>> [1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/cancel-wrappers-inline
Mark Wielaard May 14, 2025, 11:02 p.m. UTC | #13
Hi Adhemerval,

On Tue, May 13, 2025 at 09:22:21AM +0100, Adhemerval Zanella Netto wrote:
> On 12/05/25 12:12, Mark Wielaard wrote:
> >> I am trying to check if patch to inline the cancellation wrappers [1]
> >> would help it, but I am not sure how exactly would handle stacktraces
> >> that should be artificial and only represented for debug information.
> > 
> > I got that patch [1] working so I could test it myself (only on x86_64
> > for now). How do you properly test a glibc installed in a non-default
> > location though? Currently I am doing something like:
> > 
> > export LD_LIBRARY_PATH=/usr/local/glibc/lib
> > /usr/local/glibc/lib/ld-linux-x86-64.so.2 memcheck/tests/sendmsg
> > 
> > This can also work with valgrind in between because it doesn't use
> > glibc dynamically itself. But it is not the easiest way to test
> > things.
> 
> I used -Wl,-dynamic-linker and -Wl,-rpath instead at configure:
> 
>   --enable-only64bit \
>   LDFLAGS="-Wl,--dynamic-linker,/path/to/glibc-build/testroot.pristine/lib/ld-linux-aarch64.so.1 \
>            -Wl,--rpath,/path/to/glibc-build/testroot.pristine/lib64"
>
> It won't use glibc startup object (*ct*.o), neither the headers; but
> using a recent systems it should not be a problem (the difference
> should be minimal).

O, thanks, that makes things much more convenient to test.

> > It isn't an "error" but any extra frames (either "real" or inlined) on
> > top of the function that does the actual system call causes existing
> > suppressions and test wrappers to no longer work. So valgrind is
> > responsible for filtering them out. For actual extra frames we now do
> > (since 3.25.0). If the extra syscall frame wrappers are inlined they
> > don't show up if no debuginfo/DWARF is available for glibc. But if it
> > is available we'll have to filter them out and/or don't look them up
> > for the top-level syscalls.
> > 
> > Selectively removing these inlined calls is a little fragile at the
> > moment in valgrind, since it is done at a later time than capturing
> > the backtrace addresses. So either we have to pass through that we are
> > handling a syscall at the moment and so don't want to "expand" the
> > top-level frame with any inlines. And/Or we match on the "magic"
> > inlined syscall cancel frames (everywhere).
> 
> If I understood correctly these seems to be an issue only for valgrind
> regression tests, right? If so, could this be handled solely on test
> validation?
> 
> Or does valgrind aims to hide such inlined calls on stacktrace reports
> as well?

We do aim to hide these in general since it invalidates any existing
suppressions people might be using. In theory we can of course not
guarantee these work between glibc versions, but for most glibc
syscall wrappers they are surprisingly stable. It also allows people
to generate new suppressions that should work across versions.

> > You could use __attribute__(__artificial__)) which should mark the
> > function with DW_AT_artificial. Valgrind doesn't know about this
> > attribute yet, but we could probably with some extra work.
> 
> Yeah, this is the first thing I tried to see if valgrind could handle
> it.  Doing this on glibc should be easy enough if you think it would
> be way to handle it on valgrind.

I currently have something working based on matching the function
names, but checking for DW_AT_artificial would be nicer. Also because
I think in general users are not interested in inlined artificial
functions in their stack traces. So it should be a general improvement
if we filter those out.

I post when I have a patch that does that (there is one tricky issue
if the abstract origin ends up in a dwz partial units, but that is
completely an internal valgrind DWARF reader issue, nothing to do with
glibc).

> >> [1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/cancel-wrappers-inline
> 

Thanks,

Mark
diff mbox series

Patch

diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
index 2cc98725c3..8f1e4fc4f9 100644
--- a/sysdeps/unix/sysdep.h
+++ b/sysdeps/unix/sysdep.h
@@ -166,30 +166,45 @@  long int __syscall_cancel (__syscall_arg_t arg1, __syscall_arg_t arg2,
 			   __SYSCALL_CANCEL7_ARG_DEF
 			   __syscall_arg_t nr) attribute_hidden;
 
+/* Inhibit tail call optimization, so that the stack frame of the
+   system-call-implementing function is visible at the time of the
+   system call.  */
+#define __syscall_cancel_barrier(...)					\
+  ({									\
+     long int __syscall_cancel_result = __syscall_cancel (__VA_ARGS__); \
+     asm volatile ("" ::: "memory");					\
+     __syscall_cancel_result;						\
+  })
+
 #define __SYSCALL_CANCEL0(name)						\
-  __syscall_cancel (0, 0, 0, 0, 0, 0, __SYSCALL_CANCEL7_ARG __NR_##name)
+  __syscall_cancel_barrier (0, 0, 0, 0, 0, 0,				\
+			    __SYSCALL_CANCEL7_ARG __NR_##name)
 #define __SYSCALL_CANCEL1(name, a1)					\
-  __syscall_cancel (__SSC (a1), 0, 0, 0, 0, 0,				\
-		    __SYSCALL_CANCEL7_ARG __NR_##name)
-#define __SYSCALL_CANCEL2(name, a1, a2) \
-  __syscall_cancel (__SSC (a1), __SSC (a2), 0, 0, 0, 0,			\
-		    __SYSCALL_CANCEL7_ARG __NR_##name)
-#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
-  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), 0, 0, 0,	\
-		    __SYSCALL_CANCEL7_ARG __NR_##name)
-#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
-  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3),			\
-		    __SSC(a4), 0, 0, __SYSCALL_CANCEL7_ARG __NR_##name)
-#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
-  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC(a4),	\
-		    __SSC (a5), 0, __SYSCALL_CANCEL7_ARG __NR_##name)
-#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
-  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC (a4),	\
-		    __SSC (a5), __SSC (a6), __SYSCALL_CANCEL7_ARG	\
-		    __NR_##name)
+  __syscall_cancel_barrier (__SSC (a1), 0, 0, 0, 0, 0,			\
+			    __SYSCALL_CANCEL7_ARG __NR_##name)
+#define __SYSCALL_CANCEL2(name, a1, a2)					\
+  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), 0, 0, 0, 0,		\
+			    __SYSCALL_CANCEL7_ARG __NR_##name)
+#define __SYSCALL_CANCEL3(name, a1, a2, a3)				\
+  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
+			    0, 0, 0, __SYSCALL_CANCEL7_ARG __NR_##name)
+#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4)				\
+  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
+			    __SSC(a4), 0, 0,				\
+			    __SYSCALL_CANCEL7_ARG __NR_##name)
+#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5)			\
+  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
+			    __SSC(a4), __SSC (a5), 0,			\
+			    __SYSCALL_CANCEL7_ARG __NR_##name)
+#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6)			\
+  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
+			    __SSC (a4), __SSC (a5), __SSC (a6),		\
+			    __SYSCALL_CANCEL7_ARG			\
+			    __NR_##name)
 #define __SYSCALL_CANCEL7(name, a1, a2, a3, a4, a5, a6, a7)		\
-  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC (a4),	\
-		    __SSC (a5), __SSC (a6), __SSC (a7), __NR_##name)
+  __syscall_cancel_barrier (__SSC (a1), __SSC (a2), __SSC (a3),		\
+			    __SSC (a4), __SSC (a5), __SSC (a6),		\
+			    __SSC (a7), __NR_##name)
 
 #define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
 #define __SYSCALL_CANCEL_NARGS(...) \