mbox series

[RFC,0/3] signal: Move si_trapno into the _si_fault union

Message ID m1zgxfs7zq.fsf_-_@fess.ebiederm.org
Headers show
Series signal: Move si_trapno into the _si_fault union | expand

Message

Eric W. Biederman April 30, 2021, 10:49 p.m. UTC
Eric W. Biederman (3):
      siginfo: Move si_trapno inside the union inside _si_fault
      signal: Implement SIL_FAULT_TRAPNO
      signal: Use dedicated helpers to send signals with si_trapno set

 arch/alpha/kernel/osf_sys.c        |  2 +-
 arch/alpha/kernel/signal.c         |  4 +-
 arch/alpha/kernel/traps.c          | 24 ++++++------
 arch/alpha/mm/fault.c              |  4 +-
 arch/sparc/kernel/process_64.c     |  2 +-
 arch/sparc/kernel/sys_sparc_32.c   |  2 +-
 arch/sparc/kernel/sys_sparc_64.c   |  2 +-
 arch/sparc/kernel/traps_32.c       | 22 +++++------
 arch/sparc/kernel/traps_64.c       | 44 ++++++++++------------
 arch/sparc/kernel/unaligned_32.c   |  2 +-
 arch/sparc/mm/fault_32.c           |  2 +-
 arch/sparc/mm/fault_64.c           |  2 +-
 fs/signalfd.c                      |  7 +---
 include/linux/compat.h             |  4 +-
 include/linux/sched/signal.h       | 12 ++----
 include/linux/signal.h             |  1 +
 include/uapi/asm-generic/siginfo.h |  6 +--
 kernel/signal.c                    | 77 ++++++++++++++++++++++----------------
 18 files changed, 107 insertions(+), 112 deletions(-)

Comments

Eric W. Biederman April 30, 2021, 11:23 p.m. UTC | #1
I am looking at perf_sigtrap and I am confused by the code.


	/*
	 * We'd expect this to only occur if the irq_work is delayed and either
	 * ctx->task or current has changed in the meantime. This can be the
	 * case on architectures that do not implement arch_irq_work_raise().
	 */
	if (WARN_ON_ONCE(event->ctx->task != current))
		return;

	/*
	 * perf_pending_event() can race with the task exiting.
	 */
	if (current->flags & PF_EXITING)
		return;


It performs tests that absolutely can never fail if we are talking about
a synchronous exception.  The code force_sig family of functions only
make sense to use with and are only safe to use with synchronous
exceptions.

Are the tests in perf_sigtrap necessary or is perf_sigtrap not reporting
a synchronous event?

Eric
Eric W. Biederman April 30, 2021, 11:47 p.m. UTC | #2
Well with 7 patches instead of 3 that was a little more than I thought
I was going to send.

However that does demonstrate what I am thinking, and I think most of
the changes are reasonable at this point.

I am very curious how synchronous this all is, because if this code
is truly synchronous updating signalfd to handle this class of signal
doesn't really make sense.

If the code is not synchronous using force_sig is questionable.

Eric W. Biederman (7):
      siginfo: Move si_trapno inside the union inside _si_fault
      signal: Implement SIL_FAULT_TRAPNO
      signal: Use dedicated helpers to send signals with si_trapno set
      signal: Remove __ARCH_SI_TRAPNO
      signal: Rename SIL_PERF_EVENT SIL_FAULT_PERF_EVENT for consistency
      signal: Factor force_sig_perf out of perf_sigtrap
      signal: Deliver all of the perf_data in si_perf

 arch/alpha/include/uapi/asm/siginfo.h |   2 -
 arch/alpha/kernel/osf_sys.c           |   2 +-
 arch/alpha/kernel/signal.c            |   4 +-
 arch/alpha/kernel/traps.c             |  24 ++++----
 arch/alpha/mm/fault.c                 |   4 +-
 arch/mips/include/uapi/asm/siginfo.h  |   2 -
 arch/sparc/include/uapi/asm/siginfo.h |   3 -
 arch/sparc/kernel/process_64.c        |   2 +-
 arch/sparc/kernel/sys_sparc_32.c      |   2 +-
 arch/sparc/kernel/sys_sparc_64.c      |   2 +-
 arch/sparc/kernel/traps_32.c          |  22 +++----
 arch/sparc/kernel/traps_64.c          |  44 ++++++--------
 arch/sparc/kernel/unaligned_32.c      |   2 +-
 arch/sparc/mm/fault_32.c              |   2 +-
 arch/sparc/mm/fault_64.c              |   2 +-
 fs/signalfd.c                         |  13 ++--
 include/linux/compat.h                |   9 +--
 include/linux/sched/signal.h          |  13 ++--
 include/linux/signal.h                |   3 +-
 include/uapi/asm-generic/siginfo.h    |  11 ++--
 include/uapi/linux/signalfd.h         |   4 +-
 kernel/events/core.c                  |  11 +---
 kernel/signal.c                       | 108 ++++++++++++++++++++++------------
 23 files changed, 149 insertions(+), 142 deletions(-)
Marco Elver May 1, 2021, 12:28 a.m. UTC | #3
On Sat, 1 May 2021 at 01:23, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> I am looking at perf_sigtrap and I am confused by the code.
>
>
>         /*
>          * We'd expect this to only occur if the irq_work is delayed and either
>          * ctx->task or current has changed in the meantime. This can be the
>          * case on architectures that do not implement arch_irq_work_raise().
>          */
>         if (WARN_ON_ONCE(event->ctx->task != current))
>                 return;
>
>         /*
>          * perf_pending_event() can race with the task exiting.
>          */
>         if (current->flags & PF_EXITING)
>                 return;
>
>
> It performs tests that absolutely can never fail if we are talking about
> a synchronous exception.  The code force_sig family of functions only
> make sense to use with and are only safe to use with synchronous
> exceptions.
>
> Are the tests in perf_sigtrap necessary or is perf_sigtrap not reporting
> a synchronous event?

Yes it's synchronous, insofar that the user will receive the signal
right when the event happens (I've tested this extensively, also see
tools/testing/selftests/perf_events). Of course, there's some effort
involved from the point where the event triggered to actually safely
delivering the signal. In particular, for HW events, these arrive in
NMI, and we can't do much in NMI, and therefore will queue an
irq_work.

On architectures that properly implement irq_work, it will do a
self-IPI, so that once it is safe to do so, another interrupt is
delivered where we process the event and do the force_sig_info(). The
task where the event occurred never got a chance to run -- except for
bad architectures with broken irq_work, and the first WARN_ON() is
there so we don't crash the kernel if somebody botched their irq_work.

Since we're talking about various HW events, these can still trigger
while the task is exiting, before perf_event_exit_task() being called
during do_exit(). That's why we have the 2nd check.

Thanks,
-- Marco
Marco Elver May 1, 2021, 12:37 a.m. UTC | #4
On Sat, 1 May 2021 at 01:48, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Well with 7 patches instead of 3 that was a little more than I thought
> I was going to send.
>
> However that does demonstrate what I am thinking, and I think most of
> the changes are reasonable at this point.
>
> I am very curious how synchronous this all is, because if this code
> is truly synchronous updating signalfd to handle this class of signal
> doesn't really make sense.
>
> If the code is not synchronous using force_sig is questionable.
>
> Eric W. Biederman (7):
>       siginfo: Move si_trapno inside the union inside _si_fault
>       signal: Implement SIL_FAULT_TRAPNO
>       signal: Use dedicated helpers to send signals with si_trapno set
>       signal: Remove __ARCH_SI_TRAPNO
>       signal: Rename SIL_PERF_EVENT SIL_FAULT_PERF_EVENT for consistency
>       signal: Factor force_sig_perf out of perf_sigtrap
>       signal: Deliver all of the perf_data in si_perf

Thank you for doing this so quickly -- it looks much cleaner. I'll
have a more detailed look next week and also run some tests myself.

At a first glance, you've broken our tests in
tools/testing/selftests/perf_events/ -- needs a
s/si_perf/si_perf.data/, s/si_errno/si_perf.type/

Thanks!

-- Marco
Eric W. Biederman May 1, 2021, 3:16 p.m. UTC | #5
Marco Elver <elver@google.com> writes:

> On Sat, 1 May 2021 at 01:48, Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> Well with 7 patches instead of 3 that was a little more than I thought
>> I was going to send.
>>
>> However that does demonstrate what I am thinking, and I think most of
>> the changes are reasonable at this point.
>>
>> I am very curious how synchronous this all is, because if this code
>> is truly synchronous updating signalfd to handle this class of signal
>> doesn't really make sense.
>>
>> If the code is not synchronous using force_sig is questionable.
>>
>> Eric W. Biederman (7):
>>       siginfo: Move si_trapno inside the union inside _si_fault
>>       signal: Implement SIL_FAULT_TRAPNO
>>       signal: Use dedicated helpers to send signals with si_trapno set
>>       signal: Remove __ARCH_SI_TRAPNO
>>       signal: Rename SIL_PERF_EVENT SIL_FAULT_PERF_EVENT for consistency
>>       signal: Factor force_sig_perf out of perf_sigtrap
>>       signal: Deliver all of the perf_data in si_perf
>
> Thank you for doing this so quickly -- it looks much cleaner. I'll
> have a more detailed look next week and also run some tests myself.
>
> At a first glance, you've broken our tests in
> tools/testing/selftests/perf_events/ -- needs a
> s/si_perf/si_perf.data/, s/si_errno/si_perf.type/

Yeah.  I figured I did, but I couldn't figure out where the tests were
and I didn't have a lot of time.  I just wanted to get this out so we
can do as much as reasonable before the ABI starts being actively used
by userspace and we can't change it.

Eric
Marco Elver May 1, 2021, 4:24 p.m. UTC | #6
On Sat, 1 May 2021 at 17:17, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Marco Elver <elver@google.com> writes:
>
> > On Sat, 1 May 2021 at 01:48, Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>
> >> Well with 7 patches instead of 3 that was a little more than I thought
> >> I was going to send.
> >>
> >> However that does demonstrate what I am thinking, and I think most of
> >> the changes are reasonable at this point.
> >>
> >> I am very curious how synchronous this all is, because if this code
> >> is truly synchronous updating signalfd to handle this class of signal
> >> doesn't really make sense.
> >>
> >> If the code is not synchronous using force_sig is questionable.
> >>
> >> Eric W. Biederman (7):
> >>       siginfo: Move si_trapno inside the union inside _si_fault
> >>       signal: Implement SIL_FAULT_TRAPNO
> >>       signal: Use dedicated helpers to send signals with si_trapno set
> >>       signal: Remove __ARCH_SI_TRAPNO
> >>       signal: Rename SIL_PERF_EVENT SIL_FAULT_PERF_EVENT for consistency
> >>       signal: Factor force_sig_perf out of perf_sigtrap
> >>       signal: Deliver all of the perf_data in si_perf
> >
> > Thank you for doing this so quickly -- it looks much cleaner. I'll
> > have a more detailed look next week and also run some tests myself.
> >
> > At a first glance, you've broken our tests in
> > tools/testing/selftests/perf_events/ -- needs a
> > s/si_perf/si_perf.data/, s/si_errno/si_perf.type/
>
> Yeah.  I figured I did, but I couldn't figure out where the tests were
> and I didn't have a lot of time.  I just wanted to get this out so we
> can do as much as reasonable before the ABI starts being actively used
> by userspace and we can't change it.

No worries, and agreed. I've run tools/testing/selftests/perf_events
tests on x86-64 (native + 32-bit compat), and compile-tested x86-32,
arm64, arm (with my static asserts), m68k, and sparc64. Some trivial
breakages, note comments in other patches.

With the trivial fixes this looks good to me. I'll happily retest v2
when you send it.

Thanks,
-- Marco
Marco Elver May 1, 2021, 4:26 p.m. UTC | #7
On Sat, 1 May 2021 at 02:37, Marco Elver <elver@google.com> wrote:
> On Sat, 1 May 2021 at 01:48, Eric W. Biederman <ebiederm@xmission.com> wrote:
> >
> > Well with 7 patches instead of 3 that was a little more than I thought
> > I was going to send.
> >
> > However that does demonstrate what I am thinking, and I think most of
> > the changes are reasonable at this point.
> >
> > I am very curious how synchronous this all is, because if this code
> > is truly synchronous updating signalfd to handle this class of signal
> > doesn't really make sense.

Just a note on this: the reason for adding signalfd support was based
on the comment at SIL_FAULT_PKUERR:

>                 /*
>                   * Fall through to the SIL_FAULT case.  Both SIL_FAULT_BNDERR
>                   * and SIL_FAULT_PKUERR are only generated by faults that
>                   * deliver them synchronously to userspace.  In case someone
>                   * injects one of these signals and signalfd catches it treat
>                   * it as SIL_FAULT.
>                   */

The same would hold for SIL_FAULT_PERF_EVENT, where somebody injects
(re-injects perhaps?) such an event. But otherwise, yes,
non-synchronous handling of SIGTRAP/TRAP_PERF is pretty useless for
almost all usecases I can think of.

Thanks,
-- Marco