Message ID:   1268403820-13556-2-git-send-email-chase.douglas@canonical.com
State:        Accepted
Delegated to: Andy Whitcroft
On 03/12/2010 07:23 AM, Chase Douglas wrote:
> If we hit __schedule_bug, then something bad has happened, and it would
> be helpful to get an accurate function trace to see what's going wrong.
> The debugfs function tracer works great for this task, but it spits out
> a ton of information both before and after the real bug occurs. This
> change stops the tracing when we enter __schedule_bug, so the tracing
> report is smaller and leaves the developer with the needed information
> right at the end of the trace.
>
> This should not cause a regression or any negative effects unless
> someone truly wanted to trace through a __schedule_bug call. In that
> case, they can recompile their own kernel without this call. The vast
> majority of tracings that hit __schedule_bug would be enhanced by this
> change.
>
> To see how tracing with this change can be used for "scheduling while
> atomic" bugs, see http://bugs.launchpad.net/bugs/534549.
>
> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
> ---
>  kernel/sched.c | 2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 4d01095..0d2ba50 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -5341,6 +5341,8 @@ static noinline void __schedule_bug(struct task_struct *prev)
>  {
>  	struct pt_regs *regs = get_irq_regs();
>
> +	tracing_off();
> +
>  	printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
>  		prev->comm, prev->pid, preempt_count());

Seems reasonable enough. tracing_off() is dead simple, so ought not cause a regression. Have you discussed this with upstream as to why this isn't the default behavior?

Acked-by: Tim Gardner <tim.gardner@canonical.com>
On Fri, Mar 12, 2010 at 10:09 AM, Tim Gardner <tim.gardner@canonical.com> wrote:
> On 03/12/2010 07:23 AM, Chase Douglas wrote:
>> If we hit __schedule_bug, then something bad has happened, and it would
>> be helpful to get an accurate function trace to see what's going wrong.
>> The debugfs function tracer works great for this task, but it spits out
>> a ton of information both before and after the real bug occurs. This
>> change stops the tracing when we enter __schedule_bug, so the tracing
>> report is smaller and leaves the developer with the needed information
>> right at the end of the trace.
>>
>> This should not cause a regression or any negative effects unless
>> someone truly wanted to trace through a __schedule_bug call. In that
>> case, they can recompile their own kernel without this call. The vast
>> majority of tracings that hit __schedule_bug would be enhanced by this
>> change.
>>
>> To see how tracing with this change can be used for "scheduling while
>> atomic" bugs, see http://bugs.launchpad.net/bugs/534549.
>>
>> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
>
> Seems reasonable enough. tracing_off() is dead simple, so ought not cause
> a regression. Have you discussed this with upstream as to why this isn't
> the default behavior?

I haven't, but I'll send a note out to linux-kernel.
I'm guessing the reason is that tracing can be used for many different reasons, and they don't want stray tracing_off calls to kill legitimate tracing in a vanilla kernel. In our case, using tracing_off in __schedule_bug will help us solve scheduling while atomic bugs faster, and we aren't terribly interested in academic uses of tracing in our official kernels.
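For context, the debugfs function tracer discussed in this thread is driven through the tracing files under debugfs. A minimal sketch of the workflow the patch is meant to improve (assumes debugfs is mounted at /sys/kernel/debug and requires root; with this patch applied, __schedule_bug calls tracing_off(), so the ring buffer stops right at the failure instead of being flooded afterwards):

```sh
# Select the function tracer and turn tracing on
echo function > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_on

# ... reproduce the "scheduling while atomic" bug here ...

# Read the captured trace; with tracing_off() in __schedule_bug,
# the final entries lead directly up to the bug
cat /sys/kernel/debug/tracing/trace
```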
On Fri, Mar 12, 2010 at 10:23:45AM -0500, Chase Douglas wrote:
>> +	tracing_off();
>> +

How about adding a __read_mostly toggle for this, set by a boot parameter:

	if (likely(sched_tracing_stop))
		tracing_off();

-apw
On Fri, Mar 12, 2010 at 12:15 PM, Andy Whitcroft <apw@canonical.com> wrote:
> On Fri, Mar 12, 2010 at 10:23:45AM -0500, Chase Douglas wrote:
>>> +	tracing_off();
>>> +
>
> How about adding a __read_mostly toggle for this set by a boot
> parameter:
>
> 	if (likely(sched_tracing_stop))
> 		tracing_off();
>
> -apw

Possible, but a more invasive patch. If we think people will ever want to disable it, then it's probably worthwhile. I'm just not sure whether people will ever want to do so in an official Ubuntu kernel.
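Fleshed out, Andy's suggestion would amount to something like the following kernel-side sketch. This is hypothetical code, not part of the patch under review: the sched_tracing_stop variable and the boot-parameter name are illustrative only.

```c
/* Sketch only: default-on toggle, as Andy suggests (__read_mostly since
 * it is written once at boot and read on every __schedule_bug hit). */
static int sched_tracing_stop __read_mostly = 1;

/* Hypothetical boot parameter: pass "sched_tracing_stop=0" on the kernel
 * command line to keep the tracer running through __schedule_bug. */
static int __init sched_tracing_stop_setup(char *str)
{
	sched_tracing_stop = simple_strtoul(str, NULL, 0) != 0;
	return 1;
}
__setup("sched_tracing_stop=", sched_tracing_stop_setup);

/* __schedule_bug would then gate the call:
 *
 *	if (likely(sched_tracing_stop))
 *		tracing_off();
 */
```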
On Fri, Mar 12, 2010 at 09:23:40AM -0500, Chase Douglas wrote:
> If we hit __schedule_bug, then something bad has happened, and it would
> be helpful to get an accurate function trace to see what's going wrong.
> The debugfs function tracer works great for this task, but it spits out
> a ton of information both before and after the real bug occurs. This
> change stops the tracing when we enter __schedule_bug, so the tracing
> report is smaller and leaves the developer with the needed information
> right at the end of the trace.
>
> This should not cause a regression or any negative effects unless
> someone truly wanted to trace through a __schedule_bug call. In that
> case, they can recompile their own kernel without this call. The vast
> majority of tracings that hit __schedule_bug would be enhanced by this
> change.
>
> To see how tracing with this change can be used for "scheduling while
> atomic" bugs, see http://bugs.launchpad.net/bugs/534549.
>
> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
> ---
>  kernel/sched.c | 2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 4d01095..0d2ba50 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -5341,6 +5341,8 @@ static noinline void __schedule_bug(struct task_struct *prev)
>  {
>  	struct pt_regs *regs = get_irq_regs();
>
> +	tracing_off();
> +
>  	printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
>  		prev->comm, prev->pid, preempt_count());

Seems reasonable if it makes debugging easier.

Acked-by: Andy Whitcroft <apw@canonical.com>

-apw
On Wed, Mar 31, 2010 at 6:39 AM, Andy Whitcroft <apw@canonical.com> wrote:
> On Fri, Mar 12, 2010 at 09:23:40AM -0500, Chase Douglas wrote:
>> If we hit __schedule_bug, then something bad has happened, and it would
>> be helpful to get an accurate function trace to see what's going wrong.
>> The debugfs function tracer works great for this task, but it spits out
>> a ton of information both before and after the real bug occurs. This
>> change stops the tracing when we enter __schedule_bug, so the tracing
>> report is smaller and leaves the developer with the needed information
>> right at the end of the trace.
>>
>> This should not cause a regression or any negative effects unless
>> someone truly wanted to trace through a __schedule_bug call. In that
>> case, they can recompile their own kernel without this call. The vast
>> majority of tracings that hit __schedule_bug would be enhanced by this
>> change.
>>
>> To see how tracing with this change can be used for "scheduling while
>> atomic" bugs, see http://bugs.launchpad.net/bugs/534549.
>>
>> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
>
> Seems reasonable if it make debugging easier.
>
> Acked-by: Andy Whitcroft <apw@canonical.com>

Just FYI, I have a better set of patches that do the same thing for other areas like BUG() and WARN(), and it's enabled/disabled through a boot option. However, it hasn't been accepted upstream yet.
I think it's best to just go with this for now since it's smaller and less likely to cause a regression.

-- Chase
diff --git a/kernel/sched.c b/kernel/sched.c
index 4d01095..0d2ba50 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5341,6 +5341,8 @@ static noinline void __schedule_bug(struct task_struct *prev)
 {
 	struct pt_regs *regs = get_irq_regs();
 
+	tracing_off();
+
 	printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
 		prev->comm, prev->pid, preempt_count());
If we hit __schedule_bug, then something bad has happened, and it would be helpful to get an accurate function trace to see what's going wrong. The debugfs function tracer works great for this task, but it spits out a ton of information both before and after the real bug occurs. This change stops the tracing when we enter __schedule_bug, so the tracing report is smaller and leaves the developer with the needed information right at the end of the trace.

This should not cause a regression or any negative effects unless someone truly wanted to trace through a __schedule_bug call. In that case, they can recompile their own kernel without this call. The vast majority of tracings that hit __schedule_bug would be enhanced by this change.

To see how tracing with this change can be used for "scheduling while atomic" bugs, see http://bugs.launchpad.net/bugs/534549.

Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
---
 kernel/sched.c | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)