Message ID: 20100413.185931.244224837.davem@davemloft.net
State: Accepted
Delegated to: David Miller
On Tue, Apr 13, 2010 at 06:59:31PM -0700, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Tue, 13 Apr 2010 16:56:39 -0700 (PDT)
>
> > I'll dig further.
>
> Ok Frederic, I think I figured it out.  We only save 128 bytes
> of stack space in the mcount stubs, but that's not enough.
>
> 128 bytes is enough for the register window, but any function we call
> is allowed to save the first six incoming arguments to pre-assigned
> stack frame slots.
>
> That means we have to allocate, at a minimum, 176 bytes of stack space
> if we call any C functions.  (176 == 128 + (8 * 6))
>
> These arg save stores only happen in functions where there is a
> bit of register pressure, that's why we only sometimes see this.
>
> The net result is that when we hit this, the parent's stack frame gets
> written with garbage.  And that's why the crashes look so random.
>
> I'll commit this to sparc-2.6 and push it out to Linus.  I'm beating
> up a machine pretty hard with various tracers enabled and I haven't
> seen any suspicious symptoms yet :-)
>
> --------------------
> sparc64: Allocate sufficient stack space in ftrace stubs.
>
> 128 bytes is sufficient for the register window save area, but the
> calling conventions allow the callee to save up to 6 incoming argument
> registers into the stack frame after the register window save area.
>
> This means a minimal stack frame is 176 bytes (128 + (6 * 8)).
>
> This fixes random crashes when using the function tracer.
>
> Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Ok, I've applied the two patches and it fixes the problems partially.
The crashes are less likely to happen as I run a manual loop of

	echo function_graph > current_tracer
	echo nop > current_tracer

But after some trials:

[ 711.854617] hrtimer: interrupt took 8050861 ns
[ 746.124873] ------------[ cut here ]------------
[ 746.410622] WARNING: at kernel/sched.c:3568 ftrace_graph_caller+0x10/0x18()
[ 746.697698] Modules linked in: loop usbhid ehci_hcd ohci_hcd e1000e sg sr_mod cdrom
[ 747.034140] Call Trace:
[ 747.036503]  [00000000005e331c] ftrace_graph_caller+0x10/0x18
[ 747.040395]  [00000000005e331c] ftrace_graph_caller+0x10/0x18
[ 747.044273] ---[ end trace 5213554e436b8b78 ]---
[ 747.059057] Unable to handle kernel NULL pointer dereference
[ 747.062799] tsk->{mm,active_mm}->context = 0000000000000661

Or another kind (note the hrtimer warnings are normal: this is a hang
prevention that was first added because of the function graph tracer
but eventually serves as a general protection for hrtimer. The scheme
is similar to the balancing problem: servicing the timers is so slow
that they re-expire before we exit the servicing loop, so we risk an
endless loop):

[ 100.159892] hrtimer: interrupt took 8747191 ns
[ 128.534074] huh, entered softirq 1 TIMER 000000000047ed70 preempt_count 00000103, exited with 00000100?
[ 129.128339] ------------[ cut here ]------------
[ 129.450705] Unable to handle kernel NULL pointer dereference
[ 129.599834] tsk->{mm,active_mm}->context = 000000000000067b
[ 129.741548] tsk->{mm,active_mm}->pgd = fffff803f0fac000
[ 129.880609]               \|/ ____ \|/
[ 129.881840]              "@'/ .. \`@"
[ 129.883098]              /_| \__/ |_\
[ 129.884410]                 \__U_/
[ 130.433709] ksoftirqd/35(109): Oops [#1]
[ 130.568083] TSTATE: 0000000080e01607 TPC: 0000000000476570 TNPC: 0000000000476574 Y: 00000000    Not tainted
[ 130.846883] TPC: <irq_exit+0xd4/0xdc>
[ 130.980036] g0: 000000000049c19c g1: 0000000000000000 g2: 00000000000012a8 g3: 00000000008ff7b8
[ 131.250929] g4: fffff803f7fad900 g5: fffff80011244000 g6: fffff803f7fd4000 g7: 0000000000000002
[ 131.543092] o0: 0000000000000001 o1: 0000000000000000 o2: 0000000000000001 o3: fffff803ff9533e8 l NULL pointer dereference
[ 132.598525] tsk->{mm,active_mm}->context = 0000000000000513
[ 132.598549] tsk->{mm,active_mm}->pgd = fffff803f0ba8000
[ 132.598573]               \|/ ____ \|/
[ 132.598585]              "@'/ .. \`@"
[ 132.598597]              /_| \__/ |_\
[ 132.598611]                 \__U_/
[ 132.598637] events/0(195): Oops [#3]
[ 132.598683] TSTATE: 0000004480e01606 TPC: 0000000000466424 TNPC: 0000000000466428 Y: 00000000    Tainted: G      D
[ 132.598772] TPC: <set_next_entity+0x10/0xa8>
[ 132.598798] g0: 00000000008e7aa0 g1: 0000000000000001 g2: 00000000000012a8 g3: 00000000008ff7b8
[ 132.598832] g4: 00000000008e7aa0 g5: fffff80011244000 g6: 00000000008cc000 g7: 0000000000000002
[ 132.598861] o0: 00000000008f0d50 o1: 0000000000466520 o2: 00000000008cf401 o3: 0000000000000000
[ 132.598890] o4: fffff803f6d60000 o5: 0000000000000000 sp: 00000000008cf281 ret_pc: 000000000046641c
[ 132.598927] RPC: <set_next_entity+0x8/0xa8>
[ 132.598948] l0: 00000000007602d0 l1: 00000000008cf401 l2: ffffffffffffffff l3: 0000000000000400
[ 132.598977] l4: 00000000008e7aa0 l5: 0006000000000000 l6: 000000000000000e l7: 0000000000000008
[ 132.599005] i0: fffff80011dc5d18 i1: 0000000000000000 i2: 0000000000000001 i3: 0000000000000000
[ 132.599033] i4: 0000000000000002 i5: 0000000000000000 i6: 00000000008cf341 i7: 00000000004665d8
[ 132.599067] I7: <pick_next_task_fair+0xc8/0xe8>
[ 132.599084] Instruction DUMP: 110023c3 4005f3ad 90122150 <c2066038> 80a06000 02480011 84063f68 90100018 7fffffa6
[ 132.599160] note: events/0[195] exited with preempt_count 2
[ 131.546191] o4: 0000000000855960 o5: 0000000000000001 sp: fffff803ff952aa1 ret_pc: 0000000000476568
[ 131.549532] RPC: <irq_exit+0xcc/0xdc>
[ 131.551039] l0: 000000000000804e l1: 0000000000bf3d10 l2: 0000000000bf3d20 l3: 00000000008d0180
[ 131.554218] l4: 000000000076457c l5: fffff800163c2c88 l6: 0000000000000001 l7: fffff803eccaa038
[ 131.557243] i0: 0000000000000000 i1: 0000001616202980 i2: fffff803ff952b81 i3: 00000000005e3328
[ 131.560247] i4: 0000000000000001 i5: fffff800163c2c88 i6: 0000000000000001 i7: fffff803eccaa038
[ 131.564010] I7: <0xfffff803eccaa038>
[ 131.565656] Instruction DUMP: 90102000 7fffabd9 90102001 <81cfe008> 01000000 9de3bf40 110023cd 4005b354 901220c0
[ 131.575366] Unable to handle kernel NULL pointer dereference
[ 131.577418] tsk->{mm,active_mm}->context = 000000000000067b
[ 131.579556] tsk->{mm,active_mm}->pgd = fffff803f0fac000
[ 131.581611]               \|/ ____ \|/
[ 131.582683]              "@'/ .. \`@"
[ 131.583668]              /_| \__/ |_\
[ 131.584740]                 \__U_/
[ 131.586339] ksoftirqd/35(109): Oops [#2]
[ 131.588374] TSTATE: 0000004480e01604 TPC: 00000000004730b8 TNPC: 00000000004af1c4 Y: 00000000    Tainted: G      D
[ 131.591849] TPC: <do_exit+0x2bc/0x77c>
[ 131.593318] g0: 0000000000000000 g1: 0000000000000000 g2: 00000000000012a8 g3: 0000000000000015
[ 131.596339] g4: fffff803f7fad900 g5: fffff80011244000 g6: fffff803f7fd4000 g7: 0000000000000002
[ 131.599249] o0: 0000000000000001 o1: fffff803f7f52624 o2: 0000000000000001 o3: 0000000000000000
[ 131.602059] o4: 0000000000855960 o5: 0000000000000001 sp: fffff803ff952561 ret_pc: 00000000004730b4
[ 131.605016] RPC: <do_exit+0x2b8/0x77c>
[ 131.606452] l0: fffff803ff9532a0 l1: fffff803f7fad900 l2: fffff803f7fad900 l3: 0000000080e01607
[ 131.609228] l4: 0000000000000000 l5: 00000000008f2700 l6: 0000000000000001 l7: 000000003b9aca00
[ 131.612125] i0: 0000000000000009 i1: 0000000000000020 i2: 00000000901220c0 i3: 0000000000000020
[ 131.615054] i4: 00000000000000ff i5: 000000000000000b i6: fffff803ff952631 i7: 00000000005e331c
[ 131.618120] I7: <ftrace_graph_caller+0x10/0x18>
[ 131.619877] Instruction DUMP: a4100011 c2586030 4000f044 <d0586048> 4001abd1 90100011 8207a7df 90100011 c277a7e7
[ 131.627682] Fixing recursive fault but reboot is needed!
[ 132.598442] ------------[ cut hrnel NULL p<4>>[ 1 32329.48944] Unabointer de efehraenndclee <1>[ 132.598499] kerne
[ 132.890947] WARNING: at kernel/trace/trace_functions_graph.c:144 ftrace_return_to_handler+0x94/0x128()
[ 132.911378] Bad frame pointer: expected 8cefc1, received fffff803f6d630a1
[ 132.911389]   from func get_fault_insn return to 44ed0c
[ 132.925314] Modules linked in: loop usbhid ehci_hcd ohci_hcd e1000e sg sr_mod cdrom
[ 132.939710] Call Trace:
[ 132.946238]  [000000000046ece0] warn_slowpath_common+0x50/0x84
[ 132.952799]  [000000000046ed74] warn_slowpath_fmt+0x34/0x44
[ 132.959135]  [00000000004d03cc] ftrace_return_to_handler+0x94/0x128
[ 132.965529]  [00000000005e3328] return_to_handler+0x4/0x1c
[ 132.971834]  [00000000005e331c] ftrace_graph_caller+0x10/0x18
[ 132.977974]  [00000000005e331c] ftrace_graph_caller+0x10/0x18
[ 132.983840]  [000000000046427c] load_balance+0xcb8/0x111c
[ 132.989524]  [00000000005e331c] ftrace_graph_caller+0x10/0x18
[ 132.995075]  [0000000000489ab4] worker_thread+0x90/0x358
[ 133.000470]  [000000000048e974] kthread+0x64/0x78
[ 133.005645]  [000000000042ba68] kernel_thread+0x3c/0x54
[ 133.010717]  [000000000048ea4c] kthreadd+0xc4/0x118
[ 133.015704] ---[ end trace 04c19b1c514a2c20 ]---
[ 133.020612] Kernel panic - not syncing:
[ 133.025362] Call Trace:
[ 133.029992]  [000000000075f9a0] panic+0x64/0x130
[ 133.034638]  [00000000005e331c] ftrace_graph_caller+0x10/0x18
[ 133.039272]  [000000000046427c] load_balance+0xcb8/0x111c
[ 133.043798]  [00000000005e331c] ftrace_graph_caller+0x10/0x18
[ 133.048261]  [0000000000489ab4] worker_thread+0x90/0x358
[ 133.052606]  [000000000048e974] kthread+0x64/0x78
[ 133.056797]  [000000000042ba68] kernel_thread+0x3c/0x54
[ 133.060914]  [000000000048ea4c] kthreadd+0xc4/0x118
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Wed, 14 Apr 2010 17:48:04 +0200

> Ok, I've applied the two patches and it fixes the problems
> partially.
>
> The crashes are less likely to happen as I run a manual
> loop of
>
> 	echo function_graph > current_tracer
> 	echo nop > current_tracer
>
> But after some trials:

Thanks for testing, I still get weird crashes too during stress tests.

At this point I think either we're still getting stack overflow due to
deep call chains, or something that is supposed to be NMI safe
actually isn't :-)

Validating the stack overflow theory is pretty simple so I'll do some
tests tonight to check that out.
Hey Frederic, I just wanted you to know that I'm slowly but surely
trying to make progress on these crashes.

I'm trying various different things to narrow down the source of the
corruptions, so here's what I've done so far.

I did some things to eliminate various aspects of the function tracing
code paths, and see if the problem persists.

First, I made function_trace_call() unconditionally return
immediately.

Next, I restored function_trace_call() back to normal, and instead
made trace_function() return immediately.

I could not reproduce the corruptions in either of these cases with
the function tracer enabled, in situations where I was normally
guaranteed to see a crash.

So the only part of the code paths left is the ring buffer and the
filling in of the entries.

Therefore, what I'm doing now is trying things like running various
hacked-up variants of the ring buffer benchmark module while doing
things that usually trigger the bug (for me a "make -j128" is usually
enough), hoping I can trigger corruption.  No luck on that so far but
I'll keep trying this angle just to make sure.

BTW, I noticed that every single time we see the corruptions now, we
always see that "hrtimer: interrupt took xxx ns" message first.  I
have never seen the corruption messages without that reaching the logs
first.

Have you?

That might be an important clue, who knows...
On Fri, Apr 16, 2010 at 02:12:32AM -0700, David Miller wrote:
> Hey Frederic, I just wanted you to know that I'm slowly but surely
> trying to make progress on these crashes.
>
> I'm trying various different things to narrow down the source of the
> corruptions, so here's what I've done so far.
>
> I did some things to eliminate various aspects of the function
> tracing code paths, and see if the problem persists.
>
> First, I made function_trace_call() unconditionally return
> immediately.
>
> Next, I restored function_trace_call() back to normal, and instead
> made trace_function() return immediately.
>
> I could not reproduce the corruptions in either of these cases with
> the function tracer enabled, in situations where I was normally
> guaranteed to see a crash.
>
> So the only part of the code paths left is the ring buffer and the
> filling in of the entries.
>
> Therefore, what I'm doing now is trying things like running various
> hacked-up variants of the ring buffer benchmark module while doing
> things that usually trigger the bug (for me a "make -j128" is
> usually enough), hoping I can trigger corruption.  No luck on that
> so far but I'll keep trying this angle just to make sure.
>
> BTW, I noticed that every single time we see the corruptions now, we
> always see that "hrtimer: interrupt took xxx ns" message first.  I
> have never seen the corruption messages without that reaching the
> logs first.
>
> Have you?
>
> That might be an important clue, who knows...

Yep, that's what I told you in my previous mail :)

"""(note the hrtimer warnings are normal: this is a hang prevention
that was first added because of the function graph tracer but
eventually serves as a general protection for hrtimer. The scheme is
similar to the balancing problem: servicing the timers is so slow that
they re-expire before we exit the servicing loop, so we risk an
endless loop)."""

This comes from the early days of the function graph tracer.  To work
on it, I was sometimes using VirtualBox with the function graph tracer
and noticed it made the system so slow that hrtimers were hanging (in
fact it was also partly provoked by guest switches).  Hence we made
this hang protection.  That's ok though: hrtimer can sort this
situation out, but if it happens too often, some timers may be
delayed.

That said, it also means there is a problem, I think.  It's normal for
this to happen in a guest, but not on a real box.  Maybe there is a
contention point in the tracer fast path that slows down the machine.

Do you have CONFIG_DEBUG_LOCKDEP enabled?  This was one of the sources
of these contentions (fixed lately in -tip, but for .35).
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Fri, 16 Apr 2010 17:44:21 +0200

> """(note the hrtimer warnings are normal: this is a hang prevention
> that was first added because of the function graph tracer but
> eventually serves as a general protection for hrtimer. The scheme is
> similar to the balancing problem: servicing the timers is so slow
> that they re-expire before we exit the servicing loop, so we risk an
> endless loop)."""

I don't think it's normal in this case, I suspect we loop because of
some kind of corruption.

> That said, it also means there is a problem, I think.  It's normal
> for this to happen in a guest, but not on a real box.  Maybe there
> is a contention point in the tracer fast path that slows down the
> machine.

I think it's looping not because of contention, but because of
corrupted memory/registers.

> Do you have CONFIG_DEBUG_LOCKDEP enabled?  This was one of the
> sources of these contentions (fixed lately in -tip, but for .35).

I'm using PROVE_LOCKING but not DEBUG_LOCKDEP.

Anyways, consistently my machine crashes with completely corrupted
registers in either irq_exit() or __do_softirq().  Usually we get an
unaligned access of some sort, either accessing the stack (because %fp
is garbage) or via an indirect call (usually because %i7 is garbage).

One thing that's interesting about the softirq path is that it uses
the softirq stack.  The only things that guard us jumping onto the
softirq_stack are the tests done by do_softirq(), mainly
!in_interrupt() and that we have softirqs pending.

What if preempt_count() got corrupted in such a way that we end up
evaluating in_interrupt() to zero when we shouldn't?

If that happens, and this makes us jump onto the top of the softirq
stack of the current cpu multiple times, that could cause some wild
corruptions.

Another thing I've noticed is that there appears to be some kind of
pattern to many of the register corruptions I've seen.  There is a
pattern of 64-bit values that often looks like this (in memory order):

	0xffffffffc3300000
	0xffffffffc33000cc
	0xffffffffc3d00000
	0xffffffffc3d000cc
	0xffffffffc4000000
	0xffffffffc40000cc

and, from another trace:

	0xffffffffc6100000
	0xffffffffc61000cc
	0xffffffffc6a00000
	0xffffffffc6a000cc
	0xffffffffc6e00000
	0xffffffffc6e000cc

They look like some kind of descriptor.  The closest thing I could
find were the scatter-gather descriptors used by the Fusion mptsas
driver, but I can't find a way that the descriptors would be formed
exactly like the above.  It does come close, though.

For example, drivers/message/fusion/mptscsih.c:mptscsih_qcmd()
has this call:

	ioc->add_sge((char *)&pScsiReq->SGL,
		MPT_SGE_FLAGS_SSIMPLE_READ | 0,
		(dma_addr_t) -1);

which puts -1 into the address field, but this doesn't exactly line up
because the 32-bit SGE descriptors are in the order "flags" then
"address", not the other way around.

Ho hum... anyways, just looking for clues.  If those are mptsas
descriptors, then it would be consistent with how I've found that the
block I/O path seems to invariably be involved during the crashes.  In
another trace (that time with PROVE_LOCKING disabled) I saw the
host->host_lock passed down into spin_lock_irqsave() being NULL.  And
this was in the software interrupt handler.
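[Editorial aside: David's preempt_count theory above can be modeled in a
few lines of C.  This is only an illustration: the masks approximate the
2.6.34-era layout of preempt_count, and model_in_interrupt() is a
hypothetical stand-in for the kernel's in_interrupt() macro, not its
actual definition.]

```c
#include <assert.h>

/* Approximate 2.6.34-era preempt_count layout (illustrative only):
 * bits 0-7 hold the preemption depth, bits 8-15 the softirq depth,
 * and higher bits the hardirq nesting. */
#define SOFTIRQ_MASK 0x0000ff00u
#define HARDIRQ_MASK 0x03ff0000u

/* Stand-in for in_interrupt(): nonzero while in hard or soft irq. */
static int model_in_interrupt(unsigned int preempt_count)
{
	return (preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}
```

If corruption were to clear the softirq bits of a live preempt_count,
do_softirq() would see !in_interrupt() succeed and switch onto the
per-cpu softirq stack that is already in use, which is exactly the kind
of re-entry David describes.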
Frederic, I forgot to mention last night that I also did a test
wherein I disabled all of the NMI stuff.

That made no difference at all for the crashes, which frankly was a
relief for me :-)

Anyways, just FYI...
On Fri, Apr 16, 2010 at 01:47:01PM -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Fri, 16 Apr 2010 17:44:21 +0200
>
> > """(note the hrtimer warnings are normal: this is a hang
> > prevention that was first added because of the function graph
> > tracer but eventually serves as a general protection for hrtimer.
> > The scheme is similar to the balancing problem: servicing the
> > timers is so slow that they re-expire before we exit the servicing
> > loop, so we risk an endless loop)."""
>
> I don't think it's normal in this case, I suspect we loop because
> of some kind of corruption.
>
> > That said, it also means there is a problem, I think.  It's normal
> > for this to happen in a guest, but not on a real box.  Maybe there
> > is a contention point in the tracer fast path that slows down the
> > machine.
>
> I think it's looping not because of contention, but because of
> corrupted memory/registers.
>
> > Do you have CONFIG_DEBUG_LOCKDEP enabled?  This was one of the
> > sources of these contentions (fixed lately in -tip, but for .35).
>
> I'm using PROVE_LOCKING but not DEBUG_LOCKDEP.
>
> Anyways, consistently my machine crashes with completely corrupted
> registers in either irq_exit() or __do_softirq().  Usually we get an
> unaligned access of some sort, either accessing the stack (because
> %fp is garbage) or via an indirect call (usually because %i7 is
> garbage).
>
> One thing that's interesting about the softirq path is that it uses
> the softirq stack.  The only things that guard us jumping onto the
> softirq_stack are the tests done by do_softirq(), mainly
> !in_interrupt() and that we have softirqs pending.
>
> What if preempt_count() got corrupted in such a way that we end up
> evaluating in_interrupt() to zero when we shouldn't?
>
> If that happens, and this makes us jump onto the top of the softirq
> stack of the current cpu multiple times, that could cause some wild
> corruptions.
>
> Another thing I've noticed is that there appears to be some kind of
> pattern to many of the register corruptions I've seen.  There is a
> pattern of 64-bit values that often looks like this (in memory
> order):
>
> 	0xffffffffc3300000
> 	0xffffffffc33000cc
> 	0xffffffffc3d00000
> 	0xffffffffc3d000cc
> 	0xffffffffc4000000
> 	0xffffffffc40000cc
>
> and, from another trace:
>
> 	0xffffffffc6100000
> 	0xffffffffc61000cc
> 	0xffffffffc6a00000
> 	0xffffffffc6a000cc
> 	0xffffffffc6e00000
> 	0xffffffffc6e000cc
>
> They look like some kind of descriptor.  The closest thing I could
> find were the scatter-gather descriptors used by the Fusion mptsas
> driver, but I can't find a way that the descriptors would be formed
> exactly like the above.  It does come close, though.
>
> For example, drivers/message/fusion/mptscsih.c:mptscsih_qcmd()
> has this call:
>
> 	ioc->add_sge((char *)&pScsiReq->SGL,
> 		MPT_SGE_FLAGS_SSIMPLE_READ | 0,
> 		(dma_addr_t) -1);
>
> which puts -1 into the address field, but this doesn't exactly line
> up because the 32-bit SGE descriptors are in the order "flags" then
> "address", not the other way around.
>
> Ho hum... anyways, just looking for clues.  If those are mptsas
> descriptors, then it would be consistent with how I've found that
> the block I/O path seems to invariably be involved during the
> crashes.  In another trace (that time with PROVE_LOCKING disabled)
> I saw the host->host_lock passed down into spin_lock_irqsave()
> being NULL.  And this was in the software interrupt handler.

Hmm, just a random idea: do you think it could be due to stack
overflows?  Because the function graph tracer eats more stack by
digging into the graph handlers, the ring buffer, etc...

It digs further than what is supposed to happen without tracing.
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sat, 17 Apr 2010 01:14:12 +0200

> Hmm, just a random idea: do you think it could be due to stack
> overflows?  Because the function graph tracer eats more stack by
> digging into the graph handlers, the ring buffer, etc...
>
> It digs further than what is supposed to happen without tracing.

I have mcount checking for stack-overflow by hand in assembler during
these tests, it even knows about the irq stacks.

And those checks are not triggering.
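[Editorial aside: a check like the hand-rolled mcount test David
mentions can be sketched in C.  Everything here is illustrative, since
the real check lives in sparc assembler and GUARD_BYTES is a made-up
threshold, but the idea is the same: with 16K, power-of-two-aligned
kernel stacks, masking the stack pointer yields the stack base, so a
small offset from the base means we are about to run off the end.]

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE  (16 * 1024)   /* sparc64 kernel stack size */
#define GUARD_BYTES  512           /* hypothetical red-zone threshold */

/* Return nonzero if sp is within GUARD_BYTES of the base of its
 * (power-of-two aligned) kernel stack. */
static int stack_near_overflow(uintptr_t sp)
{
	uintptr_t base = sp & ~((uintptr_t)THREAD_SIZE - 1);

	return (sp - base) < GUARD_BYTES;
}
```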
From: David Miller <davem@davemloft.net>
Date: Fri, 16 Apr 2010 16:17:44 -0700 (PDT)

> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Sat, 17 Apr 2010 01:14:12 +0200
>
>> Hmm, just a random idea: do you think it could be due to stack
>> overflows?  Because the function graph tracer eats more stack by
>> digging into the graph handlers, the ring buffer, etc...
>>
>> It digs further than what is supposed to happen without tracing.
>
> I have mcount checking for stack-overflow by hand in assembler
> during these tests, it even knows about the irq stacks.
>
> And those checks are not triggering.

Ugh... hold on.

They're not triggering because I put the test assembler into mcount,
and dynamic ftrace patches the call sites to bypass mcount altogether
:-)

Doing real tests now, and I bet you're right.

That's pretty insane though, as we use 16K stacks on sparc64 and the
gcc I'm using has the minimum stack frame decreased down to 176 bytes
(it used to be 192).  I'd be interested to see what one of these
too-large backtraces looks like.

Hmmm, will it be in the scheduler load balancing code? :-)

Will report once I have some good triggers.
On Sat, Apr 17, 2010 at 12:51:09AM -0700, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Fri, 16 Apr 2010 16:17:44 -0700 (PDT)
>
> > From: Frederic Weisbecker <fweisbec@gmail.com>
> > Date: Sat, 17 Apr 2010 01:14:12 +0200
> >
> >> Hmm, just a random idea: do you think it could be due to stack
> >> overflows?  Because the function graph tracer eats more stack by
> >> digging into the graph handlers, the ring buffer, etc...
> >>
> >> It digs further than what is supposed to happen without tracing.
> >
> > I have mcount checking for stack-overflow by hand in assembler
> > during these tests, it even knows about the irq stacks.
> >
> > And those checks are not triggering.
>
> Ugh... hold on.
>
> They're not triggering because I put the test assembler into mcount,
> and dynamic ftrace patches the call sites to bypass mcount
> altogether :-)
>
> Doing real tests now, and I bet you're right.
>
> That's pretty insane though, as we use 16K stacks on sparc64 and the
> gcc I'm using has the minimum stack frame decreased down to 176
> bytes (it used to be 192).  I'd be interested to see what one of
> these too-large backtraces looks like.
>
> Hmmm, will it be in the scheduler load balancing code? :-)

Maybe, yeah :)

This could also be a tracing recursion somewhere.  One good way to
know is to put pause_graph_tracing() and unpause_graph_tracing() at
the very beginning and end of the tracing paths.

/me goes to try this.
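[Editorial aside: the pause/unpause idea above amounts to a recursion
guard: the first entry into the tracer marks itself busy, so any nested
hook taken while tracing bails out instead of recursing.  A minimal
single-threaded model follows; the names and structure are illustrative
only, not the kernel's actual pause_graph_tracing() implementation,
which keeps a per-task counter.]

```c
#include <assert.h>

/* The kernel keeps one such counter per task; a single global
 * suffices for this single-threaded illustration. */
static int trace_recursion;

/* Returns 1 if we may trace, 0 if we are already inside the tracer. */
static int trace_enter(void)
{
	if (trace_recursion)
		return 0;	/* nested hook: bail out, no recursion */
	trace_recursion++;
	return 1;
}

static void trace_exit(void)
{
	trace_recursion--;
}
```

Any tracer hook that fires while trace_recursion is nonzero returns
immediately, so a hook recursing into itself becomes visible as a
refused trace_enter() rather than an endless loop.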
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sat, 17 Apr 2010 18:59:25 +0200

> This could also be a tracing recursion somewhere.  One good way to
> know is to put pause_graph_tracing() and unpause_graph_tracing() at
> the very beginning and end of the tracing paths.
>
> /me goes to try this.

Last night I still couldn't get the stack overflow checks to trigger,
even once I accounted for dynamic ftrace patching, unfortunately.

But... I added a tracing_off() call to the hrtimer deadlock avoidance
case and the machine doesn't crash after that triggers.

So I went to look at the per-cpu trace of the processor on which the
hrtimer deadlock avoidance triggered, and it has a looping sequence of
timestamps which looks really strange.  The other per-cpu traces look
just fine.

This is a major clue I think, so I'll keep snooping around trying to
dig more deeply into this.
diff --git a/arch/sparc/lib/mcount.S b/arch/sparc/lib/mcount.S
index 3753e3c..3ad6cbd 100644
--- a/arch/sparc/lib/mcount.S
+++ b/arch/sparc/lib/mcount.S
@@ -34,7 +34,7 @@ mcount:
 	cmp	%g1, %g2
 	be,pn	%icc, 1f
 	 mov	%i7, %g3
-	save	%sp, -128, %sp
+	save	%sp, -176, %sp
 	mov	%g3, %o1
 	jmpl	%g1, %o7
 	 mov	%i7, %o0
@@ -56,7 +56,7 @@ mcount:
 	 nop
 5:	mov	%i7, %g2
 	mov	%fp, %g3
-	save	%sp, -128, %sp
+	save	%sp, -176, %sp
 	mov	%g2, %l0
 	ba,pt	%xcc, ftrace_graph_caller
 	 mov	%g3, %l1
@@ -85,7 +85,7 @@ ftrace_caller:
 	lduw	[%g1 + %lo(function_trace_stop)], %g1
 	brnz,pn	%g1, ftrace_stub
 	 mov	%fp, %g3
-	save	%sp, -128, %sp
+	save	%sp, -176, %sp
 	mov	%g2, %o1
 	mov	%g2, %l0
 	mov	%g3, %l1
@@ -120,7 +120,7 @@ ENTRY(ftrace_graph_caller)
 END(ftrace_graph_caller)

 ENTRY(return_to_handler)
-	save	%sp, -128, %sp
+	save	%sp, -176, %sp
 	call	ftrace_return_to_handler
 	 mov	%fp, %o0
 	jmpl	%o0 + 8, %g0
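[Editorial aside: as a sanity check on the arithmetic behind the patch,
the minimal sparc64 frame can be modeled as a C struct: 16
register-window save slots (8 locals + 8 ins, written by a window spill
trap) plus the 6 slots a callee may use to save its first six incoming
arguments, 8 bytes each.  The struct and its field names are
illustrative only; this just demonstrates why "save %sp, -176, %sp" is
the minimum when calling C code.]

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout of the minimal SPARC V9 stack frame. */
struct minimal_v9_frame {
	uint64_t locals[8];	/* %l0-%l7 window spill slots   */
	uint64_t ins[8];	/* %i0-%i7 window spill slots   */
	uint64_t arg_save[6];	/* callee argument-save slots   */
};
/* sizeof(struct minimal_v9_frame) == 128 + 48 == 176 */
```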