Message ID | 20090916075721.GA20122@orion.carnet.hr |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
From: Josip Rodin <joy@entuzijast.net>
Date: Wed, 16 Sep 2009 09:57:22 +0200

> On Tue, Sep 15, 2009 at 12:05:30PM +0200, Sébastien Bernard wrote:
>>> I was able to reproduce the hang with your originally posted config.
>>>
>>> It only triggers when CONFIG_PROM_CONSOLE is enabled
>>>
>> Very good news.
>> I had a lot of trouble because of this.
>> I had to disable it for initcall_debug to work, but then, any kernel
>> patched or not was working.
>> I thought I messed up between patches and built kernels.
>> I realize now that I had the explanation under my nose.
>> Duh....
>
> I had just recompiled and booted my SMP kernel without the dreaded
> PROM_CONSOLE, and yet the machine got stuck in the same place in the
> NMI code. I then removed initcall_debug, and it still doesn't work.
> What am I missing?!
>
> The .config diff between the working UP kernel and this one is:

I put up the kernel image I booted with your config (sans PROM_CONSOLE)
at:

http://vger.kernel.org/~davem/vmlinux-debug

give it a try.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller wrote:
> I put up the kernel image I booted with your config (sans PROM_CONSOLE)
> at:
>
> http://vger.kernel.org/~davem/vmlinux-debug
>
> give it a try.

Works for me ok.
On Wed, Sep 16, 2009 at 01:14:35AM -0700, David Miller wrote: > >> I thought I messed up between patches and built kernels. > >> I realize now that I had the explanation under my nose. > >> Duh.... > > > > I had just recompiled and booted my SMP kernel without the dreaded > > PROM_CONSOLE, and yet the machine got stuck in the same place in the > > NMI code. I then removed initcall_debug, and it still doesn't work. > > What am I missing?! > > > > The .config diff between the working UP kernel and this one is: > > I put up the kernel image I booted with your config (sans PROM_CONSOLE) > at: > > http://vger.kernel.org/~davem/vmlinux-debug > > give it a try. Yours works fine... I've reverted all the interim patches and rebuilt mine, and it still won't boot, hanging in the same place again. I don't get it. I'm attaching the exact .config used in my last attempt, just in case. diff says PROM_CONSOLE is the only change. I'll be going over your image's config (extracted from /proc/config.gz).
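The config comparison described in the message above (checking that PROM_CONSOLE is the only change, then going over a config extracted from /proc/config.gz) could be sketched roughly as follows. This is an illustration only, not the tooling anyone in the thread used; the helper names `parse_kconfig`, `diff_kconfig` and `read_config` are hypothetical, and treating an absent option as "n" is an assumption made for the comparison.

```python
import gzip
import re


def parse_kconfig(text):
    """Return {option: value} for a kernel .config; '# X is not set' maps to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        enabled = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if enabled:
            options[enabled.group(1)] = enabled.group(2)
            continue
        disabled = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if disabled:
            options[disabled.group(1)] = "n"
    return options


def diff_kconfig(old_text, new_text):
    """Return {option: (old_value, new_value)} for every option that differs.

    Assumption: an option missing from one file is treated as 'n'.
    """
    old, new = parse_kconfig(old_text), parse_kconfig(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name, "n"), new.get(name, "n")
        if before != after:
            changes[name] = (before, after)
    return changes


def read_config(path):
    """Read a plain .config or a gzip-compressed one such as /proc/config.gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read()
```

With two files at hand, something like `diff_kconfig(read_config("config-mine"), read_config("/proc/config.gz"))` would list the differing options; `config-mine` is a placeholder file name.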
From: Josip Rodin <joy@entuzijast.net>
Date: Wed, 16 Sep 2009 14:49:00 +0200

> I'm attaching the exact .config used in my last attempt, just in case.

Where is that attachment? :-) I want to try it again here myself as a
double check.
On Wed, Sep 16, 2009 at 03:48:18PM -0700, David Miller wrote: > From: Josip Rodin <joy@entuzijast.net> > Date: Wed, 16 Sep 2009 14:49:00 +0200 > > > I'm attaching the exact .config used in my last attempt, just in case. > > Where is that attachment? :-) I want to try it again here myself as a > double check. Oh sorry, standard attachment error :) Now it's attached.
From: Josip Rodin <joy@entuzijast.net>
Date: Thu, 17 Sep 2009 01:29:37 +0200

> On Wed, Sep 16, 2009 at 03:48:18PM -0700, David Miller wrote:
>> From: Josip Rodin <joy@entuzijast.net>
>> Date: Wed, 16 Sep 2009 14:49:00 +0200
>>
>> > I'm attaching the exact .config used in my last attempt, just in case.
>>
>> Where is that attachment? :-) I want to try it again here myself as a
>> double check.
>
> Oh sorry, standard attachment error :) Now it's attached.

I built my kernel using v2.6.30.x -stable FWIW.

I'll try straight 2.6.31 with your config. I wonder if we're
in the territory of some compiler issue.
From: Josip Rodin <joy@entuzijast.net>
Date: Thu, 17 Sep 2009 01:29:37 +0200

> On Wed, Sep 16, 2009 at 03:48:18PM -0700, David Miller wrote:
>> From: Josip Rodin <joy@entuzijast.net>
>> Date: Wed, 16 Sep 2009 14:49:00 +0200
>>
>> > I'm attaching the exact .config used in my last attempt, just in case.
>>
>> Where is that attachment? :-) I want to try it again here myself as a
>> double check.
>
> Oh sorry, standard attachment error :) Now it's attached.

BTW, while we're on the topic of configurations, dists should use a
CONFIG_NR_CPUS value of at least 256 as that's the highest cpu count
out there on real sparc64 systems (Bartoka 4 socket Niagara-T2+, 64
cpus per socket, 4 * 64 == 256).
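The CONFIG_NR_CPUS recommendation above comes down to simple arithmetic plus a check against the config file. A minimal sketch of that check, assuming the config is available as text; the function names `nr_cpus` and `covers_largest_system` are hypothetical and not part of any kernel tooling:

```python
import re

# Figures taken from the message above: 4 sockets, 64 cpus per socket.
SOCKETS = 4
CPUS_PER_SOCKET = 64
REQUIRED_NR_CPUS = SOCKETS * CPUS_PER_SOCKET  # 4 * 64 == 256


def nr_cpus(config_text):
    """Return the CONFIG_NR_CPUS value from a .config, or None if it is absent."""
    match = re.search(r"^CONFIG_NR_CPUS=(\d+)$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def covers_largest_system(config_text, required=REQUIRED_NR_CPUS):
    """True if the config allows at least `required` cpus."""
    value = nr_cpus(config_text)
    return value is not None and value >= required
```

A distribution build script could run such a check over its sparc64 SMP config before building; that use is a suggestion, not something described in the thread.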
On Wed, Sep 16, 2009 at 04:49:34PM -0700, David Miller wrote: > I built my kernel using v2.6.30.x -stable FWIW. > > I'll try straight 2.6.31 with your config. I wonder if we're > in the territory of some compiler issue. I'm guessing that compile is finished by now :) can you please post the link for me to try and boot?
From: Josip Rodin <joy@entuzijast.net>
Date: Fri, 18 Sep 2009 15:24:51 +0200

> On Wed, Sep 16, 2009 at 04:49:34PM -0700, David Miller wrote:
>> I built my kernel using v2.6.30.x -stable FWIW.
>>
>> I'll try straight 2.6.31 with your config. I wonder if we're
>> in the territory of some compiler issue.
>
> I'm guessing that compile is finished by now :) can you please post the link
> for me to try and boot?

I didn't get around to it, I have to run off now to LinuxCon and
LinuxPlumbers so all of this will have to wait 2 weeks.

Sorry.
On Fri, Sep 18, 2009 at 10:19:39AM -0700, David Miller wrote: > >> I built my kernel using v2.6.30.x -stable FWIW. > >> > >> I'll try straight 2.6.31 with your config. I wonder if we're > >> in the territory of some compiler issue. > > > > I'm guessing that compile is finished by now :) can you please post the link > > for me to try and boot? > > I didn't get around to it, I have to run off now to LinuxCon and > LinuxPlumbers so all of this will have to wait 2 weeks. > > Sorry. FWIW I've since compiled and tested latest stable .30, .29 and .28, just to make sure I'm not hitting a heisenbug, but the symptoms are unchanged for .30 and .29, while .28 works. The compiler is the Debian 'stable' sparc gcc, nothing fancy there.
On Thu, Sep 24, 2009 at 12:03:37AM +0200, Josip Rodin wrote: > On Fri, Sep 18, 2009 at 10:19:39AM -0700, David Miller wrote: > > >> I built my kernel using v2.6.30.x -stable FWIW. > > >> > > >> I'll try straight 2.6.31 with your config. I wonder if we're > > >> in the territory of some compiler issue. > > > > > > I'm guessing that compile is finished by now :) can you please post the link > > > for me to try and boot? > > > > I didn't get around to it, I have to run off now to LinuxCon and > > LinuxPlumbers so all of this will have to wait 2 weeks. > > > > Sorry. > > FWIW I've since compiled and tested latest stable .30, .29 and .28, just to > make sure I'm not hitting a heisenbug, but the symptoms are unchanged for > .30 and .29, while .28 works. The compiler is the Debian 'stable' sparc gcc, > nothing fancy there. Josip, did you manage to figure out what's going on here? Did we conclude it is a toolchain issue? Cheers.
On Sat, Oct 31, 2009 at 03:08:47PM +0000, Jurij Smakov wrote: > > > >> I built my kernel using v2.6.30.x -stable FWIW. > > > >> > > > >> I'll try straight 2.6.31 with your config. I wonder if we're > > > >> in the territory of some compiler issue. > > > > > > > > I'm guessing that compile is finished by now :) can you please post the link > > > > for me to try and boot? > > > > > > I didn't get around to it, I have to run off now to LinuxCon and > > > LinuxPlumbers so all of this will have to wait 2 weeks. > > > > > > Sorry. > > > > FWIW I've since compiled and tested latest stable .30, .29 and .28, just to > > make sure I'm not hitting a heisenbug, but the symptoms are unchanged for > > .30 and .29, while .28 works. The compiler is the Debian 'stable' sparc gcc, > > nothing fancy there. > > Josip, did you manage to figure out what's going on here? Did we conclude it is > a toolchain issue? No idea, I got stuck there and reverted to .28. Then the machine started exhibiting some other issues so it was reverted to .26. :/
From: Josip Rodin <joy@entuzijast.net>
Date: Sat, 31 Oct 2009 17:48:06 +0100

> No idea, I got stuck there and reverted to .28. Then the machine started
> exhibiting some other issues so it was reverted to .26. :/

Sorry for dropping the ball on this one.

As promised long ago, here is a 2.6.31.6 kernel built with your
2.6.31 config file. Let me know if it exhibits the bootup problem
so we can diagnose further:

http://vger.kernel.org/~davem/josip_test_2631_6.img

Thanks!
On Mon, Nov 23, 2009 at 12:24:46PM -0800, David Miller wrote: > As promised long ago, here is a 2.6.31.6 kernel built with you > 2.6.31 config file. Let me know if it exhibits the bootup problem > so we can diagnose further: > > http://vger.kernel.org/~davem/josip_test_2631_6.img Tested on my SunFire480, output until hang is attached. Greetings Hermann Rebooting with command: boot gem:dhcp Boot device: /pci@8,600000/network@2:dhcp File and args: Timed out waiting for BOOTP/DHCP reply Timed out waiting for BOOTP/DHCP reply Timed out waiting for BOOTP/DHCP reply \ PROMLIB: Sun IEEE Boot Prom 'OBP 4.22.34 2007/07/23 13:01' PROMLIB: Root node compatible: Linux version 2.6.31.6 (davem@huronp11) (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) #2 SMP Mon Nov 23 12:18:05 PST 2009 console [earlyprom0] enabled ARCH: SUN4U Ethernet address: 00:03:ba:29:7c:9f Kernel: Using 1 locked TLB entries for main kernel image. Remapping the kernel... done. OF stdout device is: /pci@9,700000/ebus@1/rsc-console@1,3083f8 PROM: Built device tree with 104327 bytes of memory. 
Top of RAM: 0xa3ffb22000, Total RAM: 0x3ffad6000 Memory hole size: 655360MB [0000000340000000-fffff8a000800000] page_structs=131072 node=0 entry=1280/0 [0000000340000000-fffff8a000c00000] page_structs=131072 node=0 entry=1281/0 [0000000340800000-fffff8a001000000] page_structs=131072 node=0 entry=1282/0 [0000000340800000-fffff8a001400000] page_structs=131072 node=0 entry=1283/0 [0000000341000000-fffff8a001800000] page_structs=131072 node=0 entry=1284/0 [0000000341000000-fffff8a001c00000] page_structs=131072 node=0 entry=1285/0 [0000000341800000-fffff8a002000000] page_structs=131072 node=0 entry=1286/0 [0000000341800000-fffff8a002400000] page_structs=131072 node=0 entry=1287/0 [0000000342000000-fffff8a002800000] page_structs=131072 node=0 entry=1288/0 [0000000342000000-fffff8a002c00000] page_structs=131072 node=0 entry=1289/0 [0000000342800000-fffff8a003000000] page_structs=131072 node=0 entry=1290/0 [0000000342800000-fffff8a003400000] page_structs=131072 node=0 entry=1291/0 [0000000343000000-fffff8a003800000] page_structs=131072 node=0 entry=1292/0 [0000000343000000-fffff8a003c00000] page_structs=131072 node=0 entry=1293/0 [0000000343800000-fffff8a004000000] page_structs=131072 node=0 entry=1294/0 [0000000343800000-fffff8a004400000] page_structs=131072 node=0 entry=1295/0 [0000000344000000-fffff8a004800000] page_structs=131072 node=0 entry=1296/0 [0000000344000000-fffff8a004c00000] page_structs=131072 node=0 entry=1297/0 [0000000344800000-fffff8a005000000] page_structs=131072 node=0 entry=1298/0 [0000000344800000-fffff8a005400000] page_structs=131072 node=0 entry=1299/0 [0000000345000000-fffff8a005800000] page_structs=131072 node=0 entry=1300/0 [0000000345000000-fffff8a005c00000] page_structs=131072 node=0 entry=1301/0 [0000000345800000-fffff8a006000000] page_structs=131072 node=0 entry=1302/0 [0000000345800000-fffff8a006400000] page_structs=131072 node=0 entry=1303/0 [0000000346000000-fffff8a006800000] page_structs=131072 node=0 entry=1304/0 
[0000000346000000-fffff8a006c00000] page_structs=131072 node=0 entry=1305/0 [0000000346800000-fffff8a007000000] page_structs=131072 node=0 entry=1306/0 [0000000346800000-fffff8a007400000] page_structs=131072 node=0 entry=1307/0 [0000000347000000-fffff8a007800000] page_structs=131072 node=0 entry=1308/0 [0000000347000000-fffff8a007c00000] page_structs=131072 node=0 entry=1309/0 [0000000347800000-fffff8a008000000] page_structs=131072 node=0 entry=1310/0 [0000000347800000-fffff8a008400000] page_structs=131072 node=0 entry=1311/0 Zone PFN ranges: Normal 0x05000000 -> 0x051ffd91 Movable zone start PFN for each node early_node_map[4] active PFN ranges 0: 0x05000000 -> 0x051ff7ff 0: 0x051ff800 -> 0x051ffd5c 0: 0x051ffd80 -> 0x051ffd8f 0: 0x051ffd90 -> 0x051ffd91 Booting Linux... Built 1 zonelists in Zone order, mobility grouping on. Total pages: 2080111 Kernel command line: PID hash table entries: 4096 (order: 12, 32768 bytes) Dentry cache hash table entries: 2097152 (order: 11, 16777216 bytes) Inode-cache hash table entries: 1048576 (order: 10, 8388608 bytes) Memory: 16610992k available (2488k kernel code, 936k data, 192k init) [fffff80000000000,000000a3ffb22000] NR_IRQS:255 clocksource: mult[640000] shift[16] clockevent: mult[28f5c28] shift[32]
From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>
Date: Mon, 23 Nov 2009 22:11:27 +0100

> On Mon, Nov 23, 2009 at 12:24:46PM -0800, David Miller wrote:
>> As promised long ago, here is a 2.6.31.6 kernel built with you
>> 2.6.31 config file. Let me know if it exhibits the bootup problem
>> so we can diagnose further:
>>
>> http://vger.kernel.org/~davem/josip_test_2631_6.img
>
> Tested on my SunFire480, output until hang is attached.

Thanks for testing. Although it wasn't meant to fix your problem :-)
On Mon, Nov 23, 2009 at 12:24:46PM -0800, David Miller wrote: > From: Josip Rodin <joy@entuzijast.net> > Date: Sat, 31 Oct 2009 17:48:06 +0100 > > > No idea, I got stuck there and reverted to .28. Then the machine started > > exhibiting some other issues so it was reverted to .26. :/ > > Sorry for dropping the ball on this one. > > As promised long ago, here is a 2.6.31.6 kernel built with you > 2.6.31 config file. Let me know if it exhibits the bootup problem > so we can diagnose further: > > http://vger.kernel.org/~davem/josip_test_2631_6.img > > Thanks! It works! Compiler issue? Here's the early portion of dmesg for the record: boot: LinuxDaveM Allocated 8 Megs of memory at 0x40000000 for kernel Loaded kernel version 2.6.31 PROMLIB: Sun IEEE Boot Prom 'OBP 4.11.4 2003/07/23 08:04' PROMLIB: Root node compatible: Linux version 2.6.31.6 (davem@huronp11) (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) #2 SMP Mon Nov 23 12:18:05 PST 2009 console [earlyprom0] enabled ARCH: SUN4U Ethernet address: 00:03:ba:5a:53:a5 Kernel: Using 1 locked TLB entries for main kernel image. Remapping the kernel... done. OF stdout device is: /pci@1e,600000/isa@7/serial@0,3f8 PROM: Built device tree with 85818 bytes of memory. 
Top of RAM: 0x123fedc000, Total RAM: 0xffed4000 Memory hole size: 70656MB [0000000200000000-fffff80000400000] page_structs=131072 node=0 entry=0/0 [0000000200000000-fffff80000800000] page_structs=131072 node=0 entry=1/0 [0000000204000000-fffff80000c00000] page_structs=131072 node=0 entry=16/0 [0000000204000000-fffff80001000000] page_structs=131072 node=0 entry=17/0 [0000000220000000-fffff80001400000] page_structs=131072 node=0 entry=128/0 [0000000220000000-fffff80001800000] page_structs=131072 node=0 entry=129/0 [0000000224000000-fffff80001c00000] page_structs=131072 node=0 entry=144/0 [0000000224000000-fffff80002000000] page_structs=131072 node=0 entry=145/0 Zone PFN ranges: Normal 0x00000000 -> 0x0091ff6e Movable zone start PFN for each node early_node_map[7] active PFN ranges 0: 0x00000000 -> 0x00020000 0: 0x00100000 -> 0x00120000 0: 0x00800000 -> 0x00820000 0: 0x00900000 -> 0x0091f7ff 0: 0x0091f800 -> 0x0091fef3 0: 0x0091fef5 -> 0x0091ff60 0: 0x0091ff61 -> 0x0091ff6e Booting Linux... Built 1 zonelists in Zone order, mobility grouping on. Total pages: 449387 Kernel command line: root=/dev/md1 ro md=0,/dev/sda1,/dev/sdb1 md=1,/dev/sda2,/dev/sdb2 md=2,/dev/sda4,/dev/sdb4 md: Will configure md0 (super-block) from /dev/sda1,/dev/sdb1, below. md: Will configure md1 (super-block) from /dev/sda2,/dev/sdb2, below. md: Will configure md2 (super-block) from /dev/sda4,/dev/sdb4, below. 
PID hash table entries: 4096 (order: 12, 32768 bytes) Dentry cache hash table entries: 524288 (order: 9, 4194304 bytes) Inode-cache hash table entries: 262144 (order: 8, 2097152 bytes) Memory: 4148904k available (2488k kernel code, 936k data, 192k init) [fffff80000000000,000000123fedc000] NR_IRQS:255 clocksource: mult[535555] shift[16] clockevent: mult[3126e97] shift[32] Console: colour dummy device 80x25 console handover: boot [earlyprom0] -> real [tty0] PROMLIB: Sun IEEE Boot Prom 'OBP 4.11.4 2003/07/23 08:04' PROMLIB: Root node compatible: Linux version 2.6.31.6 (davem@huronp11) (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) #2 SMP Mon Nov 23 12:18:05 PST 2009 console [earlyprom0] enabled ARCH: SUN4U Ethernet address: 00:03:ba:5a:53:a5 Kernel: Using 1 locked TLB entries for main kernel image. Remapping the kernel... done. OF stdout device is: /pci@1e,600000/isa@7/serial@0,3f8 PROM: Built device tree with 85818 bytes of memory. Top of RAM: 0x123fedc000, Total RAM: 0xffed4000 Memory hole size: 70656MB [0000000200000000-fffff80000400000] page_structs=131072 node=0 entry=0/0 [0000000200000000-fffff80000800000] page_structs=131072 node=0 entry=1/0 [0000000204000000-fffff80000c00000] page_structs=131072 node=0 entry=16/0 [0000000204000000-fffff80001000000] page_structs=131072 node=0 entry=17/0 [0000000220000000-fffff80001400000] page_structs=131072 node=0 entry=128/0 [0000000220000000-fffff80001800000] page_structs=131072 node=0 entry=129/0 [0000000224000000-fffff80001c00000] page_structs=131072 node=0 entry=144/0 [0000000224000000-fffff80002000000] page_structs=131072 node=0 entry=145/0 Zone PFN ranges: Normal 0x00000000 -> 0x0091ff6e Movable zone start PFN for each node early_node_map[7] active PFN ranges 0: 0x00000000 -> 0x00020000 0: 0x00100000 -> 0x00120000 0: 0x00800000 -> 0x00820000 0: 0x00900000 -> 0x0091f7ff 0: 0x0091f800 -> 0x0091fef3 0: 0x0091fef5 -> 0x0091ff60 0: 0x0091ff61 -> 0x0091ff6e Booting Linux... Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 449387 Kernel command line: root=/dev/md1 ro md=0,/dev/sda1,/dev/sdb1 md=1,/dev/sda2,/dev/sdb2 md=2,/dev/sda4,/dev/sdb4 md: Will configure md0 (super-block) from /dev/sda1,/dev/sdb1, below. md: Will configure md1 (super-block) from /dev/sda2,/dev/sdb2, below. md: Will configure md2 (super-block) from /dev/sda4,/dev/sdb4, below. PID hash table entries: 4096 (order: 12, 32768 bytes) Dentry cache hash table entries: 524288 (order: 9, 4194304 bytes) Inode-cache hash table entries: 262144 (order: 8, 2097152 bytes) Memory: 4148904k available (2488k kernel code, 936k data, 192k init) [fffff80000000000,000000123fedc000] NR_IRQS:255 clocksource: mult[535555] shift[16] clockevent: mult[3126e97] shift[32] Console: colour dummy device 80x25 console handover: boot [earlyprom0] -> real [tty0] Calibrating delay using timer specific routine.. 24.01 BogoMIPS (lpj=48029) Mount-cache hash table entries: 512 CPU 0: synchronized TICK with master CPU (last diff 0 cycles, maxerr 6 cycles) Brought up 2 CPUs NET: Registered protocol family 16 Testing NMI watchdog ... OK. 
/pci@1f,700000: TOMATILLO PCI Bus Module ver[4:0] /pci@1f,700000: PCI IO[7f601000000] MEM[7f700000000] PCI: Scanning PBM /pci@1f,700000 pci 0000:00:02.0: PME# supported from D3hot pci 0000:00:02.0: PME# disabled pci 0000:00:02.1: PME# supported from D3hot pci 0000:00:02.1: PME# disabled /pci@1e,600000: TOMATILLO PCI Bus Module ver[4:0] /pci@1e,600000: PCI IO[7fe01000000] MEM[7ff00000000] PCI: Scanning PBM /pci@1e,600000 pci 0001:00:06.0: quirk: region 0800-083f claimed by ali7101 ACPI pci 0001:00:06.0: quirk: region 0600-061f claimed by ali7101 SMB pci 0001:00:0a.0: PME# supported from D3cold pci 0001:00:0a.0: PME# disabled pci 0001:00:02.0: PME# supported from D0 D2 D3hot pci 0001:00:02.0: PME# disabled pci 0001:00:02.1: PME# supported from D0 D2 D3hot pci 0001:00:02.1: PME# disabled /pci@1c,600000: TOMATILLO PCI Bus Module ver[4:0] /pci@1c,600000: PCI IO[7ce01000000] MEM[7cf00000000] PCI: Scanning PBM /pci@1c,600000 /pci@1d,700000: TOMATILLO PCI Bus Module ver[4:0] /pci@1d,700000: PCI IO[7c601000000] MEM[7c700000000] PCI: Scanning PBM /pci@1d,700000 pci 0003:00:02.0: PME# supported from D3hot pci 0003:00:02.0: PME# disabled pci 0003:00:02.1: PME# supported from D3hot pci 0003:00:02.1: PME# disabled bio: create slab <bio-0> at 0 SCSI subsystem initialized /pci@1e,600000/isa@7/rtc@0,70: RTC regs at 0x7fe01000070 NET: Registered protocol family 2 IP route cache hash table entries: 131072 (order: 7, 1048576 bytes) TCP established hash table entries: 524288 (order: 10, 8388608 bytes) TCP bind hash table entries: 65536 (order: 7, 1048576 bytes) TCP: Hash tables configured (established 524288 bind 65536) TCP reno registered NET: Registered protocol family 1 power: Control reg at 7fe01000800 chmc: UltraSPARC-IIIi memory controller at /memory-controller@0,0 chmc: UltraSPARC-IIIi memory controller at /memory-controller@1,0 msgmni has been set to 8104 io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq 
registered (default) pci 0001:00:07.0: Activating ISA DMA hang workarounds f00990ec: ttyS0 at MMIO 0x7fe010003f8 (irq = 15) is a 16550A Console: ttyS0 (SU) console [ttyS0] enabled f009ab54: ttyS1 at MMIO 0x7fe010002e8 (irq = 15) is a 16550A brd: module loaded loop: module loaded sym0: <1010-66> rev 0x1 at pci 0002:00:02.0 irq 25 sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking sym0: SCSI BUS has been reset. scsi0 : sym-2.2.3 scsi 0:0:0:0: Direct-Access SEAGATE ST336607LSUN36G 0507 PQ: 0 ANSI: 3 target0:0:0: tagged command queuing enabled, command queue depth 16. target0:0:0: Beginning Domain Validation target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) target0:0:0: Ending Domain Validation scsi 0:0:1:0: Direct-Access SEAGATE ST336607LSUN36G 0507 PQ: 0 ANSI: 3 target0:0:1: tagged command queuing enabled, command queue depth 16. target0:0:1: Beginning Domain Validation target0:0:1: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) target0:0:1: Ending Domain Validation scsi 0:0:2:0: Direct-Access SEAGATE ST373307LC 0003 PQ: 0 ANSI: 3 target0:0:2: tagged command queuing enabled, command queue depth 16. target0:0:2: Beginning Domain Validation target0:0:2: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) target0:0:2: Ending Domain Validation scsi 0:0:3:0: Direct-Access SEAGATE ST373307LC 0003 PQ: 0 ANSI: 3 target0:0:3: tagged command queuing enabled, command queue depth 16. target0:0:3: Beginning Domain Validation target0:0:3: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) target0:0:3: Ending Domain Validation sym1: <1010-66> rev 0x1 at pci 0002:00:02.1 irq 26 sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking sym1: SCSI BUS has been reset. 
scsi1 : sym-2.2.3 sd 0:0:0:0: [sda] 71132959 512-byte logical blocks: (36.4 GB/33.9 GiB) mice: PS/2 mouse device common for all mice sd 0:0:1:0: [sdb] 71132959 512-byte logical blocks: (36.4 GB/33.9 GiB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:2:0: [sdc] 143374744 512-byte logical blocks: (73.4 GB/68.3 GiB) sd 0:0:3:0: [sdd] 143374744 512-byte logical blocks: (73.4 GB/68.3 GiB) sd 0:0:1:0: [sdb] Write Protect is off sd 0:0:2:0: [sdc] Write Protect is off sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA sd 0:0:2:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA sdc: sda: sdb: sdb1 sdb2 sdb3 sdb4 sdb5 sda1 sda2 sda3 sda4 sda5 sdc1 sdc3 rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0 sd 0:0:2:0: [sdc] Attached SCSI disk sd 0:0:0:0: [sda] Attached SCSI disk sd 0:0:1:0: [sdb] Attached SCSI disk sd 0:0:3:0: [sdd] Write Protect is off sd 0:0:3:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA rtc0: no alarms, 114 bytes nvram md: raid1 personality registered for level 1 TCP cubic registered NET: Registered protocol family 17 rtc_cmos rtc_cmos: setting system clock to 2009-11-23 22:27:15 UTC (1259015235) sdd: sdd1 sdd3 sd 0:0:3:0: [sdd] Attached SCSI disk md: Waiting for all devices to be available before autodetect md: If you don't use raid, use raid=noautodetect md: Autodetecting RAID arrays. md: Scanned 8 and added 8 devices. md: autorun ... md: considering sdd1 ... [...]
From: Josip Rodin <joy@entuzijast.net>
Date: Mon, 23 Nov 2009 23:27:34 +0100

> On Mon, Nov 23, 2009 at 12:24:46PM -0800, David Miller wrote:
>> From: Josip Rodin <joy@entuzijast.net>
>> Date: Sat, 31 Oct 2009 17:48:06 +0100
>>
>> > No idea, I got stuck there and reverted to .28. Then the machine started
>> > exhibiting some other issues so it was reverted to .26. :/
>>
>> Sorry for dropping the ball on this one.
>>
>> As promised long ago, here is a 2.6.31.6 kernel built with you
>> 2.6.31 config file. Let me know if it exhibits the bootup problem
>> so we can diagnose further:
>>
>> http://vger.kernel.org/~davem/josip_test_2631_6.img
>>
>> Thanks!
>
> It works! Compiler issue?

Something like that. It could also just be compiled "differently" by
your gcc and expose some race or bug.
On Mon, Nov 23, 2009 at 02:32:04PM -0800, David Miller wrote: > From: Josip Rodin <joy@entuzijast.net> > Date: Mon, 23 Nov 2009 23:27:34 +0100 > > > On Mon, Nov 23, 2009 at 12:24:46PM -0800, David Miller wrote: > >> From: Josip Rodin <joy@entuzijast.net> > >> Date: Sat, 31 Oct 2009 17:48:06 +0100 > >> > >> > No idea, I got stuck there and reverted to .28. Then the machine started > >> > exhibiting some other issues so it was reverted to .26. :/ > >> > >> Sorry for dropping the ball on this one. > >> > >> As promised long ago, here is a 2.6.31.6 kernel built with you > >> 2.6.31 config file. Let me know if it exhibits the bootup problem > >> so we can diagnose further: > >> > >> http://vger.kernel.org/~davem/josip_test_2631_6.img > >> > >> Thanks! > > > > It works! Compiler issue? > > Something like that. It could also just be compiled "differently" by > your gcc and expose some race or bug. OK. Yours has gcc 4.2.4, and our ones have gcc 4.3.2 (that we shipped as "stable" :) I also just tried a newer packaged image, and it has the same issue. It comes from the linux-image-2.6.30-bpo.2-sparc64-smp package, which you can get from: http://www.backports.org/debian/pool/main/l/linux-2.6/linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb To extract, use: dpkg-deb -x linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb newdir And then you have newdir/boot/vmlinuz-2.6.30-bpo.2-sparc64-smp etc
From: Josip Rodin <joy@entuzijast.net>
Date: Mon, 23 Nov 2009 23:40:28 +0100

> On Mon, Nov 23, 2009 at 02:32:04PM -0800, David Miller wrote:
>> Something like that. It could also just be compiled "differently" by
>> your gcc and expose some race or bug.
>
> OK. Yours has gcc 4.2.4, and our ones have gcc 4.3.2 (that we shipped
> as "stable" :)
>
> I also just tried a newer packaged image, and it has the same issue.

I just tossed lenny onto my main build system and I will try to
reproduce this and track it down.
From: Josip Rodin <joy@entuzijast.net>
Date: Mon, 23 Nov 2009 23:40:28 +0100

> http://www.backports.org/debian/pool/main/l/linux-2.6/linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb
>
> To extract, use:
>
> dpkg-deb -x linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb newdir
>
> And then you have newdir/boot/vmlinuz-2.6.30-bpo.2-sparc64-smp etc

I've tried everything to reproduce this.

I've tried building 2.6.31.6 -stable from Josip's config using
Debian stable's compiler (gcc-4.3.2).

I've also tried the image in that dpkg.

All of them boot fine on my two similarly configured UltraSPARC-IIIi
systems.

I'll try to think some more about this, but meanwhile if you have some
means by which to make your V240 available to me online to do some
debugging, that would be really useful.

As you can see from Hermann providing me access to his V480, once I
have access I tend to fix the bug within a day or two :-)
On Mon, Nov 30, 2009 at 09:23:36PM -0800, David Miller wrote: > From: Josip Rodin <joy@entuzijast.net> > Date: Mon, 23 Nov 2009 23:40:28 +0100 > > > http://www.backports.org/debian/pool/main/l/linux-2.6/linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb > > > > To extract, use: > > > > dpkg-deb -x linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb newdir > > > > And then you have newdir/boot/vmlinuz-2.6.30-bpo.2-sparc64-smp etc > > I've tried everything to reproduce this. > > I've tried building 2.6.31.6 -stable from Josip's config using > Debian stable's compiler (gcc-4.3.2) I think you need to build with the gcc from unstable to reproduce the failure. Lenny kernels have been booting fine on my box (SunBlade 1000), unstable kernels started failing about 3 months ago. I might have some time over the weekend to build the current unstable kernel with both stable and unstable gcc to verify that it's unstable gcc which causes problems. > I've also tried the image in that dpkg. > > All of them boot fine on my two similarly configured UltraSPARC-IIIi > systems. > > I'll try to think some more about this, but meanwhile if you have some > means by which to make your V240 available to me online to do somet > debugging that would be really useful. > > You can see from Hermann's providing access to his V480 to me once I > have access I tend to fix the bug within a day or two :-) > > > -- > To UNSUBSCRIBE, email to debian-sparc-REQUEST@lists.debian.org > with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
On Tue, Dec 01, 2009 at 09:42:12PM +0000, Jurij Smakov wrote: > > > http://www.backports.org/debian/pool/main/l/linux-2.6/linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb > > > > I've tried everything to reproduce this. > > > > I've tried building 2.6.31.6 -stable from Josip's config using > > Debian stable's compiler (gcc-4.3.2) > > I think you need to build with the gcc from unstable to reproduce the > failure. Lenny kernels have been booting fine on my box (SunBlade 1000), > unstable kernels started failing about 3 months ago. Hm, OK, but that doesn't help explain why that exact image, a lenny backport, didn't work here... JFTR the difference would be 4.3.2 vs. 4.3.4 per http://packages.debian.org/gcc-4.3
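Since the discussion keeps coming back to which gcc built which image (4.2.4, 4.3.2, 4.3.4), the compiler version can be read out of a kernel's "Linux version" banner and compared. A small sketch under the assumption that the banner contains the text "gcc version X.Y.Z", as the banners quoted in this thread do; `gcc_version` is a hypothetical helper name:

```python
import re


def gcc_version(banner):
    """Return the gcc version in a 'Linux version ...' banner as a tuple of ints.

    Returns None if the banner has no 'gcc version X.Y[.Z]' text.
    """
    match = re.search(r"gcc version (\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))
```

Tuples compare element by element, so `gcc_version(a) < gcc_version(b)` orders two banners by compiler version.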
On Tue, Dec 01, 2009 at 11:57:45PM +0100, Josip Rodin wrote:
> On Tue, Dec 01, 2009 at 09:42:12PM +0000, Jurij Smakov wrote:
> > > > http://www.backports.org/debian/pool/main/l/linux-2.6/linux-image-2.6.30-bpo.2-sparc64-smp_2.6.30-8~bpo50+1_sparc.deb
> > >
> > > I've tried everything to reproduce this.
> > >
> > > I've tried building 2.6.31.6 -stable from Josip's config using
> > > Debian stable's compiler (gcc-4.3.2)
> >
> > I think you need to build with the gcc from unstable to reproduce the
> > failure. Lenny kernels have been booting fine on my box (SunBlade 1000),
> > unstable kernels started failing about 3 months ago.
>
> Hm, OK, but that doesn't help explain why that exact image, a lenny
> backport, didn't work here...
>
> JFTR the difference would be 4.3.2 vs. 4.3.4 per
> http://packages.debian.org/gcc-4.3

I've upgraded to the latest unstable on my box today, and this pulled in
the stock Debian 2.6.31 kernel, which, amusingly, boots just fine:

jurij@debian:~$ uname -a
Linux debian 2.6.31-1-sparc64-smp #1 SMP Mon Nov 16 14:12:48 UTC 2009 sparc GNU/Linux
jurij@debian:~$ zcat /boot/vmlinuz-2.6.31-1-sparc64-smp | strings | grep gcc | head -1
Linux version 2.6.31-1-sparc64-smp (Debian 2.6.31-2) (ben@decadent.org.uk) (gcc version 4.3.4 (Debian 4.3.4-6) ) #1 SMP Mon Nov 16 14:12:48 UTC 2009

Can you try whether it works for you as well?

Best regards,
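For anyone comparing several images by compiler, the banner check above can be scripted. The following is only a rough sketch, not something anyone in this thread ran: the function names `gcc_version` and `built_with_newer_gcc` are made up for illustration, and the baseline of 4.3.2 is simply the lenny compiler version mentioned earlier in the thread.

```python
import re

# Banner copied from the `strings | grep gcc` output quoted in this thread.
BANNER = (
    "Linux version 2.6.31-1-sparc64-smp (Debian 2.6.31-2) "
    "(ben@decadent.org.uk) (gcc version 4.3.4 (Debian 4.3.4-6) ) "
    "#1 SMP Mon Nov 16 14:12:48 UTC 2009"
)


def gcc_version(banner):
    """Return the gcc version in a kernel banner as a tuple of ints.

    Hypothetical helper: returns None when the banner names no gcc version.
    """
    match = re.search(r"gcc version (\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def built_with_newer_gcc(banner, baseline=(4, 3, 2)):
    """True if the banner's compiler is newer than `baseline`.

    The default baseline (4.3.2) is an assumption taken from the lenny
    compiler version mentioned in the thread.
    """
    version = gcc_version(banner)
    return version is not None and version > baseline


if __name__ == "__main__":
    print(gcc_version(BANNER), built_with_newer_gcc(BANNER))
```

Comparing versions as integer tuples rather than strings avoids surprises with multi-digit components (a string comparison would sort "4.10" before "4.3").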
On Sat, Dec 05, 2009 at 12:18:22PM +0000, Jurij Smakov wrote:
> > Hm, OK, but that doesn't help explain why that exact image, a lenny
> > backport, didn't work here...
>
> I've upgraded to the latest unstable on my box today, and this pulled in the
> stock Debian 2.6.31 kernel, which, amusingly, boots just fine:
>
> jurij@debian:~$ uname -a
> Linux debian 2.6.31-1-sparc64-smp #1 SMP Mon Nov 16 14:12:48 UTC 2009 sparc GNU/Linux
> jurij@debian:~$ zcat /boot/vmlinuz-2.6.31-1-sparc64-smp | strings | grep gcc | head -1
> Linux version 2.6.31-1-sparc64-smp (Debian 2.6.31-2) (ben@decadent.org.uk) (gcc version 4.3.4 (Debian 4.3.4-6) ) #1 SMP Mon Nov 16 14:12:48 UTC 2009
>
> Can you try whether it works for you as well?

I've tried that one right now, and this is what I got:

boot: Linux
Allocated 8 Megs of memory at 0x40000000 for kernel
Uncompressing image...
Loaded kernel version 2.6.31
Loading initial ramdisk (7069568 bytes at 0x1200000000 phys, 0x40C00000 virt)...
/
[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 4.11.4 2003/07/23 08:04'
[ 0.000000] PROMLIB: Root node compatible:
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.31-1-sparc64-smp (Debian 2.6.31-2) (ben@decadent.org.uk) (gcc version 4.3.4 (Debian 4.3.4-6) ) #1 SMP Mon Nov 16 14:12:48 UTC 2009
[ 0.000000] console [earlyprom0] enabled
[ 0.000000] ARCH: SUN4U
[ 0.000000] Ethernet address: 00:03:ba:5a:53:a5
[ 0.000000] Kernel: Using 2 locked TLB entries for main kernel image.
[ 0.000000] Remapping the kernel... done.
[ 0.000000] OF stdout device is: /pci@1e,600000/isa@7/serial@0,3f8
[ 0.000000] PROM: Built device tree with 85794 bytes of memory.
[ 0.000000] Top of RAM: 0x123fedc000, Total RAM: 0xffed0000
[ 0.000000] Memory hole size: 70656MB
[ 0.000000] [0000000200000000-fffff80000400000] page_structs=131072 node=0 entry=0/0
[ 0.000000] [0000000200000000-fffff80000800000] page_structs=131072 node=0 entry=1/0
[ 0.000000] [0000000204000000-fffff80000c00000] page_structs=131072 node=0 entry=16/0
[ 0.000000] [0000000204000000-fffff80001000000] page_structs=131072 node=0 entry=17/0
[ 0.000000] [0000000220000000-fffff80001400000] page_structs=131072 node=0 entry=128/0
[ 0.000000] [0000000220000000-fffff80001800000] page_structs=131072 node=0 entry=129/0
[ 0.000000] [0000000224000000-fffff80001c00000] page_structs=131072 node=0 entry=144/0
[ 0.000000] [0000000224000000-fffff80002000000] page_structs=131072 node=0 entry=145/0
[ 0.000000] Zone PFN ranges:
[ 0.000000] Normal 0x00000000 -> 0x0091ff6e
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[7] active PFN ranges
[ 0.000000] 0: 0x00000000 -> 0x00020000
[ 0.000000] 0: 0x00100000 -> 0x00120000
[ 0.000000] 0: 0x00800000 -> 0x00820000
[ 0.000000] 0: 0x00900000 -> 0x0091f7ff
[ 0.000000] 0: 0x0091f800 -> 0x0091fef3
[ 0.000000] 0: 0x0091fef5 -> 0x0091ff5e
[ 0.000000] 0: 0x0091ff61 -> 0x0091ff6e
[ 0.000000] Booting Linux...
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 449385
[ 0.000000] Kernel command line: root=/dev/md1 ro rootdelay=10 console=ttyS0,9600n1
[ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[ 0.000000] Dentry cache hash table entries: 524288 (order: 9, 4194304 bytes)
[ 0.000000] Inode-cache hash table entries: 262144 (order: 8, 2097152 bytes)
[ 0.000000] Memory: 4140168k available (3440k kernel code, 1336k data, 216k init) [fffff80000000000,000000123fedc000]
[ 0.000000] NR_IRQS:255
[ 0.000000] clocksource: mult[535555] shift[16]
[ 0.000000] clockevent: mult[3126e97] shift[32]
[ 40.900976] Console: colour dummy device 80x25
[ 41.039420] Calibrating delay using timer specific routine.. 24.01 BogoMIPS (lpj=48029)
[ 41.144787] Security Framework initialized
[ 41.198547] SELinux: Disabled at boot.
[ 41.248905] Mount-cache hash table entries: 512
[ 41.308793] Initializing cgroup subsys ns
[ 41.361476] Initializing cgroup subsys cpuacct
[ 41.419802] Initializing cgroup subsys devices
[ 41.478128] Initializing cgroup subsys freezer
[ 41.536458] Initializing cgroup subsys net_cls
[ 41.596728] CPU 0: synchronized TICK with master CPU (last diff 1 cycles, maxerr 6 cycles)
[ 41.596742] Brought up 2 CPUs
[ 41.744948] regulator: core version 0.5
[ 41.795563] NET: Registered protocol family 16
[

There it hung. Doesn't look much different from before.
On Sat, Dec 05, 2009 at 02:00:33PM +0100, Josip Rodin wrote:
> I've tried that one right now, and this is what I got:
>
> [ 41.596742] Brought up 2 CPUs
> [ 41.744948] regulator: core version 0.5
> [ 41.795563] NET: Registered protocol family 16
> [
>
> There it hung. Doesn't look much different from before.
[...]

Just a fix confirmation for our mailing list - http://bugs.debian.org/572442
--- /boot/config.old 2009-09-14 12:08:04.000000000 +0000
+++ /boot/config 2009-09-16 07:44:16.000000000 +0000
@@ -4 +4 @@
-# Mon Sep 14 11:47:09 2009
+# Tue Sep 15 09:59:03 2009
@@ -35 +35 @@
-CONFIG_BROKEN_ON_SMP=y
+CONFIG_LOCK_KERNEL=y
@@ -113,0 +114 @@
+CONFIG_USE_GENERIC_SMP_HELPERS=y
@@ -129,0 +131 @@
+CONFIG_STOP_MACHINE=y
@@ -152 +154,2 @@
-# CONFIG_SMP is not set
+CONFIG_SMP=y
+CONFIG_NR_CPUS=32
@@ -163,0 +167 @@
+CONFIG_SPARC64_SMP=y
@@ -166,0 +171 @@
+# CONFIG_HOTPLUG_CPU is not set
@@ -172,0 +178 @@
+# CONFIG_NUMA is not set
@@ -193,0 +200,2 @@
+CONFIG_SCHED_SMT=y
+CONFIG_SCHED_MC=y
@@ -1141 +1149 @@
-CONFIG_PROM_CONSOLE=y
+# CONFIG_PROM_CONSOLE is not set
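A line-based diff like the one above also reports the timestamp comment and depends on option order. As a rough alternative, the two configs can be compared symbol by symbol. This is only a sketch: `parse_config` and `config_changes` are hypothetical names, the `OLD` and `NEW` fragments are a small subset of the hunks above rather than full config files, and treating "# CONFIG_X is not set" as the value "n" is a simplification chosen for this example.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ symbol to its value ('n' for '# ... is not set').

    Other comments, such as the timestamp header, are ignored.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def config_changes(old_text, new_text):
    """Return {symbol: (old, new)} for every symbol whose value differs.

    A symbol absent from one side is reported as None on that side.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Fragments taken from a few hunks of the diff above (illustrative subset).
OLD = """\
# Mon Sep 14 11:47:09 2009
CONFIG_BROKEN_ON_SMP=y
# CONFIG_SMP is not set
CONFIG_PROM_CONSOLE=y
"""
NEW = """\
# Tue Sep 15 09:59:03 2009
CONFIG_LOCK_KERNEL=y
CONFIG_SMP=y
CONFIG_NR_CPUS=32
# CONFIG_PROM_CONSOLE is not set
"""

if __name__ == "__main__":
    for name, (before, after) in config_changes(OLD, NEW).items():
        print(f"{name}: {before} -> {after}")
```

When two configs are expected to differ in a single option, as with the PROM_CONSOLE-only rebuild discussed earlier in the thread, `config_changes` should return exactly one entry, which makes an accidental extra change easy to spot.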