Message ID | 4ABC7486.8040500@kernel.org (mailing list archive) |
---|---|
State | Superseded, archived |
Tejun Heo wrote:
> Tejun Heo wrote:
>> Hello,
>>
>> Sachin Sant wrote:
>>> <4>PERCPU: chunk 1 relocating -1 -> 18 c0000000db70fb00
>>> <c0000000db70fb00:c0000000db70fb00>
>>> <4>PERCPU: relocated <c000000001120320:c000000001120320>
>>> <4>PERCPU: chunk 1 relocating 18 -> 16 c0000000db70fb00
>>> <c000000001120320:c000000001120320>
>>> <4>PERCPU: relocated <c000000001120300:c000000001120300>
>>> <4>PERCPU: chunk 1, alloc pages [0,1)
>>> <4>PERCPU: chunk 1, map pages [0,1)
>>> <4>PERCPU: map 0xd00007fffff00000, 1 pages 53544
>>> <4>PERCPU: map 0xd00007fffff80000, 1 pages 53545
>>> <4>PERCPU: chunk 1, will clear 4096b/unit d00007fffff00000 d00007fffff80000
>>> <3>INFO: RCU detected CPU 0 stall (t=1000 jiffies)
>>
>> This supports my hypothesis. This is the first area being allocated
>> from a dynamic chunk and cleared. PFN 53544 and 53545 have been
>> allocated and successfully mapped to 0xd00007fffff00000 and
>> 0xd00007fffff80000 using map_kernel_range_noflush() but when those
>> addresses are actually accessed, we end up with infinite faults. The
>> fault handler probably thinks that the fault has been handled
>> correctly but, when the control is returned, the processor faults
>> again. Benjamin, I'm way out of my depth here, can you please help?
>>
>> Oh, one more simple experiment. Sachin, does the following patch make
>> any difference?

With this patch applied the machine boots OK :-)
Have attached the boot log. Note that this boot log is from a different
machine, but the reported problem can be recreated on this machine as well.

Thanks
-Sachin

> Oops, the patch should look like the following.
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 69511e6..37ab9e2 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2056,7 +2056,8 @@ static unsigned long pvm_determine_end(struct vmap_area **pnext,
>  				       struct vmap_area **pprev,
>  				       unsigned long align)
>  {
> -	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
> +	const unsigned long vmalloc_start = ALIGN(VMALLOC_START, align);
> +	const unsigned long vmalloc_end = vmalloc_start + (512 << 20);
>  	unsigned long addr;
>  
>  	if (*pnext)
> @@ -2102,7 +2103,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>  				     size_t align, gfp_t gfp_mask)
>  {
>  	const unsigned long vmalloc_start = ALIGN(VMALLOC_START, align);
> -	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
> +	const unsigned long vmalloc_end = vmalloc_start + (512 << 20);
>  	struct vmap_area **vas, *prev, *next;
>  	struct vm_struct **vms;
>  	int area, area2, last_area, term_area;

Sachin Sant wrote:
> With this patch applied the machine boots OK :-)
Ah... so, the problem really is too high address. If you've got some
time, it might be interesting to find out how far high is safe.
Thanks.
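
Editorial sketch: one way to approach "how far high is safe" is to bisect the cap that the test patch hard-codes as 512 << 20. The C below only illustrates the search logic. Everything in it is hypothetical: `probe_fn`, `highest_safe_limit()`, `fake_probe()` and `EX_FAKE_BREAKPOINT` are invented names, and the breakpoint value is an arbitrary stand-in, not a measurement. On real hardware each probe would presumably be a rebuild and boot with a different cap, and the search assumes the failure is monotonic in the address, which the thread has not established.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe: nonzero if the machine boots with this cap
 * (an offset above the aligned vmalloc start, like 512 << 20). */
typedef int (*probe_fn)(uint64_t limit);

/* Invented stand-in threshold so the search can be exercised in
 * userspace. It is NOT a value reported anywhere in this thread. */
#define EX_FAKE_BREAKPOINT (3ULL << 30)

int fake_probe(uint64_t limit)
{
	return limit <= EX_FAKE_BREAKPOINT;
}

/*
 * Bisect between a known-good cap `lo` and a known-bad cap `hi`.
 * Returns a good cap that lies within `step` of the first bad one,
 * assuming every cap below a good cap is also good.
 */
uint64_t highest_safe_limit(uint64_t lo, uint64_t hi, uint64_t step,
			    probe_fn probe)
{
	assert(step != 0 && lo < hi);

	/* Invariant: probe(lo) succeeded, probe(hi) failed. */
	while (hi - lo > step) {
		uint64_t mid = lo + (hi - lo) / 2;

		if (probe(mid))
			lo = mid;
		else
			hi = mid;
	}
	return lo;
}
```

Starting from lo = 512 << 20 (reported to boot) and hi set to the size of the whole vmalloc range (the configuration that fails), a 1 MiB step needs roughly two dozen probes for a range of a few terabytes.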
On Fri, 2009-09-25 at 18:01 +0900, Tejun Heo wrote:
> > With this patch applied the machine boots OK :-)
>
> Ah... so, the problem really is too high address. If you've got some
> time, it might be interesting to find out how far high is safe.

Might give me a clue about what the problem is but I think I'll just
cook up a test case that forcibly vmap something high up and see how it
goes from there. It could be a very old bug that nobody ever noticed
because our vmalloc space on 64-bit is so huge :-)

Cheers,
Ben.

Benjamin Herrenschmidt wrote:
> On Fri, 2009-09-25 at 18:01 +0900, Tejun Heo wrote:
>>> With this patch applied the machine boots OK :-)
>>
>> Ah... so, the problem really is too high address. If you've got some
>> time, it might be interesting to find out how far high is safe.
>
> Might give me a clue about what the problem is but I think I'll just
> cook up a test case that forcibly vmap something high up and see how it
> goes from there. It could be a very old bug that nobody ever noticed
> because our vmalloc space on 64-bit is so huge :-)

I still have this problem with 2.6.32-rc3. Here is the relevant information

0:mon> t
[link register   ] c0000000001a7f78 .pcpu_alloc+0x798/0xa04
[c0000000033e37f0] c0000000001a7f08 .pcpu_alloc+0x728/0xa04 (unreliable)
[c0000000033e3920] c0000000001a8278 .__alloc_percpu+0x3c/0x58
[c0000000033e39b0] c0000000005d1ad0 .snmp_mib_init+0x64/0xb0
[c0000000033e3a40] c0000000005d1c00 .ipv4_mib_init_net+0xe4/0x1f8
[c0000000033e3b00] c00000000055b608 .setup_net+0x78/0x138
[c0000000033e3ba0] c00000000055be38 .copy_net_ns+0x9c/0x148
[c0000000033e3c30] c0000000000d06d8 .create_new_namespaces+0x120/0x1e4
[c0000000033e3ce0] c0000000000d09e0 .unshare_nsproxy_namespaces+0x7c/0xfc
[c0000000033e3d80] c00000000009dd74 .SyS_unshare+0x148/0x33c
[c0000000033e3e30] c0000000000085b4 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 00000fff8b0ab978
SP (fffe633fe30) is in userspace
0:mon> e
cpu 0x0: Vector: 501 (Hardware Interrupt) at [c0000000033e3570]
    pc: c00000000004bdc0: .memset+0x60/0xfc
    lr: c0000000001a7f78: .pcpu_alloc+0x798/0xa04
    sp: c0000000033e37f0
   msr: 8000000000009032
  current = 0xc000000003270860
  paca    = 0xc0000000010c2600
    pid   = 3442, comm = two_children_ns
0:mon> r
R00 = 0000000000000040   R07 = d00007fffff00000
R01 = c0000000033e37f0   R08 = 0000000000000000
R02 = c000000000fe7c78   R09 = c000000001700180
R03 = d00007fffff00000   R10 = c000000001095aa0
R04 = 0000000000000000   R11 = 00000000000003c0
R05 = 0000000000000000   R12 = 0000000048004428
R06 = d00007fffff00000   R13 = c0000000010c2600
pc  = c00000000004bdc0 .memset+0x60/0xfc
lr  = c0000000001a7f78 .pcpu_alloc+0x798/0xa04
msr = 8000000000009032   cr  = 44004420
ctr = 0000000000000040   xer = 0000000020000020   trap =  501
0:mon> di $.memset
c00000000004bd60  7c0300d0      neg     r0,r3
c00000000004bd64  5084442e      rlwimi  r4,r4,8,16,23
c00000000004bd68  70000007      andi.   r0,r0,7
c00000000004bd6c  5084801e      rlwimi  r4,r4,16,0,15
c00000000004bd70  7c850040      cmplw   cr1,r5,r0
c00000000004bd74  7884000e      rldimi  r4,r4,32,0
c00000000004bd78  7c101120      mtocrf  1,r0
c00000000004bd7c  7c661b78      mr      r6,r3
c00000000004bd80  418400ac      blt     cr1,c00000000004be2c    # .memset+0xcc/0xfc
c00000000004bd84  41e2002c      beq+    c00000000004bdb0        # .memset+0x50/0xfc
c00000000004bd88  7ca02850      subf    r5,r0,r5
c00000000004bd8c  409f000c      bns     cr7,c00000000004bd98    # .memset+0x38/0xfc
c00000000004bd90  98860000      stb     r4,0(r6)
c00000000004bd94  38c60001      addi    r6,r6,1
c00000000004bd98  409e000c      bne     cr7,c00000000004bda4    # .memset+0x44/0xfc
c00000000004bd9c  b0860000      sth     r4,0(r6)
0:mon>
c00000000004bda0  38c60002      addi    r6,r6,2
c00000000004bda4  409d000c      ble     cr7,c00000000004bdb0    # .memset+0x50/0xfc
c00000000004bda8  90860000      stw     r4,0(r6)
c00000000004bdac  38c60004      addi    r6,r6,4
c00000000004bdb0  78a0d183      rldicl. r0,r5,58,6
c00000000004bdb4  78a506a0      clrldi  r5,r5,58
c00000000004bdb8  7c0903a6      mtctr   r0
c00000000004bdbc  4182002c      beq     c00000000004bde8        # .memset+0x88/0xfc
c00000000004bdc0  f8860000      std     r4,0(r6)

At this point R06 contains d00007fffff00000.

Have attached the xmon log.

Thanks
-Sachin

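
Editorial sketch: the register dump can be cross-checked against the earlier "will clear 4096b/unit" log line with a little arithmetic. The C below mirrors the two instructions just before the stalled store: `rldicl. r0,r5,58,6` is a right shift of the length by 6, and `clrldi r5,r5,58` keeps its low 6 bits. It also computes how far the stalled address sits above the start of the 0xd region. `EX_REGION_BASE` and the function names are assumptions made for illustration. The thread itself never states where the vmalloc region starts.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed start of the region the address falls in; not stated in
 * the thread, chosen as the 0xd... prefix of the logged addresses. */
#define EX_REGION_BASE 0xd000000000000000ULL

/* Distance of an address above the assumed region base. */
uint64_t offset_in_region(uint64_t addr)
{
	assert(addr >= EX_REGION_BASE);
	return addr - EX_REGION_BASE;
}

/* Equivalent of "rldicl. r0,r5,58,6": length divided by 64, which
 * memset then loads into ctr with mtctr. */
uint64_t memset_block_count(uint64_t len)
{
	return len >> 6;
}

/* Equivalent of "clrldi r5,r5,58": the length modulo 64. */
uint64_t memset_tail_bytes(uint64_t len)
{
	return len & 0x3f;
}
```

For a 4096-byte clear, `memset_block_count()` gives 0x40 and `memset_tail_bytes()` gives 0, matching R00 = ctr = 0x40 and R05 = 0 in the dump. R06 still equals R03, which suggests that not even the first doubleword store to the new mapping has completed. Under the assumed base, the address is 0x7fffff00000 bytes into the region, which is 1 MiB short of 8 TiB.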
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 69511e6..37ab9e2 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2056,7 +2056,8 @@ static unsigned long pvm_determine_end(struct vmap_area **pnext,
 				       struct vmap_area **pprev,
 				       unsigned long align)
 {
-	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
+	const unsigned long vmalloc_start = ALIGN(VMALLOC_START, align);
+	const unsigned long vmalloc_end = vmalloc_start + (512 << 20);
 	unsigned long addr;
 
 	if (*pnext)
@@ -2102,7 +2103,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 				     size_t align, gfp_t gfp_mask)
 {
 	const unsigned long vmalloc_start = ALIGN(VMALLOC_START, align);
-	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
+	const unsigned long vmalloc_end = vmalloc_start + (512 << 20);
 	struct vmap_area **vas, *prev, *next;
 	struct vm_struct **vms;
 	int area, area2, last_area, term_area;
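
Editorial sketch: the effect of the test patch on the upper search bound can be reproduced in userspace. The C below evaluates both expressions from the diff for a given alignment. `EX_VMALLOC_START` and `EX_VMALLOC_END` are assumed example values, picked only so that the addresses in the log fall just below the end. The real constants come from the architecture headers and are not quoted in this thread. The helper names are likewise made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed example bounds, NOT taken from any kernel header. */
#define EX_VMALLOC_START 0xd000000000000000ULL
#define EX_VMALLOC_END   0xd000080000000000ULL

/* Round x up to a power-of-two boundary, like the kernel's ALIGN(). */
uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Bound before the patch: VMALLOC_END & ~(align - 1). */
uint64_t end_before_patch(uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return EX_VMALLOC_END & ~(align - 1);
}

/* Bound with the patch: aligned start plus 512 << 20 (512 MiB). */
uint64_t end_with_patch(uint64_t align)
{
	return align_up(EX_VMALLOC_START, align) + (512ULL << 20);
}
```

With a 64 KiB alignment and these assumed bounds, the patched limit is 0xd000000020000000, so the logged address 0xd00007fffff00000 lies far above it. This fits the observation that the machine boots once percpu areas are kept near the bottom of the vmalloc range.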