Message ID | 7f63e8a8-95c5-eeca-dc79-3c13f4d98d39@lwfinger.net (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | warning | Failed to apply on branch powerpc/merge (a5bc6e124219546a81ce334dc9b16483d55e9abf) |
snowpatch_ozlabs/apply_patch | warning | Failed to apply on branch powerpc/next (71c3a888cbcaf453aecf8d2f8fb003271d28073f) |
snowpatch_ozlabs/apply_patch | warning | Failed to apply on branch linus/master (0bf999f9c5e74c7ecf9dafb527146601e5c848b9) |
snowpatch_ozlabs/apply_patch | warning | Failed to apply on branch powerpc/fixes (d4bf905307a1c90a27714ff7a9fd29b0a2ceed98) |
snowpatch_ozlabs/apply_patch | warning | Failed to apply on branch linux-next (51d5d207918d2a5a1da6c7a0a9e249db75521501) |
snowpatch_ozlabs/apply_patch | fail | Failed to apply to any branch |
Christophe, With this patch, it gets further. Sometime after the boot process tries to start process init, it crashes with the unable to read data at 0x000157a0 with a faulting address of 0xc001683c. The screenshot is attached and the gzipped vmlinux is at http://www.lwfinger.com/download/vmlinux2.gz. The patches that were applied for this kernel are also attached, Larry
Larry, Le 14/02/2020 à 00:09, Larry Finger a écrit : > Christophe, > > With this patch, it gets further. Sometime after the boot process tries > to start process init, it crashes with the unable to read data at > 0x000157a0 with a faulting address of 0xc001683c. The screenshot is > attached and the gzipped vmlinux is at > http://www.lwfinger.com/download/vmlinux2.gz. The patches that were > applied for this kernel are also attached, > Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ? I see the problem happens in kprobe_handler(). Can you try without CONFIG_KPROBE ? Christophe
Le 14/02/2020 à 07:24, Christophe Leroy a écrit : > Larry, > > Le 14/02/2020 à 00:09, Larry Finger a écrit : >> Christophe, >> >> With this patch, it gets further. Sometime after the boot process >> tries to start process init, it crashes with the unable to read data >> at 0x000157a0 with a faulting address of 0xc001683c. The screenshot is >> attached and the gzipped vmlinux is at >> http://www.lwfinger.com/download/vmlinux2.gz. The patches that were >> applied for this kernel are also attached, >> > > > Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ? > > I see the problem happens in kprobe_handler(). Can you try without > CONFIG_KPROBE ? > In fact, you hit two bugs. The first one is due to CONFIG_VMAP_STACK. The second one has always existed (at least since kernel source tree has been in git). First bug is in function enter_rtas() which tries to read data on stack by using the linear physical address translation. This cannot be used with VM stack, it must re-enable data MMU translation to access data on the stack. Second bug is in kprobe_handler() function, which does: if (*addr != BREAKPOINT_INSTRUCTION) addr is the address where the 'trap' happened. When a trap happens with MMU disabled, addr contains the physical address of the trap. kprobe_handler() tries to read the instruction using physical address whereas MMU is enabled, so you get a bad access either because the said address is not mapped, or because access to userspace is not allowed. Due to the first bug, you get a 'machine check', and as current->thread.rtas_sp has not been cleared yet, the machine check handler jumps to 'machine_check_in_rtas'. machine_check_in_rtas does a trap, which in turn triggers the second bug. Once the first bug is fixed, the second one should not popup. Can you test patch https://patchwork.ozlabs.org/patch/1237929/ that fixes the first bug ? Christophe
On 2/14/20 12:24 AM, Christophe Leroy wrote: > > Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ? Christophe, When I apply that patch, there is an error at --- a/arch/powerpc/kernel/head_32.S +++ b/arch/powerpc/kernel/head_32.S @@ -301,6 +301,39 @@ MachineCheck: . = 0x300 DO_KVM 0x300 DataAccess: It complains about "an attempt to move .org backwards". Larry
On 2/14/20 12:24 AM, Christophe Leroy wrote: > > Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ? Christophe, When I apply that patch, there is an error at --- a/arch/powerpc/kernel/head_32.S +++ b/arch/powerpc/kernel/head_32.S @@ -301,6 +301,39 @@ MachineCheck: . = 0x300 DO_KVM 0x300 DataAccess: It complains about "an attempt to move .org backwards". When I change the 0x300 to 0x310 in two places, it builds OK. Is that OK? Larry
On 02/14/2020 06:24 PM, Larry Finger wrote: > On 2/14/20 12:24 AM, Christophe Leroy wrote: >> >> Did you try with the patch at >> https://patchwork.ozlabs.org/patch/1237387/ ? > > Christophe, > > When I apply that patch, there is an error at > > --- a/arch/powerpc/kernel/head_32.S > +++ b/arch/powerpc/kernel/head_32.S > @@ -301,6 +301,39 @@ MachineCheck: > . = 0x300 > DO_KVM 0x300 > DataAccess: > > It complains about "an attempt to move .org backwards". > Argh ! > When I change the 0x300 to 0x310 in two places, it builds OK. Is that OK? No you can't do that. The following should solve it for your case. --- diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S index 32875afb3319..f9941b766f63 100644 --- a/arch/powerpc/kernel/head_32.S +++ b/arch/powerpc/kernel/head_32.S @@ -270,6 +270,9 @@ __secondary_hold_acknowledge: * pointer when we take an exception from supervisor mode.) * -- paulus. */ +#ifdef CONFIG_PPC_CHRP +1: b machine_check_in_rtas +#endif . = 0x200 DO_KVM 0x200 MachineCheck: @@ -290,12 +293,9 @@ MachineCheck: 7: EXCEPTION_PROLOG_2 addi r3,r1,STACK_FRAME_OVERHEAD #ifdef CONFIG_PPC_CHRP - bne cr1,1f + bne cr1,1b #endif EXC_XFER_STD(0x200, machine_check_exception) -#ifdef CONFIG_PPC_CHRP -1: b machine_check_in_rtas -#endif /* Data access exception. */ . = 0x300 --- Christophe
Christophe, On 2/14/20 1:35 PM, Christophe Leroy wrote: > --- a/arch/powerpc/kernel/head_32.S > +++ b/arch/powerpc/kernel/head_32.S > @@ -270,6 +270,9 @@ __secondary_hold_acknowledge: > * pointer when we take an exception from supervisor mode.) > * -- paulus. > */ > +#ifdef CONFIG_PPC_CHRP > +1: b machine_check_in_rtas > +#endif > . = 0x200 > DO_KVM 0x200 > MachineCheck: > @@ -290,12 +293,9 @@ MachineCheck: > 7: EXCEPTION_PROLOG_2 > addi r3,r1,STACK_FRAME_OVERHEAD > #ifdef CONFIG_PPC_CHRP > - bne cr1,1f > + bne cr1,1b > #endif > EXC_XFER_STD(0x200, machine_check_exception) > -#ifdef CONFIG_PPC_CHRP > -1: b machine_check_in_rtas > -#endif > > /* Data access exception. */ > . = 0x300 With the above changes and all the other patches applied, the machine finally boots. It is so bloody slow that it takes a long time to do anything, but you finally got all the places that needed patches. I really lost track of how many bugs were fixed in the process, but I can now put that old box aside until time for v5.7.0-rc1. As you can tell, it only gets used to verify that PPC32 is working on real G4 hardware. It has no real value for any other function. Thanks for the help, Larry
Le 15/02/2020 à 03:42, Larry Finger a écrit : > Christophe, > > On 2/14/20 1:35 PM, Christophe Leroy wrote: >> --- a/arch/powerpc/kernel/head_32.S >> +++ b/arch/powerpc/kernel/head_32.S >> @@ -270,6 +270,9 @@ __secondary_hold_acknowledge: >> * pointer when we take an exception from supervisor mode.) >> * -- paulus. >> */ >> +#ifdef CONFIG_PPC_CHRP >> +1: b machine_check_in_rtas >> +#endif >> . = 0x200 >> DO_KVM 0x200 >> MachineCheck: >> @@ -290,12 +293,9 @@ MachineCheck: >> 7: EXCEPTION_PROLOG_2 >> addi r3,r1,STACK_FRAME_OVERHEAD >> #ifdef CONFIG_PPC_CHRP >> - bne cr1,1f >> + bne cr1,1b >> #endif >> EXC_XFER_STD(0x200, machine_check_exception) >> -#ifdef CONFIG_PPC_CHRP >> -1: b machine_check_in_rtas >> -#endif I'll need to make it a bit different because it shoehorns into your config but won't fit if CONFIG_KVM_BOOK3S_32 is added. >> >> /* Data access exception. */ >> . = 0x300 > > With the above changes and all the other patches applied, the machine > finally boots. It is so bloody slow that it takes a long time to do > anything, but you finally got all the places that needed patches. I > really lost track of how many bugs were fixed in the process, but I can > now put that old box aside until time for v5.7.0-rc1. As you can tell, > it only gets used to verify that PPC32 is working on real G4 hardware. > It has no real value for any other function. Yes, I don't have a G4 myself but this is so much nested with other stuff for the powerpc 83xx than we can't avoid the changes impacting the G4 and other hash-MMU based PPC32 allthough the changes I'm doing are not targetted at those platform at first. And as the 83xx is a 603 core, it is non-hash so all hash related things can't be verified. Plus all those small parts like power saving, RTAS, etc... which are more specific. And checking with all possible options is also not easy. VMAP-STACK was really a challenging functionnality, I'm happy it made its way to mainline though. > > Thanks for the help, Thanks to you for testing and for your patience. Christophe
X-Account-Key: account11 X-UIDL: GmailId16b2c9a50ea3b90f X-Mozilla-Status: 0013 X-Mozilla-Status2: 00000000 X-Mozilla-Keys: Delivered-To: larry.finger@gmail.com Received: by 2002:a4a:b2cb:0:0:0:0:0 with SMTP id l11csp8290747ooo; Thu, 6 Jun 2019 04:43:59 -0700 (PDT) X-Google-Smtp-Source: APXvYqwCuWkmxNEorwoFA8dV9n0gnv2GBWEgptpWhjUpg7zmkwpEpAZ0OoddjKTZH+rGzLk5PqDb X-Received: by 2002:a0d:f7c1:: with SMTP id h184mr23332511ywf.461.1559821439502; Thu, 06 Jun 2019 04:43:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1559821439; cv=none; d=google.com; s=arc-20160816; b=cwyAMW7kgYORTfk9s11BVuRyOK1aZ/y91JWwr6KOKPjipJrjl4Z4hU3kdcWdy4XKFx gYTTnCmpLQjMriyYBUCnK6H+TIupoTDV4xc2Zbd0o3MYzlJDnRLoxhQ0tGLa9faooOCk HVI7sohOmcS2usEzXxiIFXmHZeya5uHoc8dqbhVGERSvmUYr7eJay3A3XxiVCfsgUo2x Wdp3LeIQ8WTPcLfY7YXjggvhCrALhNjfjR+cuKRupwAc5ExKfZ6KTUikPOVqAZeDyEk6 FsO52PBlpxzCc+eWSTyrnhUxm1yaluTml84QKHyu4mihylKmeQH4KtH+12GJ2+ZfhDHc Zyww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:delivered-to; bh=kt7o1+XVy+lXH5WOMXil8IA2OM6q4I/uYbcFkZqbRc8=; b=yphupkT6Zz+pYJvawZCl4bXGIugoNj5iCMqRlPqyqbEf4RN+3aY22jD5xpiDlw0jjS 4h5S9kSgQJn14JjtpTfGee/fKwo5JiazPvYu68XodDh2maHWANND00ToKt9H10yeHUUm 09L9jV51XLMRIGvJyTsw+qYDq52Rms2nuS93spZQ+gFrEmpaVPcnua1Mk7kgT+J5hgGa X5Ps+iqFFYXg7SeMwWvx0juniqkBIEUvbVyyb3SKd8FW24HyYx7b/q31/0euFLwj6W8A 8UlSDrDJ4008OUhrjFeibePvH2exlHbj/RPDpDYY4XNIsqQmb+BZulXC09QlzV0aKdLC D/AQ== ARC-Authentication-Results: i=1; mx.google.com; spf=neutral (google.com: 209.17.115.53 is neither permitted nor denied by best guess record for domain of hch@lst.de) smtp.mailfrom=hch@lst.de Return-Path: <hch@lst.de> Received: from atl4mhob15.registeredsite.com (atl4mhob15.registeredsite.com. [209.17.115.53]) by mx.google.com with ESMTP id u187si831499ybb.479.2019.06.06.04.43.59 for <larry.finger@gmail.com>; Thu, 06 Jun 2019 04:43:59 -0700 (PDT) Received-SPF: neutral (google.com: 209.17.115.53 is neither permitted nor denied by best guess record for domain of hch@lst.de) client-ip=209.17.115.53; Authentication-Results: mx.google.com; spf=neutral (google.com: 209.17.115.53 is neither permitted nor denied by best guess record for domain of hch@lst.de) smtp.mailfrom=hch@lst.de Received: from mail.hostingplatform.com (atl4qibmail17pod0.registeredsite.com [10.30.71.59]) by atl4mhob15.registeredsite.com (8.14.4/8.14.4) with ESMTP id x56Bhxx8019893 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL) for <larry.finger@gmail.com>; Thu, 6 Jun 2019 07:43:59 -0400 Received: (qmail 10870 invoked by uid 78); 6 Jun 2019 11:43:58 -0000 Delivered-To: lwfinger.net-Larry.Finger@lwfinger.net Received: (qmail 10858 invoked by uid 0); 6 Jun 2019 11:43:58 -0000 Received: from unknown (HELO atl4mhib12.myregisteredsite.com) (209.17.115.147) by 0 with ESMTPS (DHE-RSA-AES256-GCM-SHA384 encrypted); 6 Jun 2019 11:43:58 -0000 Received: from newverein.lst.de (verein.lst.de [213.95.11.211]) by atl4mhib12.myregisteredsite.com (8.14.4/8.14.4) with ESMTP id x56BhuC5008752 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for <Larry.Finger@lwfinger.net>; Thu, 6 Jun 2019 07:43:57 -0400 Received: by newverein.lst.de (Postfix, from userid 2407) id C2C8268B20; Thu, 6 Jun 2019 13:43:26 +0200 (CEST) Date: Thu, 6 Jun 2019 13:43:25 +0200 From: Christoph Hellwig <hch@lst.de> To: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Aaro Koskinen <aaro.koskinen@iki.fi>, Christoph Hellwig <hch@lst.de>, Christian Zigotzky <chzigotzky@xenosoft.de>, Michael Ellerman <mpe@ellerman.id.au>, Larry Finger <Larry.Finger@lwfinger.net>, linuxppc-dev@lists.ozlabs.org, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [BISECTED REGRESSION] b43legacy broken on G4 PowerBook Message-ID: <20190606114325.GA7497@lst.de> References: <20190605225059.GA9953@darkstar.musicnaut.iki.fi> <dfe6451c93574b61d4bdde4a05c5f8ccf86b31a0.camel@kernel.crashing.org> <20190606093149.GA11598@darkstar.musicnaut.iki.fi> <d87ac9a7faac0d5522cb496d74afc586410fed9c.camel@kernel.crashing.org> <f8df19ffe5b75537045119037459ae9ad4a1de39.camel@kernel.crashing.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <f8df19ffe5b75537045119037459ae9ad4a1de39.camel@kernel.crashing.org> User-Agent: Mutt/1.5.17 (2007-11-01) X-SpamScore: 4 X-MailHub-Apparently-To: Larry.Finger@lwfinger.net X-MailHub-Forwarded: Yes On Thu, Jun 06, 2019 at 08:57:49PM +1000, Benjamin Herrenschmidt wrote: > > Wow... that's an odd amount. One thing we could possibly do is add code > > to limit the amount of RAM when we detect that device.... > > Sent too quickly... I mean that *or* force swiotlb at 30-bits on those systems based > on detecting the presence of that device in the device-tree. swiotlb doesn't really help you, as these days swiotlb on matters for the dma_map* case. What would help is a ZONE_DMA that covers these devices. No need to do the 24-bit x86 does, but 30-bit would do it. WIP patch for testing below: diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index b8286a2013b4..7a367ce87c41 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -319,6 +319,10 @@ struct vm_area_struct; #endif /* __ASSEMBLY__ */ #include <asm/slice.h> +#if 1 /* XXX: pmac? dynamic discovery? */ +#define ARCH_ZONE_DMA_BITS 30 +#else #define ARCH_ZONE_DMA_BITS 31 +#endif #endif /* _ASM_POWERPC_PAGE_H */ diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index cba29131bccc..2540d3b2588c 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -248,7 +248,8 @@ void __init paging_init(void) (long int)((top_of_ram - total_ram) >> 20)); #ifdef CONFIG_ZONE_DMA - max_zone_pfns[ZONE_DMA] = min(max_low_pfn, 0x7fffffffUL >> PAGE_SHIFT); + max_zone_pfns[ZONE_DMA] = min(max_low_pfn, + ((1UL << ARCH_ZONE_DMA_BITS) - 1) >> PAGE_SHIFT); #endif max_zone_pfns[ZONE_NORMAL] = max_low_pfn; #ifdef CONFIG_HIGHMEM