
Openbios upgrade broke sparc32 linux.

Message ID 51D5EEAD.9010103@caramail.com
State New

Commit Message

Olivier DANET July 4, 2013, 9:52 p.m. UTC
On 29/06/2013 22:29, Olivier Danet wrote:
> On 28/06/2013 23:44, Mark Cave-Ayland wrote:
>> On 28/06/13 03:08, Rob Landley wrote:
>>
>>> Commit 467b34689d27 upgraded the openbios image, and ever since my
>>> linux system images hang about the time they try to initialize
>>> interrupts.
>>>
>>> http://landley.net/aboriginal/bin/system-image-sparc.tar.bz2
>>>
>>> Extract that and "./run-emulator.sh" in the tarball. Using qemu 1.2.0
>>> for example works fine, you get a shell prompt. Using 1.5.0 hangs.
>>>
>>> Rob
>>
>> Hi Rob,
>>
>> Thanks for the bug report. I did a quick bisect on OpenBIOS and it 
>> points to the following commit:
>>
>> commit 167aafd70f64e74a77787ca5bf9f4dc750b27fc3
>> Author: blueswirl <blueswirl@f158a5a8-5612-0410-a976-696ce0be7e32>
>> Date:   Sun Feb 3 16:50:11 2013 +0000
>>
>>     SPARC32: microSPARC-II identification
>>
>>     For the microSPARC-II = Fujitsu MB86904 = Sun STP1012PGA,
>>     PSR.IMPL=0 and PSR.VERS=4.
>>
>>     This CPU model is used as default by QEMU when emulating
>>     a SparcStation-4 or SparcStation-5.
>>
>>     Signed-off-by: Olivier DANET <odanet@caramail.com>
>>     Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
>>
>>
>> The commit itself is very simple and looks like this: 
>> http://git.qemu.org/?p=openbios.git;a=commitdiff;h=0fe772df8717ef75d91eae8ef221e9966ce2fd7f.
>>
>> My guess would be that Linux is trying to do some slightly different 
>> initialisation based upon identifying the CPU, but I'm not too 
>> familiar with the kernel code myself. Blue/Olivier - can either of 
>> you comment on this?
>>
>>
>> ATB,
>>
>> Mark.
>
> How embarrassing...
>
> - QEMU 1.5.1 can boot Debian Etch (kernel 2.6.18), RedHat 4.2 (kernel 
> 2.0.30), NetBSD 6.1 and OpenBSD 5.3.
>
> - Your image (Linux 3.8) can be started with a TurboSparc CPU : qemu 
> -cpu "Fujitsu MB86907".
>
> - My SparcStation-5 has a 110MHz MicroSPARC-II and the .attributes 
> (aka .properties) fields are identical to the OpenBIOS values, except 
> for the mask_rev: I have 0x26, OpenBIOS sets 0x23.
>
> Before the patch, OpenBIOS had an inconsistency between the PSR register 
> content and the BIOS-defined values.
> In Linux "arch/sparc/mm/srmmu.c:get_srmmu_type(void)", this corresponds 
> to "a TurboSparc emulating Swift".
> (Swift is the MS-2).
>
> TurboSPARC could be the new QEMU default but, ideally, the MS-II 
> should be preferred, as it is compatible with more OSes (hoping to run 
> NextStep in QEMU one day ...).
>
> Maybe recent Linux kernels are not compatible with the way QEMU 
> emulates the MS-II...
>
> Regards
> Olivier
> [temlib.org]
>
>
>

Hello
I think I have found the problem.

Each SPARC CPU model uses different MMU TLB management functions.
For Linux, the callbacks are set in arch/sparc/mm/srmmu.c: 
xxx_flush_tlb_all, xxx_flush_tlb_mm, xxx_flush_tlb_range, 
xxx_flush_tlb_page.
The assembly code used for the MicroSPARC-II is arch/sparc/mm/swift.S. 
This code accesses the vm_mm member of vm_area_struct 
(include/linux/mm_types.h).

The position of the vm_mm field in the structure was changed recently, 
but the assembly, which hard-codes the field offset, was not adjusted 
accordingly.
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/include/linux/mm_types.h?id=e4c6bfd2d79d063017ab19a18915f0bc759f32d9)
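
To illustrate the offset shift, here is a minimal C sketch. The two structs are simplified stand-ins, not the real vm_area_struct: every member is a 32-bit word so the layout mimics sparc32 on any host, and the member order of vma_after is assumed from the commit linked above. The struct and helper names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in (NOT the kernel definition): vm_mm comes first,
 * which is what the old "ld [%o0 + 0x0]" in swift.S relied on. */
struct vma_before {
    uint32_t vm_mm;
    uint32_t vm_start;
    uint32_t vm_end;
};

/* Simplified stand-in for the rearranged layout (assumed member order):
 * eight 32-bit words now precede vm_mm. */
struct vma_after {
    uint32_t vm_start;
    uint32_t vm_end;
    uint32_t vm_next;
    uint32_t vm_prev;
    uint32_t vm_rb[3];        /* stand-in for a three-word rb_node */
    uint32_t rb_subtree_gap;
    uint32_t vm_mm;
};

/* Offset of vm_mm in the old stand-in layout. */
size_t vm_mm_offset_before(void)
{
    return offsetof(struct vma_before, vm_mm);
}

/* Offset of vm_mm in the rearranged stand-in layout. */
size_t vm_mm_offset_after(void)
{
    /* All members are uint32_t, so no padding is expected. */
    assert(sizeof(struct vma_after) == 9 * sizeof(uint32_t));
    return offsetof(struct vma_after, vm_mm);
}
```

With these layouts the helpers return 0x0 and 0x20, the same two values that appear as the old and new immediates in the swift.S patch. In the sketch, a load from offset 0x0 of the rearranged struct would fetch vm_start instead of the mm pointer.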

The bug was introduced in Linux 3.8.

Here is a patch for swift.S. There are also issues in hypersparc.S, 
viking.S, tsunami.S ...:

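==========================================================================
diff -up linux_prev/arch/sparc/mm/swift.S linux/arch/sparc/mm/swift.S
--- linux_prev/arch/sparc/mm/swift.S    2013-07-04 23:16:37.785273225 +0200
+++ linux/arch/sparc/mm/swift.S 2013-07-04 23:30:50.445310001 +0200
@@ -105,7 +105,7 @@ swift_flush_cache_mm_out:

         .globl  swift_flush_cache_range
  swift_flush_cache_range:
-       ld      [%o0 + 0x0], %o0                /* XXX vma->vm_mm, GROSS XXX */
+       ld      [%o0 + 0x20], %o0               /* XXX vma->vm_mm, GROSS XXX */
         sub     %o2, %o1, %o2
         sethi   %hi(4096), %o3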
         cmp     %o2, %o3
@@ -116,7 +116,7 @@ swift_flush_cache_range:

         .globl  swift_flush_cache_page
  swift_flush_cache_page:
-       ld      [%o0 + 0x0], %o0                /* XXX vma->vm_mm, GROSS XXX */
+       ld      [%o0 + 0x20], %o0               /* XXX vma->vm_mm, GROSS XXX */
  70:
         ld      [%o0 + AOFF_mm_context], %g2
         cmp     %g2, -1
@@ -219,7 +219,7 @@ swift_flush_sig_insns:
         .globl  swift_flush_tlb_range
         .globl  swift_flush_tlb_all
  swift_flush_tlb_range:
-       ld      [%o0 + 0x00], %o0       /* XXX vma->vm_mm GROSS XXX */
+       ld      [%o0 + 0x20], %o0       /* XXX vma->vm_mm GROSS XXX */
  swift_flush_tlb_mm:
         ld      [%o0 + AOFF_mm_context], %g2
         cmp     %g2, -1
@@ -233,7 +233,7 @@ swift_flush_tlb_all_out:

         .globl  swift_flush_tlb_page
  swift_flush_tlb_page:
-       ld      [%o0 + 0x00], %o0       /* XXX vma->vm_mm GROSS XXX */
+       ld      [%o0 + 0x20], %o0       /* XXX vma->vm_mm GROSS XXX */
         mov     SRMMU_CTX_REG, %g1
         ld      [%o0 + AOFF_mm_context], %o3
         andn    %o1, (PAGE_SIZE - 1), %o1
==========================================================================

For a cleaner fix, arch/sparc/kernel/asm_offsets.c should be modified.
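
A minimal sketch of that idea, assuming a generated constant replaces the literal 0x20. The symbol name VMA_VM_MM, the stand-in struct and both functions are hypothetical; the kernel build has its own machinery around asm_offsets.c, which the plain snprintf below only imitates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified 32-bit stand-in for the rearranged layout
 * (NOT the kernel's vm_area_struct). */
struct vma_sketch {
    uint32_t words_before_vm_mm[8];
    uint32_t vm_mm;
};

/* Formats one "#define NAME 0x..." line, as a generated offsets header
 * might contain. Returns the value returned by snprintf. */
int emit_offset(char *buf, size_t len, const char *name, size_t value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "#define %s 0x%zx", name, value);
}

/* Emits the hypothetical VMA_VM_MM constant. The value is computed by the
 * compiler from the struct, so it follows any future member reordering. */
int emit_vma_vm_mm(char *buf, size_t len)
{
    return emit_offset(buf, len, "VMA_VM_MM",
                       offsetof(struct vma_sketch, vm_mm));
}
```

The assembly would then load through the symbol, e.g. "ld [%o0 + VMA_VM_MM], %o0", in the same way swift.S already uses AOFF_mm_context, and no literal offset would need updating when the structure changes again.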

Cool !
Olivier
[temlib.org]

Comments

Mark Cave-Ayland July 15, 2013, 4:03 p.m. UTC | #1
On 04/07/13 22:52, Olivier Danet wrote:

> The bug was introduced in Linux 3.8
>
> Here is a patch for swift, there are also issues in hypersparc.S,
> viking.S, tsunami.S ...:
>
> ==========================================================================
> diff -up linux_prev/arch/sparc/mm/swift.S linux/arch/sparc/mm/swift.S
> --- linux_prev/arch/sparc/mm/swift.S 2013-07-04 23:16:37.785273225 +0200
> +++ linux/arch/sparc/mm/swift.S 2013-07-04 23:30:50.445310001 +0200
> @@ -105,7 +105,7 @@ swift_flush_cache_mm_out:
>
> .globl swift_flush_cache_range
> swift_flush_cache_range:
> - ld [%o0 + 0x0], %o0 /* XXX vma->vm_mm, GROSS XXX */
> + ld [%o0 + 0x20], %o0 /* XXX vma->vm_mm, GROSS XXX */
> sub %o2, %o1, %o2
> sethi %hi(4096), %o3
> cmp %o2, %o3
> @@ -116,7 +116,7 @@ swift_flush_cache_range:
>
> .globl swift_flush_cache_page
> swift_flush_cache_page:
> - ld [%o0 + 0x0], %o0 /* XXX vma->vm_mm, GROSS XXX */
> + ld [%o0 + 0x20], %o0 /* XXX vma->vm_mm, GROSS XXX */
> 70:
> ld [%o0 + AOFF_mm_context], %g2
> cmp %g2, -1
> @@ -219,7 +219,7 @@ swift_flush_sig_insns:
> .globl swift_flush_tlb_range
> .globl swift_flush_tlb_all
> swift_flush_tlb_range:
> - ld [%o0 + 0x00], %o0 /* XXX vma->vm_mm GROSS XXX */
> + ld [%o0 + 0x20], %o0 /* XXX vma->vm_mm GROSS XXX */
> swift_flush_tlb_mm:
> ld [%o0 + AOFF_mm_context], %g2
> cmp %g2, -1
> @@ -233,7 +233,7 @@ swift_flush_tlb_all_out:
>
> .globl swift_flush_tlb_page
> swift_flush_tlb_page:
> - ld [%o0 + 0x00], %o0 /* XXX vma->vm_mm GROSS XXX */
> + ld [%o0 + 0x20], %o0 /* XXX vma->vm_mm GROSS XXX */
> mov SRMMU_CTX_REG, %g1
> ld [%o0 + AOFF_mm_context], %o3
> andn %o1, (PAGE_SIZE - 1), %o1
> ==========================================================================
>
> For a cleaner fix, arch/sparc/kernel/asm_offsets.c should be modified.
>
> Cool !
> Olivier
> [temlib.org]

Hi Olivier,

Thanks for this - this is great work! Are either you or Rob able to 
chase this upstream on the LKML?


Many thanks,

Mark.

Patch

==========================================================================
diff -up linux_prev/arch/sparc/mm/swift.S linux/arch/sparc/mm/swift.S
--- linux_prev/arch/sparc/mm/swift.S    2013-07-04 23:16:37.785273225 +0200
+++ linux/arch/sparc/mm/swift.S 2013-07-04 23:30:50.445310001 +0200
@@ -105,7 +105,7 @@  swift_flush_cache_mm_out:

         .globl  swift_flush_cache_range
  swift_flush_cache_range:
-       ld      [%o0 + 0x0], %o0                /* XXX vma->vm_mm, GROSS XXX */
+       ld      [%o0 + 0x20], %o0               /* XXX vma->vm_mm, GROSS XXX */
         sub     %o2, %o1, %o2