
powerpc: Fix duplicate copying in early boot.

Message ID 20240617023509.5674-1-jinglin.wen@shingroup.cn (mailing list archive)
State Changes Requested
Series powerpc: Fix duplicate copying in early boot.

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 23 jobs.

Commit Message

Jinglin Wen June 17, 2024, 2:35 a.m. UTC
According to the code logic, when the kernel is loaded at address 0 no
copy should be performed, but one is currently being done.

This patch fixes the issue where the kernel code was incorrectly
duplicated to address 0 when booting from address 0.

Signed-off-by: Jinglin Wen <jinglin.wen@shingroup.cn>
---
 arch/powerpc/kernel/head_64.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Michael Ellerman June 17, 2024, 11:28 a.m. UTC | #1
Jinglin Wen <jinglin.wen@shingroup.cn> writes:
> According to the code logic, when the kernel is loaded to address 0,
> no copying operation should be performed, but it is currently being
> done.
>
> This patch fixes the issue where the kernel code was incorrectly
> duplicated to address 0 when booting from address 0.
>
> Signed-off-by: Jinglin Wen <jinglin.wen@shingroup.cn>
> ---
>  arch/powerpc/kernel/head_64.S | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Thanks for the improved change log.

The subject could probably still be clearer, maybe:
  Fix unnecessary copy to 0 when kernel is booted at address 0

Looks like this was introduced by:

  Fixes: b270bebd34e3 ("powerpc/64s: Run at the kernel virtual address earlier in boot")
  Cc: stable@vger.kernel.org # v6.4+

Let me know if you think otherwise.

Just out of interest, how are you hitting this bug? AFAIK none of our
"normal" boot loaders will load the kernel at 0. 

> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index 4690c219bfa4..6c73551bdc50 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -647,7 +647,9 @@ __after_prom_start:
>   * Note: This process overwrites the OF exception vectors.
>   */
>  	LOAD_REG_IMMEDIATE(r3, PAGE_OFFSET)
> -	mr.	r4,r26			/* In some cases the loader may  */
> +	tophys(r4,r26)
> +	cmplwi	cr0,r4,0	/* runtime base addr is zero */
> +	mr	r4,r26			/* In some cases the loader may */
>  	beq	9f			/* have already put us at zero */
	
That is a pretty minimal fix, but I think the code would be clearer if
we just compared the source and destination addresses.

Something like the diff below. Can you confirm that works for you?

cheers

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 4690c219bfa4..6ad1435303f9 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -647,8 +647,9 @@ __after_prom_start:
  * Note: This process overwrites the OF exception vectors.
  */
 	LOAD_REG_IMMEDIATE(r3, PAGE_OFFSET)
-	mr.	r4,r26			/* In some cases the loader may  */
-	beq	9f			/* have already put us at zero */
+	mr	r4, r26			// Load the source address into r4
+	cmpld	cr0, r3, r4		// Check if source == dest
+	beq	9f			// If so skip the copy
 	li	r6,0x100		/* Start offset, the first 0x100 */
 					/* bytes were copied earlier.	 */
Segher Boessenkool June 17, 2024, 4:13 p.m. UTC | #2
Hi!

On Mon, Jun 17, 2024 at 10:35:09AM +0800, Jinglin Wen wrote:
> +	cmplwi	cr0,r4,0	/* runtime base addr is zero */

Just write
   cmpwi r4,0

cr0 is the default, also implicit in many other instructions, please
don't clutter the source code.  All the extra stuff makes you miss the
things that do matter!

The "l" is unnecessary, you only care about equality here after all.

Should it not be cmpdi here, though?


Segher
Jinglin Wen June 18, 2024, 9:34 a.m. UTC | #3
Hi Michael Ellerman,

Michael Ellerman <mpe@ellerman.id.au> writes:
> Jinglin Wen <jinglin.wen@shingroup.cn> writes:
> > According to the code logic, when the kernel is loaded to address 0,
> > no copying operation should be performed, but it is currently being
> > done.
> >
> > This patch fixes the issue where the kernel code was incorrectly
> > duplicated to address 0 when booting from address 0.
> >
> > Signed-off-by: Jinglin Wen <jinglin.wen@shingroup.cn>
> > ---
> >  arch/powerpc/kernel/head_64.S | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> Thanks for the improved change log.
> 
> The subject could probably still be clearer, maybe:
>   Fix unnecessary copy to 0 when kernel is booted at address 0

Thanks for your feedback; I will revise the subject.

> 
> Looks like this was introduced by:
> 
>   Fixes: b270bebd34e3 ("powerpc/64s: Run at the kernel virtual address earlier in boot")
>   Cc: stable@vger.kernel.org # v6.4+
> 
> Let me know if you think otherwise.
> 
> Just out of interest, how are you hitting this bug? AFAIK none of our
> "normal" boot loaders will load the kernel at 0. 
> 
> > diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> > index 4690c219bfa4..6c73551bdc50 100644
> > --- a/arch/powerpc/kernel/head_64.S
> > +++ b/arch/powerpc/kernel/head_64.S
> > @@ -647,7 +647,9 @@ __after_prom_start:
> >   * Note: This process overwrites the OF exception vectors.
> >   */
> >  	LOAD_REG_IMMEDIATE(r3, PAGE_OFFSET)
> > -	mr.	r4,r26			/* In some cases the loader may  */
> > +	tophys(r4,r26)
> > +	cmplwi	cr0,r4,0	/* runtime base addr is zero */
> > +	mr	r4,r26			/* In some cases the loader may */
> >  	beq	9f			/* have already put us at zero */
> 	
> That is a pretty minimal fix, but I think the code would be clearer if
> we just compared the source and destination addresses.
> 
> Something like the diff below. Can you confirm that works for you.
> 
> cheers
> 

As for how I discovered this bug: we use zImage.epapr for emulation, which
loads vmlinux.bin at address 0. When vmlinux.bin is relatively large, the
boot time of Linux 6.6 is much slower than that of Linux 5.10.108. I found
this issue while comparing the code between the two versions.

> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index 4690c219bfa4..6ad1435303f9 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -647,8 +647,9 @@ __after_prom_start:
>   * Note: This process overwrites the OF exception vectors.
>   */
>  	LOAD_REG_IMMEDIATE(r3, PAGE_OFFSET)
> -	mr.	r4,r26			/* In some cases the loader may  */
> -	beq	9f			/* have already put us at zero */
> +	mr	r4, r26			// Load the source address into r4
> +	cmpld	cr0, r3, r4		// Check if source == dest
> +	beq	9f			// If so skip the copy
>  	li	r6,0x100		/* Start offset, the first 0x100 */
>  					/* bytes were copied earlier.	 */

Indeed, your code looks much clearer. I will make the following modifications 
based on your code:

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 4690c219bfa4..751181dfb897 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -647,8 +647,9 @@ __after_prom_start:
  * Note: This process overwrites the OF exception vectors.
  */
        LOAD_REG_IMMEDIATE(r3, PAGE_OFFSET)
-       mr.     r4,r26                  /* In some cases the loader may  */
-       beq     9f                      /* have already put us at zero */
+       mr      r4,r26                  /* Load the virtual source address into r4 */
+       cmpd    r3,r4           /* Check if source == dest */
+       beq     9f                      /* If so skip the copy  */
        li      r6,0x100                /* Start offset, the first 0x100 */
                                        /* bytes were copied earlier.    */ 

Thanks,

Jinglin Wen
Jinglin Wen June 18, 2024, 9:40 a.m. UTC | #4
Hi Segher Boessenkool,

Segher Boessenkool <segher@kernel.crashing.org> writes:
> Hi!
> 
> On Mon, Jun 17, 2024 at 10:35:09AM +0800, Jinglin Wen wrote:
> > +	cmplwi	cr0,r4,0	/* runtime base addr is zero */
> 
> Just write
>    cmpwi r4,0
> 
> cr0 is the default, also implicit in many other instructions, please
> don't clutter the source code.  All the extra stuff makes you miss the
> things that do matter!
> 
> The "l" is unnecessary, you only care about equality here after all.
> 
> Should it not be cmpdi here, though?
> 
> 
> Segher

Thank you very much for your suggestion. I will use cmpd directly from now on.
The updated code is as follows:

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 4690c219bfa4..751181dfb897 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -647,8 +647,9 @@ __after_prom_start:
  * Note: This process overwrites the OF exception vectors.
  */
        LOAD_REG_IMMEDIATE(r3, PAGE_OFFSET)
-       mr.     r4,r26                  /* In some cases the loader may  */
-       beq     9f                      /* have already put us at zero */
+       mr      r4,r26                  /* Load the virtual source address into r4 */
+       cmpd    r3,r4           /* Check if source == dest */
+       beq     9f                      /* If so skip the copy  */
        li      r6,0x100                /* Start offset, the first 0x100 */
                                        /* bytes were copied earlier.    */ 

Thanks,

Jinglin Wen
Michael Ellerman June 18, 2024, 12:12 p.m. UTC | #5
Segher Boessenkool <segher@kernel.crashing.org> writes:
> Hi!
>
> On Mon, Jun 17, 2024 at 10:35:09AM +0800, Jinglin Wen wrote:
>> +	cmplwi	cr0,r4,0	/* runtime base addr is zero */
>
> Just write
>    cmpwi r4,0
>
> cr0 is the default, also implicit in many other instructions, please
> don't clutter the source code.  All the extra stuff makes you miss the
> things that do matter!
>
> The "l" is unnecessary, you only care about equality here after all.

In my mind it's an unsigned comparison, so I'd use cmpld, even though as
you say all we actually care about is equality.

cheers
Segher Boessenkool June 18, 2024, 1:27 p.m. UTC | #6
On Tue, Jun 18, 2024 at 10:12:54PM +1000, Michael Ellerman wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Mon, Jun 17, 2024 at 10:35:09AM +0800, Jinglin Wen wrote:
> >> +	cmplwi	cr0,r4,0	/* runtime base addr is zero */
> >
> > Just write
> >    cmpwi r4,0
> >
> > cr0 is the default, also implicit in many other instructions, please
> > don't clutter the source code.  All the extra stuff makes you miss the
> > things that do matter!
> >
> > The "l" is unnecessary, you only care about equality here after all.
> 
> In my mind it's an unsigned comparison, so I'd use cmpld, even though as
> you say all we actually care about is equality.

We want to know if it is zero or not, so in my mind "unsigned comparison"
does not apply at all, that is only for range checks.  Heh.

But it doesn't matter at all: if you think cmpld looks more natural / is
what you expect to see, then you should use cmpld, that is my point :-)


Segher

Patch

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 4690c219bfa4..6c73551bdc50 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -647,7 +647,9 @@  __after_prom_start:
  * Note: This process overwrites the OF exception vectors.
  */
 	LOAD_REG_IMMEDIATE(r3, PAGE_OFFSET)
-	mr.	r4,r26			/* In some cases the loader may  */
+	tophys(r4,r26)
+	cmplwi	cr0,r4,0	/* runtime base addr is zero */
+	mr	r4,r26			/* In some cases the loader may */
 	beq	9f			/* have already put us at zero */
 	li	r6,0x100		/* Start offset, the first 0x100 */
 					/* bytes were copied earlier.	 */