[v4,1/2] Revert "powerpc: Set max_mapnr correctly"

Message ID 20220216121109.157605-1-wangkefeng.wang@huawei.com (mailing list archive)
State Superseded
Series [v4,1/2] Revert "powerpc: Set max_mapnr correctly"

Commit Message

Kefeng Wang Feb. 16, 2022, 12:11 p.m. UTC
This reverts commit 602946ec2f90d5bd965857753880db29d2d9a1e9.

If CONFIG_HIGHMEM is enabled, highmem disappears when max_mapnr is
set to max_low_pfn; see mem_init():

  for (pfn = highmem_mapnr; pfn < max_mapnr; ++pfn) {
        ...
        free_highmem_page();
  }

Revert it; the virt_addr_valid() check will be fixed in the next patch.
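
For reference, here is a condensed sketch of the relevant mem_init()
block (paraphrased from arch/powerpc/mm/mem.c; exact details vary by
kernel version):

#ifdef CONFIG_HIGHMEM
	{
		unsigned long pfn, highmem_mapnr;

		/* first PFN above the linear (lowmem) mapping */
		highmem_mapnr = lowmem_end_addr >> PAGE_SHIFT;
		for (pfn = highmem_mapnr; pfn < max_mapnr; ++pfn) {
			phys_addr_t paddr = (phys_addr_t)pfn << PAGE_SHIFT;

			if (memblock_is_memory(paddr) &&
			    !memblock_is_reserved(paddr))
				free_highmem_page(pfn_to_page(pfn));
		}
	}
#endif

With max_mapnr set to max_low_pfn, max_mapnr equals highmem_mapnr, so
the loop body never executes and the highmem pages are never released
to the page allocator.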

Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Fixes: 602946ec2f90 ("powerpc: Set max_mapnr correctly")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
v4:
- new patch
 arch/powerpc/mm/mem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Christophe Leroy March 9, 2022, 4 p.m. UTC | #1
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Kefeng Wang March 26, 2022, 7:55 a.m. UTC | #2
Hi maintainers,

I see the patches have been reviewed[1]; could they be merged?

Many thanks.

[1] https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=286464

On 2022/2/16 20:11, Kefeng Wang wrote:
> This reverts commit 602946ec2f90d5bd965857753880db29d2d9a1e9.
>
> If CONFIG_HIGHMEM is enabled, highmem disappears when max_mapnr is
> set to max_low_pfn; see mem_init():
>
>    for (pfn = highmem_mapnr; pfn < max_mapnr; ++pfn) {
>          ...
>          free_highmem_page();
>    }
>
> Revert it; the virt_addr_valid() check will be fixed in the next patch.
>
> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
> Fixes: 602946ec2f90 ("powerpc: Set max_mapnr correctly")
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> v4:
> - new patch
>   arch/powerpc/mm/mem.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 8e301cd8925b..4d221d033804 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -255,7 +255,7 @@ void __init mem_init(void)
>   #endif
>   
>   	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
> -	set_max_mapnr(max_low_pfn);
> +	set_max_mapnr(max_pfn);
>   
>   	kasan_late_init();
>
Michael Ellerman March 28, 2022, 10:37 a.m. UTC | #3
Kefeng Wang <wangkefeng.wang@huawei.com> writes:
> Hi maintainers,
>
> I see the patches have been reviewed[1]; could they be merged?

Maybe I'm just misreading the change log, but it seems wrong that we
need to add extra checks. pfn_valid() shouldn't return true for vmalloc
addresses in the first place; shouldn't we fix that instead? Who knows
what else might be broken because of that.

cheers
Christophe Leroy March 28, 2022, 10:59 a.m. UTC | #4
On 28/03/2022 at 12:37, Michael Ellerman wrote:
> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
>> Hi maintainers,
>>
>> I see the patches have been reviewed[1]; could they be merged?
> 
> Maybe I'm just misreading the change log, but it seems wrong that we
> need to add extra checks. pfn_valid() shouldn't return true for vmalloc
> addresses in the first place; shouldn't we fix that instead? Who knows
> what else might be broken because of that.
> 

pfn_valid() doesn't take an address but a PFN.

If you have 1Gbyte of memory you have 256k PFNs.

In a generic config the kernel will map 768 Mbytes of memory (from
0xc0000000 to 0xe0000000) and will use 0xf0000000-0xffffffff for
everything else, including vmalloc.

If you take a page above that 768 Mbytes limit and try to linearly
convert its PFN to a VA, you'll hit vmalloc space. Yet that PFN is
valid.

So the check really needs to be done in virt_addr_valid().
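
For illustration, a minimal sketch of the shape such a check could take
on 32-bit (my sketch, not necessarily the exact code of patch 2):

#define virt_addr_valid(vaddr)	({					\
	unsigned long _addr = (unsigned long)(vaddr);			\
	_addr >= PAGE_OFFSET && _addr < (unsigned long)high_memory &&	\
	pfn_valid(virt_to_pfn(_addr));					\
})

The range test rejects vmalloc addresses up front, so a valid PFN above
the linear mapping can no longer make a vmalloc address look valid.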

There is another thing, however, that would be worth fixing (in another
patch): the virt_to_pfn() on PPC64:

#define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)

#define __pa(x)								\
({									\
	VIRTUAL_BUG_ON((unsigned long)(x) < PAGE_OFFSET);		\
	(unsigned long)(x) & 0x0fffffffffffffffUL;			\
})


So 0xc000000000000000 and 0xd000000000000000 have the same PFN. That's 
wrong.
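
A quick worked example of the collision (hypothetical addresses,
assuming 4k pages):

	0xc000000000001000 & 0x0fffffffffffffffUL == 0x1000  ->  PFN 1
	0xd000000000001000 & 0x0fffffffffffffffUL == 0x1000  ->  PFN 1

Two distinct virtual addresses collapse onto the same PFN.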

Christophe
Christophe Leroy March 28, 2022, 2:12 p.m. UTC | #5
Hi,

On 26/03/2022 at 08:55, Kefeng Wang wrote:
> Hi maintainers,
> 
> I see the patches have been reviewed[1]; could they be merged?

Thinking about it once more, I think the patches should go in reverse 
order. Patch 2 should go first and patch 1 should go after.

Otherwise, once patch 1 is applied and patch 2 is not applied yet, 
virt_addr_valid() doesn't work anymore.

Christophe

> 
> Many thanks.
> 
> [1] https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=286464
> 
> On 2022/2/16 20:11, Kefeng Wang wrote:
>> This reverts commit 602946ec2f90d5bd965857753880db29d2d9a1e9.
>>
>> If CONFIG_HIGHMEM is enabled, highmem disappears when max_mapnr is
>> set to max_low_pfn; see mem_init():
>>
>>    for (pfn = highmem_mapnr; pfn < max_mapnr; ++pfn) {
>>          ...
>>          free_highmem_page();
>>    }
>>
>> Revert it; the virt_addr_valid() check will be fixed in the next patch.
>>
>> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Fixes: 602946ec2f90 ("powerpc: Set max_mapnr correctly")
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> v4:
>> - new patch
>>   arch/powerpc/mm/mem.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>> index 8e301cd8925b..4d221d033804 100644
>> --- a/arch/powerpc/mm/mem.c
>> +++ b/arch/powerpc/mm/mem.c
>> @@ -255,7 +255,7 @@ void __init mem_init(void)
>>   #endif
>>       high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>> -    set_max_mapnr(max_low_pfn);
>> +    set_max_mapnr(max_pfn);
>>       kasan_late_init();
Kefeng Wang March 29, 2022, 11:32 a.m. UTC | #6
On 2022/3/28 22:12, Christophe Leroy wrote:
> Hi,
>
> On 26/03/2022 at 08:55, Kefeng Wang wrote:
>> Hi maintainers,
>>
>> I see the patches have been reviewed[1]; could they be merged?
> Thinking about it once more, I think the patches should go in reverse
> order. Patch 2 should go first and patch 1 should go after.
>
> Otherwise, once patch 1 is applied and patch 2 is not applied yet,
> virt_addr_valid() doesn't work anymore.
Should I resend them, or could the maintainer reverse the order when merging
them?
>
> Christophe
>
>> Many thanks.
>>
>> [1] https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=286464
>>
Michael Ellerman April 1, 2022, 11:23 a.m. UTC | #7
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 28/03/2022 at 12:37, Michael Ellerman wrote:
>> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
>>> Hi maintainers,
>>>
>>> I see the patches have been reviewed[1]; could they be merged?
>>
>> Maybe I'm just misreading the change log, but it seems wrong that we
>> need to add extra checks. pfn_valid() shouldn't return true for vmalloc
>> addresses in the first place; shouldn't we fix that instead? Who knows
>> what else might be broken because of that.
>
> pfn_valid() doesn't take an address but a PFN.

Yeah, sorry, that was unclear wording on my part.

What I mean is that pfn_valid(virt_to_pfn(some_vmalloc_addr)) should be
false, because virt_to_pfn(vmalloc_addr) should fail.

The right way to convert a vmalloc address to a pfn is with
vmalloc_to_pfn(), which walks the page tables to find the actual pfn
backing that vmalloc addr.
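
Roughly, as a sketch (kernel context assumed):

	void *p = vmalloc(PAGE_SIZE);
	unsigned long pfn = vmalloc_to_pfn(p);	/* walks the page tables */

whereas virt_to_pfn(p) would just do linear arithmetic on a non-linear
address and return a PFN unrelated to the page actually backing p.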

> If you have 1Gbyte of memory you have 256k PFNs.
>
> In a generic config the kernel will map 768 Mbytes of memory (from
> 0xc0000000 to 0xe0000000) and will use 0xf0000000-0xffffffff for
> everything else, including vmalloc.
>
> If you take a page above that 768 Mbytes limit and try to linearly
> convert its PFN to a VA, you'll hit vmalloc space. Yet that PFN is
> valid.

That's true, but it's just some random page in vmalloc space; there's no
guarantee that it's the same page as the PFN you started with.

Note it's not true on 64-bit Book3S. There if you take a valid PFN (ie.
backed by RAM) and convert it to a virtual address (with __va()), you
will never get a vmalloc address.

> So the check really needs to be done in virt_addr_valid().

I don't think it has to, but with the way our virt_to_pfn()/__pa() works
I guess for now it's the easiest solution.

> There is another thing, however, that would be worth fixing (in another
> patch): the virt_to_pfn() on PPC64:
>
> #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
>
> #define __pa(x)								\
> ({									\
> 	VIRTUAL_BUG_ON((unsigned long)(x) < PAGE_OFFSET);		\
> 	(unsigned long)(x) & 0x0fffffffffffffffUL;			\
> })
>
>
> So 0xc000000000000000 and 0xd000000000000000 have the same PFN. That's
> wrong.

Yes it was wrong. But we don't use 0xd000000000000000 anymore, so it
shouldn't be an issue in practice.

See 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range").

I guess it is still a problem for 64-bit Book3E, because they use 0xc
and 0x8.

I looked at fixing __pa()/__va() to use +/- a few years back, but from
memory it still didn't work and/or generated bad code. There's a comment
about it working around a GCC miscompile.

The other thing that makes me reluctant to change it is that I worry we
have places that inadvertently use __pa() on addresses that are already
physical addresses. If we changed __pa() to use subtraction that would
break, ie. we'd end up with a negative address.

See eg. a6e2c226c3d5 ("powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL")

So to fix it we'd have to 1) verify that the compiler bug is no longer
an issue and 2) audit uses of __pa()/__va().
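
To make 2) concrete with hypothetical values (and DEBUG_VIRTUAL off, so
the VIRTUAL_BUG_ON compiles away): with the mask-based __pa(), passing
an address that is already physical is silently forgiven, because the
top nibble is already zero:

	__pa(0x2000)  ->  0x2000 & 0x0fffffffffffffffUL  ->  0x2000

With a subtraction-based __pa(), the same call would compute
0x2000 - PAGE_OFFSET and wrap around to a huge bogus "physical"
address. Every such caller would have to be found and fixed first.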

cheers
Christophe Leroy April 1, 2022, 12:07 p.m. UTC | #8
On 01/04/2022 at 13:23, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> On 28/03/2022 at 12:37, Michael Ellerman wrote:
>>> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
>>>> Hi maintainers,
>>>>
>>>> I see the patches have been reviewed[1]; could they be merged?
>>>
>>> Maybe I'm just misreading the change log, but it seems wrong that we
>>> need to add extra checks. pfn_valid() shouldn't return true for vmalloc
>>> addresses in the first place; shouldn't we fix that instead? Who knows
>>> what else might be broken because of that.
>>
>> pfn_valid() doesn't take an address but a PFN.
> 
> Yeah, sorry, that was unclear wording on my part.
> 
> What I mean is that pfn_valid(virt_to_pfn(some_vmalloc_addr)) should be
> false, because virt_to_pfn(vmalloc_addr) should fail.

Yes, that's probably how it should be, but none of the main architectures
do that.

The best we find is some architecture that does WARN_ON(some vmalloc addr)
in virt_to_pfn(). That wouldn't help in our case, as it would then
WARN_ON().


> 
> The right way to convert a vmalloc address to a pfn is with
> vmalloc_to_pfn(), which walks the page tables to find the actual pfn
> backing that vmalloc addr.
> 
>> If you have 1Gbyte of memory you have 256k PFNs.
>>
>> In a generic config the kernel will map 768 Mbytes of memory (from
>> 0xc0000000 to 0xe0000000) and will use 0xf0000000-0xffffffff for
>> everything else, including vmalloc.
>>
>> If you take a page above that 768 Mbytes limit and try to linearly
>> convert its PFN to a VA, you'll hit vmalloc space. Yet that PFN is
>> valid.
> 
> That's true, but it's just some random page in vmalloc space; there's no
> guarantee that it's the same page as the PFN you started with.

Yes, sure; what I meant, however, is that pfn_valid(some_valid_pfn) should
return true, even if virt_to_pfn(some_vmalloc_address) provides a valid PFN.

> 
> Note it's not true on 64-bit Book3S. There if you take a valid PFN (ie.
> backed by RAM) and convert it to a virtual address (with __va()), you
> will never get a vmalloc address.
> 
>> So the check really needs to be done in virt_addr_valid().
> 
> I don't think it has to, but with the way our virt_to_pfn()/__pa() works
> I guess for now it's the easiest solution.
> 

At least other architectures do it that way. See for instance how ARM 
does it:

#define virt_addr_valid(kaddr)	(((unsigned long)(kaddr) >= PAGE_OFFSET &&		\
				  (unsigned long)(kaddr) < (unsigned long)high_memory)	\
				 && pfn_valid(virt_to_pfn(kaddr)))



high_memory being the top of the linear RAM mapping.

Christophe
Michael Ellerman April 4, 2022, 12:31 p.m. UTC | #9
Kefeng Wang <wangkefeng.wang@huawei.com> writes:
> On 2022/3/28 22:12, Christophe Leroy wrote:
>> Hi,
>>
>> On 26/03/2022 at 08:55, Kefeng Wang wrote:
>>> Hi maintainers,
>>>
>>> I see the patches have been reviewed[1]; could they be merged?
>> Thinking about it once more, I think the patches should go in reverse
>> order. Patch 2 should go first and patch 1 should go after.
>>
>> Otherwise, once patch 1 is applied and patch 2 is not applied yet,
>> virt_addr_valid() doesn't work anymore.
>
> Should I resend them, or could the maintainer reverse the order when merging
> them?

I'll reverse them. I've found some breakage in other code while testing
this, so I'll fix that up first before merging these.

In patch 2 you didn't say what hardware you hit this on; what CPU does
your system have?

cheers
Kefeng Wang April 6, 2022, 2:21 a.m. UTC | #10
On 2022/4/4 20:31, Michael Ellerman wrote:
> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
>> On 2022/3/28 22:12, Christophe Leroy wrote:
>>> Hi,
>>>
>>> On 26/03/2022 at 08:55, Kefeng Wang wrote:
>>>> Hi maintainers,
>>>>
>>>> I see the patches have been reviewed[1]; could they be merged?
>>> Thinking about it once more, I think the patches should go in reverse
>>> order. Patch 2 should go first and patch 1 should go after.
>>>
>>> Otherwise, once patch 1 is applied and patch 2 is not applied yet,
>>> virt_addr_valid() doesn't work anymore.
>> Should I resend them, or could the maintainer reverse the order when merging
>> them?
> I'll reverse them. I've found some breakage in other code while testing
> this, so I'll fix that up first before merging these.
Thanks.
>
> In patch 2 you didn't say what hardware you hit this on; what CPU does
> your system have?

The CPU is an e5500, from an fsl,P5040DS board.

>
> cheers

Patch

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8e301cd8925b..4d221d033804 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -255,7 +255,7 @@ void __init mem_init(void)
 #endif
 
 	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
-	set_max_mapnr(max_low_pfn);
+	set_max_mapnr(max_pfn);
 
 	kasan_late_init();