[RFC,v4,12/16] powerpc/e500: Encode hugepage size in PTE bits

Message ID 10eae3c6815e3aba5f624af92321948e4684c95a.1716815901.git.christophe.leroy@csgroup.eu (mailing list archive)
State Superseded
Series Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)

Commit Message

Christophe Leroy May 27, 2024, 1:30 p.m. UTC
Use the U0-U3 bits to encode the hugepage size, or more precisely the
page shift.

As we start using hugepages at shift 21 (2 Mbytes), subtract 20
so that it fits into 4 bits. That may change in the future if
we want to use smaller hugepages.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/nohash/hugetlb-e500.h | 6 ++++++
 arch/powerpc/include/asm/nohash/pte-e500.h     | 3 +++
 2 files changed, 9 insertions(+)
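
The encoding described in the commit message can be sketched in plain C as
follows. This is an illustration only, not part of the patch: the bit values
for U0-U3 are assumptions chosen to be consistent with _PAGE_HSIZE_SHIFT
being 14, and hsize_encode(), hsize_decode() and HSIZE_BIAS are hypothetical
names. Unlike arch_make_huge_pte() in the patch, the sketch clears the field
before setting it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values: U3 is taken to be bit 14, matching _PAGE_HSIZE_SHIFT. */
#define PAGE_U3          0x004000u
#define PAGE_U2          0x008000u
#define PAGE_U1          0x010000u
#define PAGE_U0          0x020000u
#define PAGE_HSIZE_MSK   (PAGE_U0 | PAGE_U1 | PAGE_U2 | PAGE_U3)
#define PAGE_HSIZE_SHIFT 14
#define HSIZE_BIAS       20u	/* hypothetical name for the "subtract 20" offset */

/* Store (shift - 20) in the U0-U3 field by multiplying by the lowest field bit. */
static inline uint64_t hsize_encode(uint64_t pte, unsigned int shift)
{
	/* The biased value must fit in 4 bits, so shifts 20..35 are representable. */
	assert(shift >= HSIZE_BIAS && shift - HSIZE_BIAS <= 15);
	return (pte & ~(uint64_t)PAGE_HSIZE_MSK) |
	       ((uint64_t)PAGE_U3 * (shift - HSIZE_BIAS));
}

/* Recover the page shift from the field (hypothetical helper, not in the patch). */
static inline unsigned int hsize_decode(uint64_t pte)
{
	return (unsigned int)((pte & PAGE_HSIZE_MSK) >> PAGE_HSIZE_SHIFT) + HSIZE_BIAS;
}
```

With this scheme a PTE whose field is zero decodes to shift 20, so a caller
would still need some other way to tell a regular PTE from a huge one.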

Comments

Oscar Salvador May 29, 2024, 8:05 a.m. UTC | #1
On Mon, May 27, 2024 at 03:30:10PM +0200, Christophe Leroy wrote:
> Use the U0-U3 bits to encode the hugepage size, or more precisely the
> page shift.
> 
> As we start using hugepages at shift 21 (2 Mbytes), subtract 20
> so that it fits into 4 bits. That may change in the future if
> we want to use smaller hugepages.

What other shifts can we have here on e500? PUD_SHIFT?
Could you please spell them out here?
Or even better,

> 
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/include/asm/nohash/hugetlb-e500.h | 6 ++++++
>  arch/powerpc/include/asm/nohash/pte-e500.h     | 3 +++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
> index 8f04ad20e040..d8e51a3f8557 100644
> --- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h
> +++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
> @@ -42,4 +42,10 @@ static inline int check_and_get_huge_psize(int shift)
>  	return shift_to_mmu_psize(shift);
>  }
>  
> +static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
> +{
> +	return __pte(pte_val(entry) | (_PAGE_U3 * (shift - 20)));
> +}
> +#define arch_make_huge_pte arch_make_huge_pte
> +
>  #endif /* _ASM_POWERPC_NOHASH_HUGETLB_E500_H */
> diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h
> index 975facc7e38e..091e4bff1fba 100644
> --- a/arch/powerpc/include/asm/nohash/pte-e500.h
> +++ b/arch/powerpc/include/asm/nohash/pte-e500.h
> @@ -46,6 +46,9 @@
>  #define _PAGE_NO_CACHE	0x400000 /* I: cache inhibit */
>  #define _PAGE_WRITETHRU	0x800000 /* W: cache write-through */
> +#define _PAGE_HSIZE_MSK (_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3)
> +#define _PAGE_HSIZE_SHIFT	14

Add a comment above explaining which P*_SHIFT values we need to cover
with these 4 bits.
Christophe Leroy May 29, 2024, 9:49 a.m. UTC | #2
On 29/05/2024 at 10:05, Oscar Salvador wrote:
> 
> On Mon, May 27, 2024 at 03:30:10PM +0200, Christophe Leroy wrote:
>> Use the U0-U3 bits to encode the hugepage size, or more precisely the
>> page shift.
>>
>> As we start using hugepages at shift 21 (2 Mbytes), subtract 20
>> so that it fits into 4 bits. That may change in the future if
>> we want to use smaller hugepages.
> 
> What other shifts can we have here on e500? PUD_SHIFT?

Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
kernel it will be all PMD, while on a 64-bit kernel it is both PMD and PUD.

For the time being (as implemented with hugepd), Linux supports 4M, 16M,
64M, 256M and 1G (shifts 22, 24, 26, 28, 30).

The hardware supports the following page sizes and encodes them on 4
bits, although the encoding is not directly a shift. Maybe it would be
better to use that encoding after all:

0001 4 Kbytes (Shift 12)
0010 16 Kbytes (Shift 14)
0011 64 Kbytes (Shift 16)
0100 256 Kbytes (Shift 18)
0101 1 Mbyte (Shift 20)
0110 4 Mbytes (Shift 22)
0111 16 Mbytes (Shift 24)
1000 64 Mbytes (Shift 26)
1001 256 Mbytes (Shift 28)
1010 1 Gbyte (e500v2 only) (Shift 30)
1011 4 Gbytes (e500v2 only) (Shift 32)
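
Reading the table above, each code step multiplies the page size by 4, so
the 4-bit code and the shift appear to be related by shift = 2 * code + 10.
A minimal sketch of that conversion, assuming only the entries listed in the
table (the helper names are hypothetical and not taken from the kernel):

```c
#include <assert.h>

/*
 * Return the 4-bit hardware size code for a page shift, or -1 if the
 * table has no entry for it (odd shifts, or shifts outside 12..32).
 */
static inline int shift_to_hw_code(unsigned int shift)
{
	if (shift < 12 || shift > 32 || (shift & 1))
		return -1;
	return (int)(shift - 10) / 2;
}

/* Inverse mapping; only codes 0001..1011 appear in the table. */
static inline unsigned int hw_code_to_shift(unsigned int code)
{
	assert(code >= 1 && code <= 11);
	return 2 * code + 10;
}
```

Note that shift 21 (2 Mbytes) has no code in this table, which is one
difference from storing shift - 20 directly.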


> Could you please spell them out here?
> Or even better,
> 
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>>   arch/powerpc/include/asm/nohash/hugetlb-e500.h | 6 ++++++
>>   arch/powerpc/include/asm/nohash/pte-e500.h     | 3 +++
>>   2 files changed, 9 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
>> index 8f04ad20e040..d8e51a3f8557 100644
>> --- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h
>> +++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
>> @@ -42,4 +42,10 @@ static inline int check_and_get_huge_psize(int shift)
>>        return shift_to_mmu_psize(shift);
>>   }
>>
>> +static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
>> +{
>> +     return __pte(pte_val(entry) | (_PAGE_U3 * (shift - 20)));
>> +}
>> +#define arch_make_huge_pte arch_make_huge_pte
>> +
>>   #endif /* _ASM_POWERPC_NOHASH_HUGETLB_E500_H */
>> diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h
>> index 975facc7e38e..091e4bff1fba 100644
>> --- a/arch/powerpc/include/asm/nohash/pte-e500.h
>> +++ b/arch/powerpc/include/asm/nohash/pte-e500.h
>> @@ -46,6 +46,9 @@
>>   #define _PAGE_NO_CACHE       0x400000 /* I: cache inhibit */
>>   #define _PAGE_WRITETHRU      0x800000 /* W: cache write-through */
>> +#define _PAGE_HSIZE_MSK (_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3)
>> +#define _PAGE_HSIZE_SHIFT    14
> 
> Add a comment above explaining which P*_SHIFT values we need to cover
> with these 4 bits.
> 
> 
> 
> --
> Oscar Salvador
> SUSE Labs
Oscar Salvador May 29, 2024, 10:09 a.m. UTC | #3
On Wed, May 29, 2024 at 09:49:48AM +0000, Christophe Leroy wrote:
> Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
> kernel it will be all PMD, while on a 64-bit kernel it is both PMD and PUD.
> 
> For the time being (as implemented with hugepd), Linux supports 4M, 16M,
> 64M, 256M and 1G (shifts 22, 24, 26, 28, 30).
> 
> The hardware supports the following page sizes and encodes them on 4
> bits, although the encoding is not directly a shift. Maybe it would be
> better to use that encoding after all:

I think so.

> 
> 0001 4 Kbytes (Shift 12)
> 0010 16 Kbytes (Shift 14)
> 0011 64 Kbytes (Shift 16)
> 0100 256 Kbytes (Shift 18)
> 0101 1 Mbyte (Shift 20)
> 0110 4 Mbytes (Shift 22)
> 0111 16 Mbytes (Shift 24)
> 1000 64 Mbytes (Shift 26)
> 1001 256 Mbytes (Shift 28)
> 1010 1 Gbyte (e500v2 only) (Shift 30)
> 1011 4 Gbytes (e500v2 only) (Shift 32)

You say hugepages start at 2MB (shift 21), but you also say that the smallest
hugepage Linux supports is 4MB (shift 22)?
Christophe Leroy May 29, 2024, 10:14 a.m. UTC | #4
On 29/05/2024 at 12:09, Oscar Salvador wrote:
> On Wed, May 29, 2024 at 09:49:48AM +0000, Christophe Leroy wrote:
>> Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
>> kernel it will be all PMD, while on a 64-bit kernel it is both PMD and PUD.
>>
>> For the time being (as implemented with hugepd), Linux supports 4M, 16M,
>> 64M, 256M and 1G (shifts 22, 24, 26, 28, 30).
>>
>> The hardware supports the following page sizes and encodes them on 4
>> bits, although the encoding is not directly a shift. Maybe it would be
>> better to use that encoding after all:
> 
> I think so.
> 
>>
>> 0001 4 Kbytes (Shift 12)
>> 0010 16 Kbytes (Shift 14)
>> 0011 64 Kbytes (Shift 16)
>> 0100 256 Kbytes (Shift 18)
>> 0101 1 Mbyte (Shift 20)
>> 0110 4 Mbytes (Shift 22)
>> 0111 16 Mbytes (Shift 24)
>> 1000 64 Mbytes (Shift 26)
>> 1001 256 Mbytes (Shift 28)
>> 1010 1 Gbyte (e500v2 only) (Shift 30)
>> 1011 4 Gbytes (e500v2 only) (Shift 32)
> 
> You say hugepages start at 2MB (shift 21), but you also say that the smallest
> hugepage Linux supports is 4MB (shift 22)?
> 
> 

No, I am saying that PMD_SIZE is 2MB on e500 with 64-bit PTEs, and that for
the time being the Linux powerpc implementation for e500 supports sizes 4M,
16M, 64M, 256M and 1G.

But, for instance, on 8xx we have 16k and 512k hugepages. Here on the e500
we could, in a follow-up patch, add support for smaller page sizes, for
instance 16k, 64k, 256k and 1M. Of course, all of these would then be
cont-PTE and not cont-PMD.
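
To make the cont-PTE versus cont-PMD distinction concrete, here is a rough
sketch of the arithmetic. It assumes 4 Kbyte base pages (shift 12) and the
2MB PMD_SIZE (shift 21) mentioned above, and it ignores the PUD level that
exists on 64-bit kernels. The macro and function names are hypothetical.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 Kbyte base pages */
#define SKETCH_PMD_SHIFT  21u	/* PMD_SIZE of 2MB, per the message above */

/* Nonzero if a hugepage of this shift is smaller than PMD_SIZE (cont-PTE). */
static inline int is_cont_pte(unsigned int shift)
{
	assert(shift > SKETCH_PAGE_SHIFT);
	return shift < SKETCH_PMD_SHIFT;
}

/* Number of contiguous entries one hugepage spans at its page-table level. */
static inline unsigned long nr_cont_entries(unsigned int shift)
{
	unsigned int level_shift = is_cont_pte(shift) ? SKETCH_PAGE_SHIFT
						      : SKETCH_PMD_SHIFT;

	return 1UL << (shift - level_shift);
}
```

Under these assumptions a 16k page spans 4 PTEs and a 1M page spans 256
PTEs, while a 4M page spans 2 PMD entries.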
Oscar Salvador May 29, 2024, 10:15 a.m. UTC | #5
On Wed, May 29, 2024 at 10:14:15AM +0000, Christophe Leroy wrote:
> 
> 
> On 29/05/2024 at 12:09, Oscar Salvador wrote:
> > On Wed, May 29, 2024 at 09:49:48AM +0000, Christophe Leroy wrote:
> >> Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
> >> kernel it will be all PMD, while on a 64-bit kernel it is both PMD and PUD.
> >>
> >> For the time being (as implemented with hugepd), Linux supports 4M, 16M,
> >> 64M, 256M and 1G (shifts 22, 24, 26, 28, 30).
> >>
> >> The hardware supports the following page sizes and encodes them on 4
> >> bits, although the encoding is not directly a shift. Maybe it would be
> >> better to use that encoding after all:
> > 
> > I think so.
> > 
> >>
> >> 0001 4 Kbytes (Shift 12)
> >> 0010 16 Kbytes (Shift 14)
> >> 0011 64 Kbytes (Shift 16)
> >> 0100 256 Kbytes (Shift 18)
> >> 0101 1 Mbyte (Shift 20)
> >> 0110 4 Mbytes (Shift 22)
> >> 0111 16 Mbytes (Shift 24)
> >> 1000 64 Mbytes (Shift 26)
> >> 1001 256 Mbytes (Shift 28)
> >> 1010 1 Gbyte (e500v2 only) (Shift 30)
> >> 1011 4 Gbytes (e500v2 only) (Shift 32)
> > 
> > You say hugepages start at 2MB (shift 21), but you also say that the smallest
> > hugepage Linux supports is 4MB (shift 22)?
> > 
> > 
> 
> No, I am saying that PMD_SIZE is 2MB on e500 with 64-bit PTEs, and that for
> the time being the Linux powerpc implementation for e500 supports sizes 4M,
> 16M, 64M, 256M and 1G.

Got it. I got confused.
Patch

diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
index 8f04ad20e040..d8e51a3f8557 100644
--- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h
+++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
@@ -42,4 +42,10 @@  static inline int check_and_get_huge_psize(int shift)
 	return shift_to_mmu_psize(shift);
 }
 
+static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
+{
+	return __pte(pte_val(entry) | (_PAGE_U3 * (shift - 20)));
+}
+#define arch_make_huge_pte arch_make_huge_pte
+
 #endif /* _ASM_POWERPC_NOHASH_HUGETLB_E500_H */
diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h
index 975facc7e38e..091e4bff1fba 100644
--- a/arch/powerpc/include/asm/nohash/pte-e500.h
+++ b/arch/powerpc/include/asm/nohash/pte-e500.h
@@ -46,6 +46,9 @@ 
 #define _PAGE_NO_CACHE	0x400000 /* I: cache inhibit */
 #define _PAGE_WRITETHRU	0x800000 /* W: cache write-through */
 
+#define _PAGE_HSIZE_MSK (_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3)
+#define _PAGE_HSIZE_SHIFT	14
+
 /* "Higher level" linux bit combinations */
 #define _PAGE_EXEC		(_PAGE_BAP_SX | _PAGE_BAP_UX) /* .. and was cache cleaned */
 #define _PAGE_READ		(_PAGE_BAP_SR | _PAGE_BAP_UR) /* User read permission */