Message ID | 20210316095841.5837-1-anton.ivanov@cambridgegreys.com
State      | Rejected
Series     | um: implement arch_sync_kernel_mappings
On Tue, 2021-03-16 at 09:58 +0000, anton.ivanov@cambridgegreys.com wrote:
> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>
> This improves the tlb flushing behavior by syncing modified
> ranges out of map_kernel_range_noflush in a way which is
> similar to x86.

So I take it this replaces my patch?

Want me to test it (without the kvmalloc)?

johannes
On 16/03/2021 09:59, Johannes Berg wrote:
> On Tue, 2021-03-16 at 09:58 +0000, anton.ivanov@cambridgegreys.com
> wrote:
>> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>>
>> This improves the tlb flushing behavior by syncing modified
>> ranges out of map_kernel_range_noflush in a way which is
>> similar to x86.
>
> So I take it this replaces my patch?

I get slightly better performance on userspace, etc compared to your
patch. It is marginal - sub 1%, close to the experimental error.

I am testing different things though :)

> Want me to test it (without the kvmalloc)?

Yes :)

> johannes
On Tue, 2021-03-16 at 10:02 +0000, Anton Ivanov wrote:
>> Want me to test it (without the kvmalloc)?
>
> Yes :)

This patch doesn't seem to improve vmalloc performance at all.

johannes
On 16/03/2021 10:19, Johannes Berg wrote:
> On Tue, 2021-03-16 at 10:02 +0000, Anton Ivanov wrote:
>>> Want me to test it (without the kvmalloc)?
>>
>> Yes :)
>
> This patch doesn't seem to improve vmalloc performance at all.

Bugger...

I was looking at it from this perspective.

flush_cache_vmap is invoked here, right after a
map_kernel_range_noflush():

https://elixir.bootlin.com/linux/latest/source/mm/vmalloc.c#L320

If I understand the invocation correctly, your patch results in a full
flush of the mapped range each time it is mapped.

map_kernel_range_noflush() actually has a built-in facility for a
partial flush - it is exactly arch_sync_kernel_mappings, and it is
invoked here:

https://elixir.bootlin.com/linux/latest/source/mm/vmalloc.c#L314

This makes the flush dependent on the level of modification. To be
functionally equivalent to your patch it would need an OR of all page
table levels, which looked way too brutal :(

That's why I tried it using only PMD. That looks like not enough :(

Can you try tweaking the mask? The available masks are defined here:

https://elixir.bootlin.com/linux/latest/source/include/linux/pgtable.h#L1474

> johannes
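For readers following along, the two call sites linked above fit together
roughly as sketched below. This is a condensed paraphrase of mm/vmalloc.c
from that kernel generation, not the verbatim upstream code; the actual
page-table walk and error handling are elided.

/* Paraphrased from mm/vmalloc.c (circa v5.11/v5.12).
 * map_kernel_range_noflush() accumulates a pgtbl_mod_mask while it walks
 * and populates the kernel page tables, and only calls the arch hook if
 * the architecture opted in via ARCH_PAGE_TABLE_SYNC_MASK. */
int map_kernel_range_noflush(unsigned long addr, unsigned long size,
			     pgprot_t prot, struct page **pages)
{
	unsigned long start = addr, end = addr + size;
	pgtbl_mod_mask mask = 0;

	/* ... walk pgd/p4d/pud/pmd/pte, OR-ing PGTBL_*_MODIFIED bits into
	 * @mask for every level that had to be modified ... */

	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
		arch_sync_kernel_mappings(start, end);
	return 0;
}

/* map_kernel_range() is the wrapper that additionally performs the
 * unconditional flush_cache_vmap() over the whole mapped range - the
 * hook the competing patch in this thread appears to rely on. */
int map_kernel_range(unsigned long start, unsigned long size,
		     pgprot_t prot, struct page **pages)
{
	int ret = map_kernel_range_noflush(start, size, prot, pages);

	flush_cache_vmap(start, start + size);
	return ret;
}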
On Tue, 2021-03-16 at 10:30 +0000, Anton Ivanov wrote:

On 16/03/2021 10:19, Johannes Berg wrote:
> On Tue, 2021-03-16 at 10:02 +0000, Anton Ivanov wrote:
> > > Want me to test it (without the kvmalloc)?
> >
> > Yes :)
>
> This patch doesn't seem to improve vmalloc performance at all.

Bugger...

I was looking at it from this perspective.

flush_cache_vmap is invoked here, right after a
map_kernel_range_noflush():

https://elixir.bootlin.com/linux/latest/source/mm/vmalloc.c#L320

If I understand the invocation correctly, your patch results in a full
flush of the mapped range each time it is mapped.

Yes, that's how I understood/planned it.

map_kernel_range_noflush() actually has a built-in facility for a
partial flush - it is exactly arch_sync_kernel_mappings, and it is
invoked here:

https://elixir.bootlin.com/linux/latest/source/mm/vmalloc.c#L314

Right, I saw this when you posted your patch and I looked at how it
would work.

This makes the flush dependent on the level of modification. To be
functionally equivalent to your patch it would need an OR of all page
table levels, which looked way too brutal :(

That's why I tried it using only PMD. That looks like not enough :(

Can you try tweaking the mask? The available masks are defined here:

https://elixir.bootlin.com/linux/latest/source/include/linux/pgtable.h#L1474

Ahhh.

Well, I think you'd have to go for all of them. My workload is
vmalloc'ing single pages only (for really tiny objects like 24 bytes,
so it's stupid), so really you'd only get a single PTE changed, right?

And then once you do that, it becomes equivalent to my patch (though
actually done _more_ often, since there might be cases where _noflush()
is called and not flush_cache_vmap()).

johannes
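Taking that point to its conclusion, covering every level would mean
widening the mask in arch/um/include/asm/page.h along the lines below.
This is a hypothetical variant for illustration, not something posted in
this thread; the PGTBL_*_MODIFIED constants are the ones defined in
include/linux/pgtable.h.

/* Hypothetical: request arch_sync_kernel_mappings() for a change at any
 * page-table level, so that single-PTE vmalloc mappings (such as the
 * tiny allocations described above) are synced as well. */
#define ARCH_PAGE_TABLE_SYNC_MASK \
	(PGTBL_PTE_MODIFIED | PGTBL_PMD_MODIFIED | PGTBL_PUD_MODIFIED | \
	 PGTBL_P4D_MODIFIED | PGTBL_PGD_MODIFIED)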
On Tue, 2021-03-16 at 11:34 +0100, Johannes Berg wrote:
> Bugger...
>
> I was looking at it from this perspective.

Sorry about the lack of quoting here. Looked sort of OK locally, but
clearly got lost along the way.

johannes
On Tue, 2021-03-16 at 10:02 +0000, Anton Ivanov wrote:
> On 16/03/2021 09:59, Johannes Berg wrote:
> > On Tue, 2021-03-16 at 09:58 +0000, anton.ivanov@cambridgegreys.com
> > wrote:
> > > From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
> > >
> > > This improves the tlb flushing behavior by syncing modified
> > > ranges out of map_kernel_range_noflush in a way which is
> > > similar to x86.
> > So I take it this replaces my patch?
>
> I get slightly better performance on userspace,
> etc compared to your patch. It is marginal -
> sub 1%, close to the experimental error.

Why does that even affect userspace at all, btw? We're talking about
kernel mappings?

Do you see this speed up userspace relative to my patch, or relative to
unpatched?

johannes
On 16/03/2021 10:34, Johannes Berg wrote:
> On Tue, 2021-03-16 at 10:30 +0000, Anton Ivanov wrote:
>
> On 16/03/2021 10:19, Johannes Berg wrote:
>> On Tue, 2021-03-16 at 10:02 +0000, Anton Ivanov wrote:
>>>> Want me to test it (without the kvmalloc)?
>>> Yes :)
>> This patch doesn't seem to improve vmalloc performance at all.
> Bugger...
>
> I was looking at it from this perspective.
>
> flush_cache_vmap is invoked here, right after a
> map_kernel_range_noflush():
>
> https://elixir.bootlin.com/linux/latest/source/mm/vmalloc.c#L320
>
> If I understand the invocation correctly, your patch results in a full
> flush of the mapped range each time it is mapped.
>
> Yes, that's how I understood/planned it.
>
> map_kernel_range_noflush() actually has a built-in facility for a
> partial flush - it is exactly arch_sync_kernel_mappings, and it is
> invoked here:
> https://elixir.bootlin.com/linux/latest/source/mm/vmalloc.c#L314
>
> Right, I saw this when you posted your patch and I looked at how it
> would work.
>
> This makes the flush dependent on the level of modification. To be
> functionally equivalent to your patch it would need an OR of all page
> table levels, which looked way too brutal :(
>
> That's why I tried it using only PMD. That looks like not enough :(
>
> Can you try tweaking the mask? The available masks are defined here:
> https://elixir.bootlin.com/linux/latest/source/include/linux/pgtable.h#L1474
>
> Ahhh.
>
> Well, I think you'd have to go for all of them. My workload is
> vmalloc'ing single pages only (for really tiny objects like 24 bytes,
> so it's stupid), so really you'd only get a single PTE changed, right?
>
> And then once you do that, it becomes equivalent to my patch (though
> actually done _more_ often, since there might be cases where _noflush()
> is called and not flush_cache_vmap()).

Ugh... OK.

Let's stick with your approach. I will +1 it in a minute.

Brgds,

> johannes
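The competing patch referred to as "your approach" is not shown on this
page. For UML, hooking the flush_cache_vmap()/flush_cache_vunmap() path
would presumably amount to something like the sketch below; this is an
assumption based on the discussion, not the actual submitted patch, and
the file name is illustrative.

/* arch/um/include/asm/cacheflush.h - sketch only.
 * Map the vmalloc cache-flush hooks onto a kernel TLB range flush so
 * that freshly mapped/unmapped kernel ranges are propagated to the
 * host-side mappings UML maintains. */
#ifndef __UM_ASM_CACHEFLUSH_H
#define __UM_ASM_CACHEFLUSH_H

#include <asm/tlbflush.h>

#define flush_cache_vmap	flush_tlb_kernel_range
#define flush_cache_vunmap	flush_tlb_kernel_range

#include <asm-generic/cacheflush.h>

#endif /* __UM_ASM_CACHEFLUSH_H */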
On 16/03/2021 10:44, Johannes Berg wrote:
> On Tue, 2021-03-16 at 10:02 +0000, Anton Ivanov wrote:
>> On 16/03/2021 09:59, Johannes Berg wrote:
>>> On Tue, 2021-03-16 at 09:58 +0000, anton.ivanov@cambridgegreys.com
>>> wrote:
>>>> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>>>>
>>>> This improves the tlb flushing behavior by syncing modified
>>>> ranges out of map_kernel_range_noflush in a way which is
>>>> similar to x86.
>>> So I take it this replaces my patch?
>>
>> I get slightly better performance on userspace,
>> etc compared to your patch. It is marginal -
>> sub 1%, close to the experimental error.
>
> Why does that even affect userspace at all, btw? We're talking about
> kernel mappings?

My userspace tests do heavy IO.

> Do you see this speed up userspace relative to my patch, or relative to
> unpatched?
>
> johannes
On Tue, 2021-03-16 at 10:49 +0000, Anton Ivanov wrote:
> > > I get slightly better performance on userspace,
> > > etc compared to your patch. It is marginal -
> > > sub 1%, close to the experimental error.
> >
> > Why does that even affect userspace at all, btw? We're talking about
> > kernel mappings?
>
> My userspace tests do heavy IO.

OK, but even then, my patch should be an improvement on anything that
needs updated ranges since it avoids the segfault(s), and not really
make anything worse that *doesn't*?

But then again, I'm not sure you said that this patch was an improvement
or not for your workload?

johannes
On 16/03/2021 11:04, Johannes Berg wrote:
> On Tue, 2021-03-16 at 10:49 +0000, Anton Ivanov wrote:
>>>> I get slightly better performance on userspace,
>>>> etc compared to your patch. It is marginal -
>>>> sub 1%, close to the experimental error.
>>>
>>> Why does that even affect userspace at all, btw? We're talking about
>>> kernel mappings?
>>
>> My userspace tests do heavy IO.
>
> OK, but even then, my patch should be an improvement on anything that
> needs updated ranges since it avoids the segfault(s), and not really
> make anything worse that *doesn't*?
>
> But then again, I'm not sure you said that this patch was an improvement
> or not for your workload?

Your patch showed no difference on heavy IO userspace - roughly the same
numbers I usually get.

Syncing pages on a PMD-level change showed some marginal improvement on
a heavy IO test case, but very close to the margin of error - in the
realm of 0.2-0.3%. I would really need to do 20+ runs to get reliable
numbers to confirm that, as it is too close to the margin of error.

> johannes
diff --git a/arch/um/include/asm/page.h b/arch/um/include/asm/page.h
index 95af12e82a32..f4c66689f039 100644
--- a/arch/um/include/asm/page.h
+++ b/arch/um/include/asm/page.h
@@ -9,6 +9,8 @@
 
 #include <linux/const.h>
 
+#define ARCH_PAGE_TABLE_SYNC_MASK	PGTBL_PMD_MODIFIED
+
 /* PAGE_SHIFT determines the page size */
 #define PAGE_SHIFT	12
 #define PAGE_SIZE	(_AC(1, UL) << PAGE_SHIFT)
diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
index bc38f79ca3a3..b27c4457cc72 100644
--- a/arch/um/kernel/tlb.c
+++ b/arch/um/kernel/tlb.c
@@ -602,3 +602,8 @@ void force_flush_all(void)
 		vma = vma->vm_next;
 	}
 }
+
+void arch_sync_kernel_mappings(unsigned long start, unsigned long end)
+{
+	flush_tlb_kernel_range_common(start, end);
+}
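For reference, the generic side that the page.h hunk above opts into lives
in include/linux/pgtable.h and behaves roughly as paraphrased below: when
an architecture does not define ARCH_PAGE_TABLE_SYNC_MASK it defaults to
0, so arch_sync_kernel_mappings() is never called for it.

/* Paraphrased from include/linux/pgtable.h: architectures opt in by
 * defining ARCH_PAGE_TABLE_SYNC_MASK to the set of PGTBL_*_MODIFIED
 * bits they care about; otherwise the mask is 0 and the sync hook is
 * never invoked. */
#ifndef ARCH_PAGE_TABLE_SYNC_MASK
#define ARCH_PAGE_TABLE_SYNC_MASK 0
#endif

void arch_sync_kernel_mappings(unsigned long start, unsigned long end);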