Message ID | 20210315233804.d3e52f6a3422.I9672eef7dfa7ce6c3de1ccf7ab8d9aad1fa7f3a6@changeid
---|---
State | Accepted |
Series | um: implement flush_cache_vmap/flush_cache_vunmap
On 15/03/2021 22:38, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
>
> vmalloc() heavy workloads in UML are extremely slow, due to
> flushing the entire kernel VM space (flush_tlb_kernel_vm())
> on the first segfault.
>
> Implement flush_cache_vmap() to avoid that, and while at it
> also add flush_cache_vunmap() since it's trivial.
>
> This speeds up my vmalloc() heavy test of copying files out
> from /sys/kernel/debug/gcov/ by 30x (from 30s to 1s.)
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
>  arch/um/include/asm/cacheflush.h | 9 +++++++++
>  arch/um/include/asm/tlb.h        | 2 +-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>  create mode 100644 arch/um/include/asm/cacheflush.h
>
> diff --git a/arch/um/include/asm/cacheflush.h b/arch/um/include/asm/cacheflush.h
> new file mode 100644
> index 000000000000..4c9858cd36ec
> --- /dev/null
> +++ b/arch/um/include/asm/cacheflush.h
> @@ -0,0 +1,9 @@
> +#ifndef __UM_ASM_CACHEFLUSH_H
> +#define __UM_ASM_CACHEFLUSH_H
> +
> +#include <asm/tlbflush.h>
> +#define flush_cache_vmap flush_tlb_kernel_range
> +#define flush_cache_vunmap flush_tlb_kernel_range
> +
> +#include <asm-generic/cacheflush.h>
> +#endif /* __UM_ASM_CACHEFLUSH_H */
> diff --git a/arch/um/include/asm/tlb.h b/arch/um/include/asm/tlb.h
> index ff9c62828962..0422467bda5b 100644
> --- a/arch/um/include/asm/tlb.h
> +++ b/arch/um/include/asm/tlb.h
> @@ -5,7 +5,7 @@
>  #include <linux/mm.h>
>
>  #include <asm/tlbflush.h>
> -#include <asm-generic/cacheflush.h>
> +#include <asm/cacheflush.h>
>  #include <asm-generic/tlb.h>
>
>  #endif

Well spotted.

Unless I am mistaken, there may be a slightly better way of doing it.

We can implement arch_sync_kernel_mappings() and sync only where the
page-modified mask says so, by setting ARCH_PAGE_TABLE_SYNC_MASK.

This way flush_cache_* can remain no-ops, as in asm-generic.

I am going to give that a spin; if it works, I will post it to the list
by lunchtime GMT.
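For readers following along: ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings() are the generic-mm hooks Anton means. The vmalloc code tracks which page-table levels it modified while mapping or unmapping a range and invokes the arch callback only when that set intersects the mask. A minimal, untested sketch of what this could look like for UML (illustrative only, not Anton's actual patch):

```c
/*
 * Illustrative sketch, not Anton's actual patch.
 *
 * arch/um/include/asm/pgtable.h (sketch): ask the generic vmalloc
 * code to call back whenever kernel PTEs were modified.
 */
#define ARCH_PAGE_TABLE_SYNC_MASK	PGTBL_PTE_MODIFIED

/*
 * arch/um/kernel/tlb.c (sketch): only reached for ranges whose
 * page-table modifications matched the mask above, so this flushes
 * exactly what changed and nothing more.
 */
void arch_sync_kernel_mappings(unsigned long start, unsigned long end)
{
	flush_tlb_kernel_range(start, end);
}
```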
On Tue, 2021-03-16 at 08:46 +0000, Anton Ivanov wrote:
>
> Well spotted.
>
> Unless I am mistaken, there may be a slightly better way of doing it.
>
> We can implement arch_sync_kernel_mappings() and sync only where the
> page-modified mask says so, by setting ARCH_PAGE_TABLE_SYNC_MASK.
>
> This way flush_cache_* can remain no-ops, as in asm-generic.

Would that actually buy us anything?

Not that I mind, or even understand the TLB code well, but it seems
fairly similar?

> I am going to give that a spin; if it works, I will post it to the list
> by lunchtime GMT.

Sounds good to me :)

I also made these patches:

https://lore.kernel.org/lkml/20210315235453.e3fbb86e99a0.I08a3ee6dbe47ea3e8024956083f162884a958e40@changeid/T/#u

so my "vmalloc-heavy workload" is no longer vmalloc-heavy, since it now
uses kvmalloc and never hits vmalloc, always kmalloc :)

johannes
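For context on the kvmalloc() conversion Johannes links to: kvmalloc() attempts a kmalloc() first and only falls back to vmalloc() when the slab allocator cannot satisfy the request, so converted callers rarely touch the vmalloc path at all. A hypothetical example of the pattern; the function name is made up for illustration:

```c
#include <linux/mm.h>	/* kvmalloc(), kvfree() */

/* Hypothetical caller, named for illustration only. */
static char *copy_gcov_data(size_t size)
{
	/*
	 * Tries kmalloc(GFP_KERNEL) first; falls back to vmalloc()
	 * only if the slab allocator cannot satisfy the request.
	 */
	char *buf = kvmalloc(size, GFP_KERNEL);

	if (!buf)
		return NULL;
	/* ... fill buf ... */
	return buf;	/* the caller releases it with kvfree(buf) */
}
```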
On 16/03/2021 08:55, Johannes Berg wrote:
> On Tue, 2021-03-16 at 08:46 +0000, Anton Ivanov wrote:
>>
>> Well spotted.
>>
>> Unless I am mistaken, there may be a slightly better way of doing it.
>>
>> We can implement arch_sync_kernel_mappings() and sync only where the
>> page-modified mask says so, by setting ARCH_PAGE_TABLE_SYNC_MASK.
>>
>> This way flush_cache_* can remain no-ops, as in asm-generic.
>
> Would that actually buy us anything?

It makes flushing conditional on a specific mask from the result of the
mapping op. In theory, it should be better; in practice, probably more
of the same.

>
> Not that I mind, or even understand the TLB code well, but it seems
> fairly similar?
>
>> I am going to give that a spin; if it works, I will post it to the list
>> by lunchtime GMT.
>
> Sounds good to me :)
>
> I also made these patches:
>
> https://lore.kernel.org/lkml/20210315235453.e3fbb86e99a0.I08a3ee6dbe47ea3e8024956083f162884a958e40@changeid/T/#u
>
> so my "vmalloc-heavy workload" is no longer vmalloc-heavy, since it now
> uses kvmalloc and never hits vmalloc, always kmalloc :)

:)

> johannes
On 15/03/2021 22:38, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
>
> vmalloc() heavy workloads in UML are extremely slow, due to
> flushing the entire kernel VM space (flush_tlb_kernel_vm())
> on the first segfault.
>
> Implement flush_cache_vmap() to avoid that, and while at it
> also add flush_cache_vunmap() since it's trivial.
>
> This speeds up my vmalloc() heavy test of copying files out
> from /sys/kernel/debug/gcov/ by 30x (from 30s to 1s.)
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
>  arch/um/include/asm/cacheflush.h | 9 +++++++++
>  arch/um/include/asm/tlb.h        | 2 +-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>  create mode 100644 arch/um/include/asm/cacheflush.h
>
> diff --git a/arch/um/include/asm/cacheflush.h b/arch/um/include/asm/cacheflush.h
> new file mode 100644
> index 000000000000..4c9858cd36ec
> --- /dev/null
> +++ b/arch/um/include/asm/cacheflush.h
> @@ -0,0 +1,9 @@
> +#ifndef __UM_ASM_CACHEFLUSH_H
> +#define __UM_ASM_CACHEFLUSH_H
> +
> +#include <asm/tlbflush.h>
> +#define flush_cache_vmap flush_tlb_kernel_range
> +#define flush_cache_vunmap flush_tlb_kernel_range
> +
> +#include <asm-generic/cacheflush.h>
> +#endif /* __UM_ASM_CACHEFLUSH_H */
> diff --git a/arch/um/include/asm/tlb.h b/arch/um/include/asm/tlb.h
> index ff9c62828962..0422467bda5b 100644
> --- a/arch/um/include/asm/tlb.h
> +++ b/arch/um/include/asm/tlb.h
> @@ -5,7 +5,7 @@
>  #include <linux/mm.h>
>
>  #include <asm/tlbflush.h>
> -#include <asm-generic/cacheflush.h>
> +#include <asm/cacheflush.h>
>  #include <asm-generic/tlb.h>
>
>  #endif

Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
```diff
diff --git a/arch/um/include/asm/cacheflush.h b/arch/um/include/asm/cacheflush.h
new file mode 100644
index 000000000000..4c9858cd36ec
--- /dev/null
+++ b/arch/um/include/asm/cacheflush.h
@@ -0,0 +1,9 @@
+#ifndef __UM_ASM_CACHEFLUSH_H
+#define __UM_ASM_CACHEFLUSH_H
+
+#include <asm/tlbflush.h>
+#define flush_cache_vmap flush_tlb_kernel_range
+#define flush_cache_vunmap flush_tlb_kernel_range
+
+#include <asm-generic/cacheflush.h>
+#endif /* __UM_ASM_CACHEFLUSH_H */
diff --git a/arch/um/include/asm/tlb.h b/arch/um/include/asm/tlb.h
index ff9c62828962..0422467bda5b 100644
--- a/arch/um/include/asm/tlb.h
+++ b/arch/um/include/asm/tlb.h
@@ -5,7 +5,7 @@
 #include <linux/mm.h>
 
 #include <asm/tlbflush.h>
-#include <asm-generic/cacheflush.h>
+#include <asm/cacheflush.h>
 #include <asm-generic/tlb.h>
 
 #endif
```
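As a rough illustration of the path this optimizes: the generic mapping helpers call flush_cache_vmap() after installing a new area's page tables and flush_cache_vunmap() before tearing them down, so with the macros above each operation flushes only its own range rather than all of kernel VM space. A hypothetical test module (not from the thread), assuming a UML kernel with this patch applied:

```c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/vmalloc.h>

static int __init vmap_flush_demo_init(void)
{
	/* Mapping ends with flush_cache_vmap() over just this range. */
	void *p = vmalloc(16 * PAGE_SIZE);

	if (!p)
		return -ENOMEM;

	/* Unmapping starts with flush_cache_vunmap() over the range. */
	vfree(p);
	return 0;
}
module_init(vmap_flush_demo_init);

MODULE_LICENSE("GPL");
```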