| Message ID | 20210216033307.69863-2-aik@ozlabs.ru (mailing list archive) |
|---|---|
| State | Accepted |
| Series | powerpc/iommu: Stop crashing the host when VM is terminated |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (626a6c3d2e20da80aaa710104f34ea6037b28b33) |
snowpatch_ozlabs/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 41 lines checked |
snowpatch_ozlabs/needsstable | success | Patch has no Fixes tags |
On Tue, Feb 16, 2021 at 02:33:06PM +1100, Alexey Kardashevskiy wrote:
> The IOMMU table uses the it_map bitmap to keep track of allocated DMA
> pages. This has always been a contiguous array allocated at either
> the boot time or when a passed through device is returned to the host OS.
> The it_map memory is allocated by alloc_pages() which allocates
> contiguous physical memory.
>
> Such allocation method occasionally creates a problem when there is
> no big chunk of memory available (no free memory or too fragmented).
> On powernv/ioda2 the default DMA window requires 16MB for it_map.
>
> This replaces alloc_pages_node() with vzalloc_node() which allocates
> contiguous block but in virtual memory. This should reduce changes of
> failure but should not cause other behavioral changes as it_map is only
> used by the kernel's DMA hooks/api when MMU is on.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
On Tue, 2021-02-16 at 14:33 +1100, Alexey Kardashevskiy wrote:
> The IOMMU table uses the it_map bitmap to keep track of allocated DMA
> pages. This has always been a contiguous array allocated at either
> the boot time or when a passed through device is returned to the host OS.
> The it_map memory is allocated by alloc_pages() which allocates
> contiguous physical memory.
>
> Such allocation method occasionally creates a problem when there is
> no big chunk of memory available (no free memory or too fragmented).
> On powernv/ioda2 the default DMA window requires 16MB for it_map.
>
> This replaces alloc_pages_node() with vzalloc_node() which allocates
> contiguous block but in virtual memory. This should reduce changes of
> failure but should not cause other behavioral changes as it_map is only
> used by the kernel's DMA hooks/api when MMU is on.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

It looks a very good change, and also makes code much simpler to read.

FWIW:
Reviewed-by: Leonardo Bras <leobras.c@gmail,com>
```diff
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index c00214a4355c..8eb6eb0afa97 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -719,7 +719,6 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
 {
 	unsigned long sz;
 	static int welcomed = 0;
-	struct page *page;
 	unsigned int i;
 	struct iommu_pool *p;
 
@@ -728,11 +727,9 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
 	/* number of bytes needed for the bitmap */
 	sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
 
-	page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
-	if (!page)
+	tbl->it_map = vzalloc_node(sz, nid);
+	if (!tbl->it_map)
 		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
-	tbl->it_map = page_address(page);
-	memset(tbl->it_map, 0, sz);
 
 	iommu_table_reserve_pages(tbl, res_start, res_end);
 
@@ -774,8 +771,6 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
 
 static void iommu_table_free(struct kref *kref)
 {
-	unsigned long bitmap_sz;
-	unsigned int order;
 	struct iommu_table *tbl;
 
 	tbl = container_of(kref, struct iommu_table, it_kref);
@@ -796,12 +791,8 @@ static void iommu_table_free(struct kref *kref)
 	if (!bitmap_empty(tbl->it_map, tbl->it_size))
 		pr_warn("%s: Unexpected TCEs\n", __func__);
 
-	/* calculate bitmap size in bytes */
-	bitmap_sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
-
 	/* free bitmap */
-	order = get_order(bitmap_sz);
-	free_pages((unsigned long) tbl->it_map, order);
+	vfree(tbl->it_map);
 
 	/* free table */
 	kfree(tbl);
```
The IOMMU table uses the it_map bitmap to keep track of allocated DMA pages. This has always been a contiguous array, allocated either at boot time or when a passed-through device is returned to the host OS. The it_map memory is allocated by alloc_pages(), which allocates contiguous physical memory.

Such an allocation method occasionally creates a problem when there is no big chunk of memory available (no free memory, or memory too fragmented). On powernv/ioda2 the default DMA window requires 16MB for it_map.

This replaces alloc_pages_node() with vzalloc_node(), which allocates a block that is contiguous only in virtual memory. This should reduce the chance of failure but should not cause other behavioral changes, as it_map is only used by the kernel's DMA hooks/API when the MMU is on.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/kernel/iommu.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)