Message ID | 20191223060351.26359-1-aik@ozlabs.ru |
---|---|
State | Not Applicable |
Headers | show |
Series | [kernel,v3] powerpc/book3s64: Fix error handling in mm_iommu_do_alloc() | expand |
Alexey Kardashevskiy <aik@ozlabs.ru> writes: > The last jump to free_exit in mm_iommu_do_alloc() happens after page > pointers in struct mm_iommu_table_group_mem_t were already converted to > physical addresses. Thus calling put_page() on these physical addresses > will likely crash. > > This moves the loop which calculates the pageshift and converts page > struct pointers to physical addresses later after the point when > we cannot fail; thus eliminating the need to convert pointers back. > > Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock") > Reported-by: Jan Kara <jack@suse.cz> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > Changes: > v3: > * move pointers conversion after the last possible failure point > --- > arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++------------- > 1 file changed, 21 insertions(+), 18 deletions(-) > > diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c > index 56cc84520577..ef164851738b 100644 > --- a/arch/powerpc/mm/book3s64/iommu_api.c > +++ b/arch/powerpc/mm/book3s64/iommu_api.c > @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, > goto free_exit; > } > > - pageshift = PAGE_SHIFT; > - for (i = 0; i < entries; ++i) { > - struct page *page = mem->hpages[i]; > - > - /* > - * Allow to use larger than 64k IOMMU pages. Only do that > - * if we are backed by hugetlb. > - */ > - if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) > - pageshift = page_shift(compound_head(page)); > - mem->pageshift = min(mem->pageshift, pageshift); > - /* > - * We don't need struct page reference any more, switch > - * to physical address. > - */ > - mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; > - } > - > good_exit: > atomic64_set(&mem->mapped, 1); > mem->used = 1; > @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, > } > } > > + if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) { Couldn't you avoid testing this again ... > + /* > + * Allow to use larger than 64k IOMMU pages. Only do that > + * if we are backed by hugetlb. Skip device memory as it is not > + * backed with page structs. > + */ > + pageshift = PAGE_SHIFT; > + for (i = 0; i < entries; ++i) { ... by making this loop up to `pinned`. `pinned` is only incremented in the loop that does the GUP, and there's a check that pinned == entries after that loop. So when we get here we know pinned == entries, and if pinned is zero it's because we took the (dev_hpa != MM_IOMMU_TABLE_INVALID_HPA) case at the start of the function to get here. Or do you think that's too subtle to rely on? cheers > + struct page *page = mem->hpages[i]; > + > + if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) > + pageshift = page_shift(compound_head(page)); > + mem->pageshift = min(mem->pageshift, pageshift); > + /* > + * We don't need struct page reference any more, switch > + * to physical address. > + */ > + mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; > + } > + } > + > list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list); > > mutex_unlock(&mem_list_mutex); > -- > 2.17.1
On 23/12/2019 22:18, Michael Ellerman wrote: > Alexey Kardashevskiy <aik@ozlabs.ru> writes: > >> The last jump to free_exit in mm_iommu_do_alloc() happens after page >> pointers in struct mm_iommu_table_group_mem_t were already converted to >> physical addresses. Thus calling put_page() on these physical addresses >> will likely crash. >> >> This moves the loop which calculates the pageshift and converts page >> struct pointers to physical addresses later after the point when >> we cannot fail; thus eliminating the need to convert pointers back. >> >> Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock") >> Reported-by: Jan Kara <jack@suse.cz> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >> --- >> Changes: >> v3: >> * move pointers conversion after the last possible failure point >> --- >> arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++------------- >> 1 file changed, 21 insertions(+), 18 deletions(-) >> >> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c >> index 56cc84520577..ef164851738b 100644 >> --- a/arch/powerpc/mm/book3s64/iommu_api.c >> +++ b/arch/powerpc/mm/book3s64/iommu_api.c >> @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, >> goto free_exit; >> } >> >> - pageshift = PAGE_SHIFT; >> - for (i = 0; i < entries; ++i) { >> - struct page *page = mem->hpages[i]; >> - >> - /* >> - * Allow to use larger than 64k IOMMU pages. Only do that >> - * if we are backed by hugetlb. >> - */ >> - if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) >> - pageshift = page_shift(compound_head(page)); >> - mem->pageshift = min(mem->pageshift, pageshift); >> - /* >> - * We don't need struct page reference any more, switch >> - * to physical address. >> - */ >> - mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; >> - } >> - >> good_exit: >> atomic64_set(&mem->mapped, 1); >> mem->used = 1; >> @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, >> } >> } >> >> + if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) { > > Couldn't you avoid testing this again ... > >> + /* >> + * Allow to use larger than 64k IOMMU pages. Only do that >> + * if we are backed by hugetlb. Skip device memory as it is not >> + * backed with page structs. >> + */ >> + pageshift = PAGE_SHIFT; >> + for (i = 0; i < entries; ++i) { > > ... by making this loop up to `pinned`. > > `pinned` is only incremented in the loop that does the GUP, and there's > a check that pinned == entries after that loop. > > So when we get here we know pinned == entries, and if pinned is zero > it's because we took the (dev_hpa != MM_IOMMU_TABLE_INVALID_HPA) case at > the start of the function to get here. > > Or do you think that's too subtle to rely on? I had 4 choices: 1. for (;i < pinned;) 2. if (dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) (dev_hpa is a function parameter) 3. if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) 4. if (mem->hpages) The function is already ugly. 3) seemed as the most obvious way of telling what is going on here: "we have just initialized @mem and it is not for a device memory, lets finish the initialization". I could rearrange the code even more but since there is no NVLink3 coming ever, I'd avoid changing it more than necessary. Thanks, > > cheers > >> + struct page *page = mem->hpages[i]; >> + >> + if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) >> + pageshift = page_shift(compound_head(page)); >> + mem->pageshift = min(mem->pageshift, pageshift); >> + /* >> + * We don't need struct page reference any more, switch >> + * to physical address. >> + */ >> + mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; >> + } >> + } >> + >> list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list); >> >> mutex_unlock(&mem_list_mutex); >> -- >> 2.17.1
On 24/12/2019 10:32, Alexey Kardashevskiy wrote: > > > On 23/12/2019 22:18, Michael Ellerman wrote: >> Alexey Kardashevskiy <aik@ozlabs.ru> writes: >> >>> The last jump to free_exit in mm_iommu_do_alloc() happens after page >>> pointers in struct mm_iommu_table_group_mem_t were already converted to >>> physical addresses. Thus calling put_page() on these physical addresses >>> will likely crash. >>> >>> This moves the loop which calculates the pageshift and converts page >>> struct pointers to physical addresses later after the point when >>> we cannot fail; thus eliminating the need to convert pointers back. >>> >>> Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock") >>> Reported-by: Jan Kara <jack@suse.cz> >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>> --- >>> Changes: >>> v3: >>> * move pointers conversion after the last possible failure point >>> --- >>> arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++------------- >>> 1 file changed, 21 insertions(+), 18 deletions(-) >>> >>> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c >>> index 56cc84520577..ef164851738b 100644 >>> --- a/arch/powerpc/mm/book3s64/iommu_api.c >>> +++ b/arch/powerpc/mm/book3s64/iommu_api.c >>> @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, >>> goto free_exit; >>> } >>> >>> - pageshift = PAGE_SHIFT; >>> - for (i = 0; i < entries; ++i) { >>> - struct page *page = mem->hpages[i]; >>> - >>> - /* >>> - * Allow to use larger than 64k IOMMU pages. Only do that >>> - * if we are backed by hugetlb. >>> - */ >>> - if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) >>> - pageshift = page_shift(compound_head(page)); >>> - mem->pageshift = min(mem->pageshift, pageshift); >>> - /* >>> - * We don't need struct page reference any more, switch >>> - * to physical address. >>> - */ >>> - mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; >>> - } >>> - >>> good_exit: >>> atomic64_set(&mem->mapped, 1); >>> mem->used = 1; >>> @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, >>> } >>> } >>> >>> + if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) { >> >> Couldn't you avoid testing this again ... >> >>> + /* >>> + * Allow to use larger than 64k IOMMU pages. Only do that >>> + * if we are backed by hugetlb. Skip device memory as it is not >>> + * backed with page structs. >>> + */ >>> + pageshift = PAGE_SHIFT; >>> + for (i = 0; i < entries; ++i) { >> >> ... by making this loop up to `pinned`. >> >> `pinned` is only incremented in the loop that does the GUP, and there's >> a check that pinned == entries after that loop. >> >> So when we get here we know pinned == entries, and if pinned is zero >> it's because we took the (dev_hpa != MM_IOMMU_TABLE_INVALID_HPA) case at >> the start of the function to get here. >> >> Or do you think that's too subtle to rely on? > > > I had 4 choices: > > 1. for (;i < pinned;) > > 2. if (dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) (dev_hpa is a function > parameter) > > 3. if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) > > 4. if (mem->hpages) > > > The function is already ugly. 3) seemed as the most obvious way of > telling what is going on here: "we have just initialized @mem and it is > not for a device memory, lets finish the initialization". > > I could rearrange the code even more but since there is no NVLink3 > coming ever, I'd avoid changing it more than necessary. Thanks, Repost? Rework? > > >> >> cheers >> >>> + struct page *page = mem->hpages[i]; >>> + >>> + if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) >>> + pageshift = page_shift(compound_head(page)); >>> + mem->pageshift = min(mem->pageshift, pageshift); >>> + /* >>> + * We don't need struct page reference any more, switch >>> + * to physical address. >>> + */ >>> + mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; >>> + } >>> + } >>> + >>> list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list); >>> >>> mutex_unlock(&mem_list_mutex); >>> -- >>> 2.17.1 >
Alexey Kardashevskiy <aik@ozlabs.ru> writes: > On 24/12/2019 10:32, Alexey Kardashevskiy wrote: ... >> >> I could rearrange the code even more but since there is no NVLink3 >> coming ever, I'd avoid changing it more than necessary. Thanks, > > Repost? Rework? I'll just take v3. cheers
On Mon, 2019-12-23 at 06:03:51 UTC, Alexey Kardashevskiy wrote: > The last jump to free_exit in mm_iommu_do_alloc() happens after page > pointers in struct mm_iommu_table_group_mem_t were already converted to > physical addresses. Thus calling put_page() on these physical addresses > will likely crash. > > This moves the loop which calculates the pageshift and converts page > struct pointers to physical addresses later after the point when > we cannot fail; thus eliminating the need to convert pointers back. > > Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock") > Reported-by: Jan Kara <jack@suse.cz> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/c4b78169e3667413184c9a20e11b5832288a109f cheers
diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c index 56cc84520577..ef164851738b 100644 --- a/arch/powerpc/mm/book3s64/iommu_api.c +++ b/arch/powerpc/mm/book3s64/iommu_api.c @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, goto free_exit; } - pageshift = PAGE_SHIFT; - for (i = 0; i < entries; ++i) { - struct page *page = mem->hpages[i]; - - /* - * Allow to use larger than 64k IOMMU pages. Only do that - * if we are backed by hugetlb. - */ - if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) - pageshift = page_shift(compound_head(page)); - mem->pageshift = min(mem->pageshift, pageshift); - /* - * We don't need struct page reference any more, switch - * to physical address. - */ - mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; - } - good_exit: atomic64_set(&mem->mapped, 1); mem->used = 1; @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, } } + if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) { + /* + * Allow to use larger than 64k IOMMU pages. Only do that + * if we are backed by hugetlb. Skip device memory as it is not + * backed with page structs. + */ + pageshift = PAGE_SHIFT; + for (i = 0; i < entries; ++i) { + struct page *page = mem->hpages[i]; + + if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page)) + pageshift = page_shift(compound_head(page)); + mem->pageshift = min(mem->pageshift, pageshift); + /* + * We don't need struct page reference any more, switch + * to physical address. + */ + mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; + } + } + list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list); mutex_unlock(&mem_list_mutex);
The last jump to free_exit in mm_iommu_do_alloc() happens after page pointers in struct mm_iommu_table_group_mem_t were already converted to physical addresses. Thus calling put_page() on these physical addresses will likely crash. This moves the loop which calculates the pageshift and converts page struct pointers to physical addresses later after the point when we cannot fail; thus eliminating the need to convert pointers back. Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock") Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> --- Changes: v3: * move pointers conversion after the last possible failure point --- arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++------------- 1 file changed, 21 insertions(+), 18 deletions(-)