From patchwork Thu May 30 20:34:58 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kamal Mostafa
X-Patchwork-Id: 247912
From: Kamal Mostafa
To: Cliff Wickman
Subject: [ 3.8.y.z extended stable ] Patch "mm/pagewalk.c: walk_page_range
 should avoid VM_PFNMAP areas" has been added to staging queue
Date: Thu, 30 May 2013 13:34:58 -0700
Message-Id: <1369946098-13407-1-git-send-email-kamal@canonical.com>
X-Mailer: git-send-email 1.8.1.2
X-Extended-Stable: 3.8
Cc: Andrea Arcangeli, Mel Gorman, Kamal Mostafa, David Sterba, Dave Hansen,
 kernel-team@lists.ubuntu.com, Johannes Weiner, KOSAKI Motohiro,
 Naoya Horiguchi, Linus Torvalds, Andrew Morton, "Kirill A. Shutemov"
List-Id: Kernel team discussions

This is a note to let you know that I have just added a patch titled

    mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas

to the linux-3.8.y-queue branch of the 3.8.y.z extended stable tree
which can be found at:

 http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.8.y-queue

This patch is scheduled to be released in version 3.8.13.2.

If you, or anyone else, feels it should not be added to this tree, please
reply to this email.

For more information about the 3.8.y.z tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Kamal

------

From e6181bf23f16847f83023339cefa147c193afa3a Mon Sep 17 00:00:00 2001
From: Cliff Wickman
Date: Fri, 24 May 2013 15:55:36 -0700
Subject: mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas

commit a9ff785e4437c83d2179161e012f5bdfbd6381f0 upstream.

A panic can be caused by simply cat'ing /proc/<pid>/smaps while an
application has a VM_PFNMAP range.  It happened in-house when a benchmarker
was trying to decipher the memory layout of his program.

/proc/<pid>/smaps and similar walks through a user page table should not
be looking at VM_PFNMAP areas.

Certain tests in walk_page_range() (specifically split_huge_page_pmd())
assume that all the mapped PFN's are backed with page structures.  And
this is not usually true for VM_PFNMAP areas.
This can result in panics on kernel page faults when attempting to
address those page structures.

There are a half dozen callers of walk_page_range() that walk through a
task's entire page table (as N. Horiguchi pointed out).  So rather than
change all of them, this patch changes just walk_page_range() to ignore
VM_PFNMAP areas.

The logic of hugetlb_vma() is moved back into walk_page_range(), as we
want to test any vma in the range.

VM_PFNMAP areas are used by:
- graphics memory manager   gpu/drm/drm_gem.c
- global reference unit     sgi-gru/grufile.c
- sgi special memory        char/mspec.c
- and probably several out-of-tree modules

[akpm@linux-foundation.org: remove now-unused hugetlb_vma() stub]
Signed-off-by: Cliff Wickman
Reviewed-by: Naoya Horiguchi
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Dave Hansen
Cc: David Sterba
Cc: Johannes Weiner
Cc: KOSAKI Motohiro
Cc: "Kirill A. Shutemov"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Kamal Mostafa
---
 mm/pagewalk.c | 70 ++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 36 insertions(+), 34 deletions(-)

--
1.8.1.2

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 35aa294..5da2cbc 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -127,28 +127,7 @@ static int walk_hugetlb_range(struct vm_area_struct *vma,
 	return 0;
 }
 
-static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
-{
-	struct vm_area_struct *vma;
-
-	/* We don't need vma lookup at all. */
-	if (!walk->hugetlb_entry)
-		return NULL;
-
-	VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem));
-	vma = find_vma(walk->mm, addr);
-	if (vma && vma->vm_start <= addr && is_vm_hugetlb_page(vma))
-		return vma;
-
-	return NULL;
-}
-
 #else /* CONFIG_HUGETLB_PAGE */
-static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
-{
-	return NULL;
-}
-
 static int walk_hugetlb_range(struct vm_area_struct *vma,
 			      unsigned long addr, unsigned long end,
 			      struct mm_walk *walk)
@@ -198,30 +177,53 @@ int walk_page_range(unsigned long addr, unsigned long end,
 	if (!walk->mm)
 		return -EINVAL;
 
+	VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem));
+
 	pgd = pgd_offset(walk->mm, addr);
 	do {
-		struct vm_area_struct *vma;
+		struct vm_area_struct *vma = NULL;
 
 		next = pgd_addr_end(addr, end);
 
 		/*
-		 * handle hugetlb vma individually because pagetable walk for
-		 * the hugetlb page is dependent on the architecture and
-		 * we can't handled it in the same manner as non-huge pages.
+		 * This function was not intended to be vma based.
+		 * But there are vma special cases to be handled:
+		 * - hugetlb vma's
+		 * - VM_PFNMAP vma's
 		 */
-		vma = hugetlb_vma(addr, walk);
+		vma = find_vma(walk->mm, addr);
 		if (vma) {
-			if (vma->vm_end < next)
+			/*
+			 * There are no page structures backing a VM_PFNMAP
+			 * range, so do not allow split_huge_page_pmd().
+			 */
+			if ((vma->vm_start <= addr) &&
+			    (vma->vm_flags & VM_PFNMAP)) {
 				next = vma->vm_end;
+				pgd = pgd_offset(walk->mm, next);
+				continue;
+			}
 			/*
-			 * Hugepage is very tightly coupled with vma, so
-			 * walk through hugetlb entries within a given vma.
+			 * Handle hugetlb vma individually because pagetable
+			 * walk for the hugetlb page is dependent on the
+			 * architecture and we can't handled it in the same
+			 * manner as non-huge pages.
 			 */
-			err = walk_hugetlb_range(vma, addr, next, walk);
-			if (err)
-				break;
-			pgd = pgd_offset(walk->mm, next);
-			continue;
+			if (walk->hugetlb_entry && (vma->vm_start <= addr) &&
+			    is_vm_hugetlb_page(vma)) {
+				if (vma->vm_end < next)
+					next = vma->vm_end;
+				/*
+				 * Hugepage is very tightly coupled with vma,
+				 * so walk through hugetlb entries within a
+				 * given vma.
+				 */
+				err = walk_hugetlb_range(vma, addr, next, walk);
+				if (err)
+					break;
+				pgd = pgd_offset(walk->mm, next);
+				continue;
+			}
 		}
 
 		if (pgd_none_or_clear_bad(pgd)) {
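
For context (not part of the patch or the upstream commit message): a
VM_PFNMAP area is typically created when a driver's mmap handler maps
device or other special memory with remap_pfn_range(), which leaves the
mapped PFNs without backing struct pages.  The sketch below is a minimal,
hypothetical character-driver mmap handler of that kind; the device name,
the physical base address, and the omitted device registration are
illustrative assumptions, not taken from this patch.

/*
 * Illustrative sketch only -- not part of the patch above.
 */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/mm.h>

#define EXAMPLE_PHYS_BASE	0xfd000000UL	/* hypothetical device memory */

static int example_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;
	unsigned long pfn  = (EXAMPLE_PHYS_BASE >> PAGE_SHIFT) + vma->vm_pgoff;

	/*
	 * remap_pfn_range() marks the vma VM_PFNMAP: the mapped PFNs have
	 * no struct page behind them, which is exactly the case that
	 * walk_page_range() now skips.
	 */
	return remap_pfn_range(vma, vma->vm_start, pfn, size,
			       pgprot_noncached(vma->vm_page_prot));
}

static const struct file_operations example_fops = {
	.owner	= THIS_MODULE,
	.mmap	= example_mmap,
};
/* (character-device registration, e.g. via misc_register(), is omitted) */

MODULE_LICENSE("GPL");

With such a mapping in place, cat'ing /proc/<pid>/smaps walks the task's
page table through walk_page_range(); the second hunk above makes the
walker skip the VM_PFNMAP vma instead of assuming its entries are backed
by page structures.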