From patchwork Mon Mar 7 22:35:23 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kamal Mostafa X-Patchwork-Id: 593455 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 8304514017E; Tue, 8 Mar 2016 09:44:41 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1ad3tG-00042j-5L; Mon, 07 Mar 2016 22:44:38 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1ad3kN-0007jJ-Si for kernel-team@lists.ubuntu.com; Mon, 07 Mar 2016 22:35:27 +0000 Received: from 1.general.kamal.us.vpn ([10.172.68.52] helo=fourier) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1ad3kN-0007b6-EE; Mon, 07 Mar 2016 22:35:27 +0000 Received: from kamal by fourier with local (Exim 4.86) (envelope-from ) id 1ad3kK-0000X8-O4; Mon, 07 Mar 2016 14:35:24 -0800 From: Kamal Mostafa To: Andrea Arcangeli Subject: [4.2.y-ckt stable] Patch "mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED" has been added to the 4.2.y-ckt tree Date: Mon, 7 Mar 2016 14:35:23 -0800 Message-Id: <1457390123-2015-1-git-send-email-kamal@canonical.com> X-Mailer: git-send-email 2.7.0 X-Extended-Stable: 4.2 Cc: Andrew Morton , Linus Torvalds , Kamal Mostafa , "Kirill A . Shutemov" , kernel-team@lists.ubuntu.com X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED to the linux-4.2.y-queue branch of the 4.2.y-ckt extended stable tree which can be found at: http://kernel.ubuntu.com/git/ubuntu/linux.git/log/?h=linux-4.2.y-queue This patch is scheduled to be released in version 4.2.8-ckt5. If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 4.2.y-ckt tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Kamal ---8<------------------------------------------------------------ From 7a977a48930352eb23bf87a7e441f5963f311ef8 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Fri, 26 Feb 2016 15:19:28 -0800 Subject: mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED commit ad33bb04b2a6cee6c1f99fabb15cddbf93ff0433 upstream. pmd_trans_unstable()/pmd_none_or_trans_huge_or_clear_bad() were introduced to locklessy (but atomically) detect when a pmd is a regular (stable) pmd or when the pmd is unstable and can infinitely transition from pmd_none() and pmd_trans_huge() from under us, while only holding the mmap_sem for reading (for writing not). While holding the mmap_sem only for reading, MADV_DONTNEED can run from under us and so before we can assume the pmd to be a regular stable pmd we need to compare it against pmd_none() and pmd_trans_huge() in an atomic way, with pmd_trans_unstable(). The old pmd_trans_huge() left a tiny window for a race. Useful applications are unlikely to notice the difference as doing MADV_DONTNEED concurrently with a page fault would lead to undefined behavior. [akpm@linux-foundation.org: tidy up comment grammar/layout] Signed-off-by: Andrea Arcangeli Reported-by: Kirill A. Shutemov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [ kamal: backport to 4.2-stable: context ] Signed-off-by: Kamal Mostafa --- mm/memory.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) -- 2.7.0 diff --git a/mm/memory.c b/mm/memory.c index 388dcf9..90e6455 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3365,8 +3365,18 @@ static int __handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma, if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address))) return VM_FAULT_OOM; - /* if an huge pmd materialized from under us just retry later */ - if (unlikely(pmd_trans_huge(*pmd))) + /* + * If a huge pmd materialized under us just retry later. Use + * pmd_trans_unstable() instead of pmd_trans_huge() to ensure the pmd + * didn't become pmd_trans_huge under us and then back to pmd_none, as + * a result of MADV_DONTNEED running immediately after a huge pmd fault + * in a different thread of this mm, in turn leading to a misleading + * pmd_trans_huge() retval. All we have to ensure is that it is a + * regular pmd that we can walk with pte_offset_map() and we can do that + * through an atomic read in C, which is what pmd_trans_unstable() + * provides. + */ + if (unlikely(pmd_trans_unstable(pmd))) return 0; /* * A regular pmd is established and it can't morph into a huge pmd