From patchwork Tue Jul 18 12:05:45 2017
X-Patchwork-Submitter: Gavin Guo
X-Patchwork-Id: 790222
From: Gavin Guo
To: kernel-team@lists.ubuntu.com
Subject: [SRU][Zesty][PATCH 2/5] ksm: fix use after free with merge_across_nodes = 0
Date: Tue, 18 Jul 2017 20:05:45 +0800
Message-Id: <1500379557-13503-9-git-send-email-gavin.guo@canonical.com>
In-Reply-To: <1500379557-13503-1-git-send-email-gavin.guo@canonical.com>
References: <1500379557-13503-1-git-send-email-gavin.guo@canonical.com>

From: Andrea Arcangeli

BugLink: http://bugs.launchpad.net/bugs/1680513

If merge_across_nodes was manually set to 0 (not the default value) by
the admin or a tuned
profile on NUMA systems triggering cross-NODE page migrations, a
stable_node use after free could materialize.

If the chain is collapsed, stable_node would point to the old chain
that was already freed.  stable_node_dup would be the stable_node dup
now converted to a regular stable_node and indexed in the rbtree in
place of the freed stable_node chain (no longer a dup).

This special case, where the chain is collapsed in the NUMA
replacement path, is now detected by the chain_prune callee setting
stable_node to NULL if it decides to collapse the chain.  This tells
the NUMA replacement code that, even if stable_node and
stable_node_dup are different, this is not a chain if stable_node is
NULL, as the stable_node_dup was converted to a regular stable_node
and the chain was collapsed.

It is generally safer for the callee to force the caller's stable_node
to NULL the moment it becomes stale, so that any other mistake like
this results in an instant Oops, which is easier to debug than a use
after free.  Otherwise the replace logic would act as if stable_node
were a valid chain, when in fact it was freed.  Notably
stable_node_chain_add_dup(page_node, stable_node) would run on a stale
stable_node.

Andrey Ryabinin found the source of the use after free in
chain_prune().
Link: http://lkml.kernel.org/r/20170512193805.8807-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli
Reported-by: Andrey Ryabinin
Reported-by: Evgheni Dereveanchin
Tested-by: Andrey Ryabinin
Cc: Petr Holasek
Cc: Hugh Dickins
Cc: Davidlohr Bueso
Cc: Arjan van de Ven
Cc: Gavin Guo
Cc: Jay Vosburgh
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
(cherry picked from commit b4fecc67cc569b14301f5a1111363d5818b8da5e)
Signed-off-by: Gavin Guo
---
 mm/ksm.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 55 insertions(+), 11 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index e01580fa2e30..dddd9a650eae 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -336,6 +336,7 @@ static inline void stable_node_chain_add_dup(struct stable_node *dup,
 
 static inline void __stable_node_dup_del(struct stable_node *dup)
 {
+	VM_BUG_ON(!is_stable_node_dup(dup));
 	hlist_del(&dup->hlist_dup);
 	ksm_stable_node_dups--;
 }
@@ -1309,12 +1310,12 @@ bool is_page_sharing_candidate(struct stable_node *stable_node)
 	return __is_page_sharing_candidate(stable_node, 0);
 }
 
-static struct stable_node *stable_node_dup(struct stable_node *stable_node,
+static struct stable_node *stable_node_dup(struct stable_node **_stable_node,
 					   struct page **tree_page,
 					   struct rb_root *root,
 					   bool prune_stale_stable_nodes)
 {
-	struct stable_node *dup, *found = NULL;
+	struct stable_node *dup, *found = NULL, *stable_node = *_stable_node;
 	struct hlist_node *hlist_safe;
 	struct page *_tree_page;
 	int nr = 0;
@@ -1387,6 +1388,15 @@ static struct stable_node *stable_node_dup(struct stable_node *stable_node,
 			free_stable_node(stable_node);
 			ksm_stable_node_chains--;
 			ksm_stable_node_dups--;
+			/*
+			 * NOTE: the caller depends on the
+			 * *_stable_node to become NULL if the chain
+			 * was collapsed. Enforce that if anything
+			 * uses a stale (freed) stable_node chain a
+			 * visible crash will materialize (instead of
+			 * an use after free).
+			 */
+			*_stable_node = stable_node = NULL;
 		} else if (__is_page_sharing_candidate(found, 1)) {
 			/*
 			 * Refile our candidate at the head
@@ -1416,11 +1426,12 @@ static struct stable_node *stable_node_dup_any(struct stable_node *stable_node,
 			     typeof(*stable_node), hlist_dup);
 }
 
-static struct stable_node *__stable_node_chain(struct stable_node *stable_node,
+static struct stable_node *__stable_node_chain(struct stable_node **_stable_node,
 					       struct page **tree_page,
 					       struct rb_root *root,
 					       bool prune_stale_stable_nodes)
 {
+	struct stable_node *stable_node = *_stable_node;
 	if (!is_stable_node_chain(stable_node)) {
 		if (is_page_sharing_candidate(stable_node)) {
 			*tree_page = get_ksm_page(stable_node, false);
@@ -1428,11 +1439,11 @@ static struct stable_node *__stable_node_chain(struct stable_node *stable_node,
 		}
 		return NULL;
 	}
-	return stable_node_dup(stable_node, tree_page, root,
+	return stable_node_dup(_stable_node, tree_page, root,
 			       prune_stale_stable_nodes);
 }
 
-static __always_inline struct stable_node *chain_prune(struct stable_node *s_n,
+static __always_inline struct stable_node *chain_prune(struct stable_node **s_n,
 						       struct page **t_p,
 						       struct rb_root *root)
 {
@@ -1443,7 +1454,7 @@ static __always_inline struct stable_node *chain(struct stable_node *s_n,
 						 struct page **t_p,
 						 struct rb_root *root)
 {
-	return __stable_node_chain(s_n, t_p, root, false);
+	return __stable_node_chain(&s_n, t_p, root, false);
 }
 
 /*
@@ -1484,7 +1495,15 @@ static struct page *stable_tree_search(struct page *page)
 		cond_resched();
 		stable_node = rb_entry(*new, struct stable_node, node);
 		stable_node_any = NULL;
-		stable_node_dup = chain_prune(stable_node, &tree_page, root);
+		stable_node_dup = chain_prune(&stable_node, &tree_page, root);
+		/*
+		 * NOTE: stable_node may have been freed by
+		 * chain_prune() if the returned stable_node_dup is
+		 * not NULL. stable_node_dup may have been inserted in
+		 * the rbtree instead as a regular stable_node (in
+		 * order to collapse the stable_node chain if a single
+		 * stable_node dup was found in it).
+		 */
 		if (!stable_node_dup) {
 			/*
 			 * Either all stable_node dups were full in
@@ -1599,20 +1618,33 @@ static struct page *stable_tree_search(struct page *page)
 	return NULL;
 
 replace:
-	if (stable_node_dup == stable_node) {
+	/*
+	 * If stable_node was a chain and chain_prune collapsed it,
+	 * stable_node will be NULL here. In that case the
+	 * stable_node_dup is the regular stable_node that has
+	 * replaced the chain. If stable_node is not NULL and equal to
+	 * stable_node_dup there was no chain and stable_node_dup is
+	 * the regular stable_node in the stable rbtree. Otherwise
+	 * stable_node is the chain and stable_node_dup is the dup to
+	 * replace.
+	 */
+	if (!stable_node || stable_node_dup == stable_node) {
+		VM_BUG_ON(is_stable_node_chain(stable_node_dup));
+		VM_BUG_ON(is_stable_node_dup(stable_node_dup));
 		/* there is no chain */
 		if (page_node) {
 			VM_BUG_ON(page_node->head != &migrate_nodes);
 			list_del(&page_node->list);
 			DO_NUMA(page_node->nid = nid);
-			rb_replace_node(&stable_node->node, &page_node->node,
+			rb_replace_node(&stable_node_dup->node,
+					&page_node->node,
 					root);
 			if (is_page_sharing_candidate(page_node))
 				get_page(page);
 			else
 				page = NULL;
 		} else {
-			rb_erase(&stable_node->node, root);
+			rb_erase(&stable_node_dup->node, root);
 			page = NULL;
 		}
 	} else {
@@ -1639,7 +1671,17 @@ static struct page *stable_tree_search(struct page *page)
 	/* stable_node_dup could be null if it reached the limit */
 	if (!stable_node_dup)
 		stable_node_dup = stable_node_any;
-	if (stable_node_dup == stable_node) {
+	/*
+	 * If stable_node was a chain and chain_prune collapsed it,
+	 * stable_node will be NULL here. In that case the
+	 * stable_node_dup is the regular stable_node that has
+	 * replaced the chain. If stable_node is not NULL and equal to
+	 * stable_node_dup there was no chain and stable_node_dup is
+	 * the regular stable_node in the stable rbtree.
+	 */
+	if (!stable_node || stable_node_dup == stable_node) {
+		VM_BUG_ON(is_stable_node_chain(stable_node_dup));
+		VM_BUG_ON(is_stable_node_dup(stable_node_dup));
 		/* chain is missing so create it */
 		stable_node = alloc_stable_node_chain(stable_node_dup,
 						      root);
@@ -1652,6 +1694,8 @@ static struct page *stable_tree_search(struct page *page)
 			 * of the current nid for this page
 			 * content.
 			 */
+			VM_BUG_ON(!is_stable_node_chain(stable_node));
+			VM_BUG_ON(!is_stable_node_dup(stable_node_dup));
 			VM_BUG_ON(page_node->head != &migrate_nodes);
 			list_del(&page_node->list);
 			DO_NUMA(page_node->nid = nid);