From patchwork Tue Jan 24 16:28:39 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thadeu Lima de Souza Cascardo
X-Patchwork-Id: 719251
From: Thadeu Lima de Souza Cascardo
To: kernel-team@lists.ubuntu.com
Subject: [Xenial PATCH 11/11] mm, oom: prevent premature OOM killer invocation for high order request
Date: Tue, 24 Jan 2017 14:28:39 -0200
Message-Id: <20170124162839.6355-12-cascardo@canonical.com>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170124162839.6355-1-cascardo@canonical.com>
References: <20170124162839.6355-1-cascardo@canonical.com>
List-Id: Kernel team discussions

From: Michal Hocko

BugLink: https://bugs.launchpad.net/bugs/1655842

There have been several
reports about premature OOM killer invocation in the 4.7 kernel, where an
order-2 allocation request (for the kernel stack) invoked the OOM killer
even during basic workloads (light IO or even a kernel compile on some
filesystems). In all reported cases the memory is fragmented and there
are no order-2+ pages available. There is usually a large amount of slab
memory (usually dentries/inodes), and further debugging has shown that
there are way too many unmovable blocks, which are skipped during
compaction. Multiple reporters have confirmed that the current linux-next,
which includes [1] and [2], helped and the OOMs are not reproducible
anymore.

A simpler fix for the late rc and stable is to simply ignore the
compaction feedback and retry as long as there is reclaim progress and
we are not getting OOM for order-0 pages. We already do that for
CONFIG_COMPACTION=n, so let's reuse the same code when compaction is
enabled as well.

[1] http://lkml.kernel.org/r/20160810091226.6709-1-vbabka@suse.cz
[2] http://lkml.kernel.org/r/f7a9ea9d-bb88-bfd6-e340-3a933559305a@suse.cz

Fixes: 0a0337e0d1d1 ("mm, oom: rework oom detection")
Link: http://lkml.kernel.org/r/20160823074339.GB23577@dhcp22.suse.cz
Signed-off-by: Michal Hocko
Tested-by: Olaf Hering
Tested-by: Ralf-Peter Rohbeck
Cc: Markus Trippelsdorf
Cc: Arkadiusz Miskiewicz
Cc: Ralf-Peter Rohbeck
Cc: Jiri Slaby
Cc: Vlastimil Babka
Cc: Joonsoo Kim
Cc: Tetsuo Handa
Cc: David Rientjes
Cc: [4.7.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
(cherry picked from commit 6b4e3181d7bd5ca5ab6f45929e4a5ffa7ab4ab7f)
Signed-off-by: Thadeu Lima de Souza Cascardo
---
 mm/page_alloc.c | 51 ++-------------------------------------------------
 1 file changed, 2 insertions(+), 49 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 50eba2f..f8775cb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2844,54 +2844,6 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 
 	return NULL;
 }
-
-static inline bool
-should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
-		     enum compact_result compact_result, enum migrate_mode *migrate_mode,
-		     int compaction_retries)
-{
-	int max_retries = MAX_COMPACT_RETRIES;
-
-	if (!order)
-		return false;
-
-	/*
-	 * compaction considers all the zone as desperately out of memory
-	 * so it doesn't really make much sense to retry except when the
-	 * failure could be caused by weak migration mode.
-	 */
-	if (compaction_failed(compact_result)) {
-		if (*migrate_mode == MIGRATE_ASYNC) {
-			*migrate_mode = MIGRATE_SYNC_LIGHT;
-			return true;
-		}
-		return false;
-	}
-
-	/*
-	 * make sure the compaction wasn't deferred or didn't bail out early
-	 * due to locks contention before we declare that we should give up.
-	 * But do not retry if the given zonelist is not suitable for
-	 * compaction.
-	 */
-	if (compaction_withdrawn(compact_result))
-		return compaction_zonelist_suitable(ac, order, alloc_flags);
-
-	/*
-	 * !costly requests are much more important than __GFP_REPEAT
-	 * costly ones because they are de facto nofail and invoke OOM
-	 * killer to move on while costly can fail and users are ready
-	 * to cope with that. 1/4 retries is rather arbitrary but we
-	 * would need much more detailed feedback from compaction to
-	 * make a better decision.
-	 */
-	if (order > PAGE_ALLOC_COSTLY_ORDER)
-		max_retries /= 4;
-	if (compaction_retries <= max_retries)
-		return true;
-
-	return false;
-}
 #else
 static inline struct page *
 __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
@@ -2902,6 +2854,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	return NULL;
 }
 
+#endif /* CONFIG_COMPACTION */
+
 static inline bool
 should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
 		     enum compact_result compact_result,
@@ -2928,7 +2882,6 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
 	}
 	return false;
 }
-#endif /* CONFIG_COMPACTION */
 
 /* Perform direct synchronous page reclaim */
 static int