From patchwork Thu Oct 2 13:05:05 2008
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 2386
X-Patchwork-Delegate: davem@davemloft.net
Message-Id: <20081002131607.592222709@chello.nl>
References: <20081002130504.927878499@chello.nl>
User-Agent: quilt/0.46-1
Date: Thu, 02 Oct 2008 15:05:05 +0200
From: Peter Zijlstra
To: Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org,
        linux-mm@kvack.org, netdev@vger.kernel.org, trond.myklebust@fys.uio.no,
        Daniel Lezcano, Pekka Enberg, Peter Zijlstra, Neil Brown, David Miller
Subject: [PATCH 01/32] mm: gfp_to_alloc_flags()
Content-Disposition: inline; filename=mm-gfp-to-alloc_flags.patch

Clean up the code by factoring the gfp_mask to alloc_flags mapping out
into its own function, gfp_to_alloc_flags().

[neilb@suse.de says]
As the test:

-       if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
-                       && !in_interrupt()) {
-               if (!(gfp_mask & __GFP_NOMEMALLOC)) {

has been replaced with a slightly weaker one:

+       if (alloc_flags & ALLOC_NO_WATERMARKS) {

we need to ensure we don't recurse when PF_MEMALLOC is set.

Signed-off-by: Peter Zijlstra
---
 mm/internal.h   |   10 +++++
 mm/page_alloc.c |   95 +++++++++++++++++++++++++++++++-------------------------
 2 files changed, 64 insertions(+), 41 deletions(-)
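For reference, the translation the new helper performs can be exercised
stand-alone. The sketch below is a user-space approximation, not kernel
code: the GFP/ALLOC constant values and the task-state variables are
illustrative stand-ins; only the mapping logic mirrors the patch.

/*
 * Stand-alone sketch of the gfp_mask -> alloc_flags mapping factored
 * out by this patch. The constant values and the four task-state
 * variables are illustrative, not the kernel's definitions.
 */
#include <stdio.h>

#define __GFP_WAIT              0x10u
#define __GFP_HIGH              0x20u
#define __GFP_NOMEMALLOC        0x10000u

#define ALLOC_WMARK_MIN         0x01
#define ALLOC_HARDER            0x02
#define ALLOC_HIGH              0x04
#define ALLOC_CPUSET            0x08
#define ALLOC_NO_WATERMARKS     0x10

/* stand-ins for in_interrupt(), rt_task(), PF_MEMALLOC and TIF_MEMDIE */
static int in_irq_ctx, is_rt_task, pf_memalloc, tif_memdie;

static int sketch_gfp_to_alloc_flags(unsigned int gfp_mask)
{
        int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
        unsigned int wait = gfp_mask & __GFP_WAIT;

        if (gfp_mask & __GFP_HIGH)
                alloc_flags |= ALLOC_HIGH;

        if (!wait) {
                /* atomic: try harder, and ignore cpusets rather than fail */
                alloc_flags |= ALLOC_HARDER;
                alloc_flags &= ~ALLOC_CPUSET;
        } else if (is_rt_task && !in_irq_ctx)
                alloc_flags |= ALLOC_HARDER;

        if (!(gfp_mask & __GFP_NOMEMALLOC) && !in_irq_ctx &&
            (pf_memalloc || tif_memdie))
                alloc_flags |= ALLOC_NO_WATERMARKS;

        return alloc_flags;
}

int main(void)
{
        /* GFP_ATOMIC-like: !wait sets HARDER and clears CPUSET */
        printf("atomic: %#x\n",
               (unsigned int)sketch_gfp_to_alloc_flags(__GFP_HIGH));
        /* GFP_KERNEL-like: wait keeps CPUSET, no extra depth */
        printf("kernel: %#x\n",
               (unsigned int)sketch_gfp_to_alloc_flags(__GFP_WAIT));
        return 0;
}

With the illustrative values above this prints 0x7 for the atomic-style
mask (HARDER|HIGH, cpuset cleared) and 0x9 for the waiting one
(WMARK_MIN|CPUSET).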
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -1510,6 +1502,44 @@ static void set_page_owner(struct page *
 #endif /* CONFIG_PAGE_OWNER */
 
 /*
+ * get the deepest reaching allocation flags for the given gfp_mask
+ */
+static int gfp_to_alloc_flags(gfp_t gfp_mask)
+{
+        struct task_struct *p = current;
+        int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
+        const gfp_t wait = gfp_mask & __GFP_WAIT;
+
+        /*
+         * The caller may dip into page reserves a bit more if the caller
+         * cannot run direct reclaim, or if the caller has realtime scheduling
+         * policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
+         * set both ALLOC_HARDER (!wait) and ALLOC_HIGH (__GFP_HIGH).
+         */
+        if (gfp_mask & __GFP_HIGH)
+                alloc_flags |= ALLOC_HIGH;
+
+        if (!wait) {
+                alloc_flags |= ALLOC_HARDER;
+                /*
+                 * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc.
+                 * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
+                 */
+                alloc_flags &= ~ALLOC_CPUSET;
+        } else if (unlikely(rt_task(p)) && !in_interrupt())
+                alloc_flags |= ALLOC_HARDER;
+
+        if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
+                if (!in_interrupt() &&
+                    ((p->flags & PF_MEMALLOC) ||
+                     unlikely(test_thread_flag(TIF_MEMDIE))))
+                        alloc_flags |= ALLOC_NO_WATERMARKS;
+        }
+
+        return alloc_flags;
+}
+
+/*
  * This is the 'heart' of the zoned buddy allocator.
  */
 struct page *
@@ -1567,49 +1597,28 @@ restart:
         * OK, we're below the kswapd watermark and have kicked background
         * reclaim. Now things get more complex, so set up alloc_flags according
         * to how we want to proceed.
-        *
-        * The caller may dip into page reserves a bit more if the caller
-        * cannot run direct reclaim, or if the caller has realtime scheduling
-        * policy or is asking for __GFP_HIGH memory. GFP_ATOMIC requests will
-        * set both ALLOC_HARDER (!wait) and ALLOC_HIGH (__GFP_HIGH).
         */
-       alloc_flags = ALLOC_WMARK_MIN;
-       if ((unlikely(rt_task(p)) && !in_interrupt()) || !wait)
-               alloc_flags |= ALLOC_HARDER;
-       if (gfp_mask & __GFP_HIGH)
-               alloc_flags |= ALLOC_HIGH;
-       if (wait)
-               alloc_flags |= ALLOC_CPUSET;
+       alloc_flags = gfp_to_alloc_flags(gfp_mask);
 
-       /*
-        * Go through the zonelist again. Let __GFP_HIGH and allocations
-        * coming from realtime tasks go deeper into reserves.
-        *
-        * This is the last chance, in general, before the goto nopage.
-        * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc.
-        * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
-        */
+       /* This is the last chance, in general, before the goto nopage. */
        page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
-                                               high_zoneidx, alloc_flags);
+                       high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS);
        if (page)
                goto got_pg;
 
        /* This allocation should allow future memory freeing. */
-
 rebalance:
-       if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
-                       && !in_interrupt()) {
-               if (!(gfp_mask & __GFP_NOMEMALLOC)) {
+       if (alloc_flags & ALLOC_NO_WATERMARKS) {
 nofail_alloc:
-                       /* go through the zonelist yet again, ignoring mins */
-                       page = get_page_from_freelist(gfp_mask, nodemask, order,
+               /* go through the zonelist yet again, ignoring mins */
+               page = get_page_from_freelist(gfp_mask, nodemask, order,
                                zonelist, high_zoneidx, ALLOC_NO_WATERMARKS);
-                       if (page)
-                               goto got_pg;
-                       if (gfp_mask & __GFP_NOFAIL) {
-                               congestion_wait(WRITE, HZ/50);
-                               goto nofail_alloc;
-                       }
+               if (page)
+                       goto got_pg;
+
+               if (wait && (gfp_mask & __GFP_NOFAIL)) {
+                       congestion_wait(WRITE, HZ/50);
+                       goto nofail_alloc;
                }
                goto nopage;
        }
@@ -1618,6 +1627,10 @@ nofail_alloc:
        if (!wait)
                goto nopage;
 
+       /* Avoid recursion of direct reclaim */
+       if (p->flags & PF_MEMALLOC)
+               goto nopage;
+
        cond_resched();
 
        /* We now go into synchronous reclaim */
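As to the new PF_MEMALLOC check in the last hunk: direct reclaim runs
with PF_MEMALLOC set, so an allocation attempted from within the reclaim
path must bail out to nopage rather than re-enter reclaim. A stand-alone
illustration of that invariant follows; alloc_page_sketch() and
direct_reclaim_sketch() are hypothetical names standing in for
__alloc_pages() and the try_to_free_pages() path that sets PF_MEMALLOC.

/*
 * Sketch of the recursion guard added in the last hunk. Function names
 * are hypothetical; only the PF_MEMALLOC protocol mirrors the patch.
 */
#include <stdbool.h>
#include <stdio.h>

#define PF_MEMALLOC 0x00000800

static unsigned int task_flags;         /* stands in for current->flags */

static bool direct_reclaim_sketch(void);

static bool alloc_page_sketch(bool have_free_pages)
{
        if (have_free_pages)
                return true;

        /* Avoid recursion of direct reclaim (the new check) */
        if (task_flags & PF_MEMALLOC)
                return false;           /* nopage */

        return direct_reclaim_sketch();
}

static bool direct_reclaim_sketch(void)
{
        bool progress;

        task_flags |= PF_MEMALLOC;      /* as try_to_free_pages() does */
        /*
         * Reclaim itself may need memory (e.g. to write pages out over
         * a network filesystem). Without the PF_MEMALLOC check above,
         * this nested allocation would re-enter reclaim indefinitely.
         */
        progress = alloc_page_sketch(false);
        task_flags &= ~PF_MEMALLOC;

        return progress;
}

int main(void)
{
        printf("allocation %s\n",
               alloc_page_sketch(false) ? "succeeded" : "failed");
        return 0;
}

Without the guard the nested call would recurse until the stack
overflows; with it the inner allocation simply fails and reclaim
carries on.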