From patchwork Fri Oct 2 18:44:58 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robert Jennings
X-Patchwork-Id: 34872
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from bilbo.ozlabs.org (localhost [127.0.0.1])
	by ozlabs.org (Postfix) with ESMTP id EB95810086E
	for ; Sat, 3 Oct 2009 04:45:27 +1000 (EST)
Received: by ozlabs.org (Postfix)
	id 44BD1B7BF3; Sat, 3 Oct 2009 04:45:19 +1000 (EST)
Delivered-To: linuxppc-dev@ozlabs.org
Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.149])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "e31.co.us.ibm.com", Issuer "Equifax" (verified OK))
	by ozlabs.org (Postfix) with ESMTPS id 34BB8B7BE5
	for ; Sat, 3 Oct 2009 04:45:18 +1000 (EST)
Received: from d03relay02.boulder.ibm.com (d03relay02.boulder.ibm.com [9.17.195.227])
	by e31.co.us.ibm.com (8.14.3/8.13.1) with ESMTP id n92Icqwa004233
	for ; Fri, 2 Oct 2009 12:38:52 -0600
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169])
	by d03relay02.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id n92Ij84j089482
	for ; Fri, 2 Oct 2009 12:45:08 -0600
Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1])
	by d03av03.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id n92Ij4RP027580
	for ; Fri, 2 Oct 2009 12:45:05 -0600
Received: from toy.austin.ibm.com (toy.austin.ibm.com [9.53.41.214])
	by d03av03.boulder.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id n92Ij3U0027575;
	Fri, 2 Oct 2009 12:45:04 -0600
Received: by toy.austin.ibm.com (Postfix, from userid 1000)
	id C811DCCA063; Fri, 2 Oct 2009 13:44:58 -0500 (CDT)
Date: Fri, 2 Oct 2009 13:44:58 -0500
From: Robert Jennings
To: Mel Gorman
Subject: [PATCH 1/2][v2] mm: add notifier in pageblock isolation for balloon drivers
Message-ID: <20091002184458.GC4908@austin.ibm.com>
Mail-Followup-To: Mel Gorman, Ingo Molnar, Badari Pulavarty,
	Brian King, Benjamin Herrenschmidt, Paul Mackerras,
	Martin Schwidefsky, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linuxppc-dev@ozlabs.org
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.17+20080114 (2008-01-14)
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@ozlabs.org, Martin Schwidefsky, Badari Pulavarty,
	Brian King, Paul Mackerras, Ingo Molnar
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org

Memory balloon drivers can allocate a large amount of memory which
is not movable but could be freed to accommodate memory hotplug remove.

Prior to calling the memory hotplug notifier chain, the memory in the
pageblock is isolated.  If the migrate type is not MIGRATE_MOVABLE, the
isolation will not proceed, causing the memory removal for that page
range to fail.

Rather than failing pageblock isolation if the migratetype is not
MIGRATE_MOVABLE, this patch uses a notifier chain to check whether all
of the pages in the pageblock are owned by a registered balloon driver
(or other entity).  If all of the non-movable pages are owned by a
balloon, they can be freed later through the memory notifier chain and
the range can still be isolated in set_migratetype_isolate().

Signed-off-by: Robert Jennings
---
 drivers/base/memory.c  |   19 +++++++++++++++++++
 include/linux/memory.h |   26 ++++++++++++++++++++++++++
 mm/page_alloc.c        |   45 ++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 83 insertions(+), 7 deletions(-)

Index: b/drivers/base/memory.c
===================================================================
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -63,6 +63,20 @@ void unregister_memory_notifier(struct n
 }
 EXPORT_SYMBOL(unregister_memory_notifier);
 
+static BLOCKING_NOTIFIER_HEAD(memory_isolate_chain);
+
+int register_memory_isolate_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&memory_isolate_chain, nb);
+}
+EXPORT_SYMBOL(register_memory_isolate_notifier);
+
+void unregister_memory_isolate_notifier(struct notifier_block *nb)
+{
+	blocking_notifier_chain_unregister(&memory_isolate_chain, nb);
+}
+EXPORT_SYMBOL(unregister_memory_isolate_notifier);
+
 /*
  * register_memory - Setup a sysfs device for a memory block
  */
@@ -157,6 +171,11 @@ int memory_notify(unsigned long val, voi
 	return blocking_notifier_call_chain(&memory_chain, val, v);
 }
 
+int memory_isolate_notify(unsigned long val, void *v)
+{
+	return blocking_notifier_call_chain(&memory_isolate_chain, val, v);
+}
+
 /*
  * MEMORY_HOTPLUG depends on SPARSEMEM in mm/Kconfig, so it is
  * OK to have direct references to sparsemem variables in here.
Index: b/include/linux/memory.h
===================================================================
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -50,6 +50,18 @@ struct memory_notify {
 	int status_change_nid;
 };
 
+/*
+ * During pageblock isolation, count the number of pages in the
+ * range [start_pfn, start_pfn + nr_pages)
+ */
+#define MEM_ISOLATE_COUNT	(1<<0)
+
+struct memory_isolate_notify {
+	unsigned long start_pfn;
+	unsigned int nr_pages;
+	unsigned int pages_found;
+};
+
 struct notifier_block;
 struct mem_section;
 
@@ -76,14 +88,28 @@ static inline int memory_notify(unsigned
 {
 	return 0;
 }
+static inline int register_memory_isolate_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+static inline void unregister_memory_isolate_notifier(struct notifier_block *nb)
+{
+}
+static inline int memory_isolate_notify(unsigned long val, void *v)
+{
+	return 0;
+}
 #else
 extern int register_memory_notifier(struct notifier_block *nb);
 extern void unregister_memory_notifier(struct notifier_block *nb);
+extern int register_memory_isolate_notifier(struct notifier_block *nb);
+extern void unregister_memory_isolate_notifier(struct notifier_block *nb);
 extern int register_new_memory(int, struct mem_section *);
 extern int unregister_memory_section(struct mem_section *);
 extern int memory_dev_init(void);
 extern int remove_memory_block(unsigned long, struct mem_section *, int);
 extern int memory_notify(unsigned long val, void *v);
+extern int memory_isolate_notify(unsigned long val, void *v);
 extern struct memory_block *find_memory_block(struct mem_section *);
 #define CONFIG_MEM_BLOCK_SIZE	(PAGES_PER_SECTION<
 #include
 #include
+#include
 #include
 #include
@@ -4985,23 +4986,53 @@ void set_pageblock_flags_group(struct pa
 int set_migratetype_isolate(struct page *page)
 {
 	struct zone *zone;
-	unsigned long flags;
+	unsigned long flags, pfn, iter;
+	unsigned long immobile = 0;
+	struct memory_isolate_notify arg;
+	int notifier_ret;
 	int ret = -EBUSY;
 	int zone_idx;
 
 	zone = page_zone(page);
 	zone_idx = zone_idx(zone);
+
 	spin_lock_irqsave(&zone->lock, flags);
+	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
+	    zone_idx == ZONE_MOVABLE) {
+		ret = 0;
+		goto out;
+	}
+
+	pfn = page_to_pfn(page);
+	arg.start_pfn = pfn;
+	arg.nr_pages = pageblock_nr_pages;
+	arg.pages_found = 0;
+
 	/*
-	 * In future, more migrate types will be able to be isolation target.
+	 * The pageblock can be isolated even if the migrate type is
+	 * not *_MOVABLE.  The memory isolation notifier chain counts
+	 * the number of pages in this pageblock that can be freed later
+	 * through the memory notifier chain.  If all of the pages are
+	 * accounted for, isolation can continue.
 	 */
-	if (get_pageblock_migratetype(page) != MIGRATE_MOVABLE &&
-	    zone_idx != ZONE_MOVABLE)
+	notifier_ret = memory_isolate_notify(MEM_ISOLATE_COUNT, &arg);
+	notifier_ret = notifier_to_errno(notifier_ret);
+	if (notifier_ret || !arg.pages_found)
 		goto out;
-	set_pageblock_migratetype(page, MIGRATE_ISOLATE);
-	move_freepages_block(zone, page, MIGRATE_ISOLATE);
-	ret = 0;
+
+	for (iter = pfn; iter < (pfn + pageblock_nr_pages); iter++)
+		if (page_count(pfn_to_page(iter)))
+			immobile++;
+
+	if (arg.pages_found == immobile)
+		ret = 0;
+
 out:
+	if (!ret) {
+		set_pageblock_migratetype(page, MIGRATE_ISOLATE);
+		move_freepages_block(zone, page, MIGRATE_ISOLATE);
+	}
+
 	spin_unlock_irqrestore(&zone->lock, flags);
 	if (!ret)
 		drain_all_pages();
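
The accounting that this patch adds to set_migratetype_isolate() can be
modelled outside the kernel.  What follows is a minimal userspace sketch,
not kernel code, and it makes several assumptions: the balloon's page
list (balloon_pfns), the refcount[] array standing in for page_count(),
the 8-page PAGEBLOCK_NR_PAGES, and the helper names
balloon_isolate_count(), mark_balloon_pages() and can_isolate() are all
hypothetical.  Only the fields of struct isolate_arg and the final
comparison mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PFNS            64UL	/* size of the toy "physical memory" */
#define PAGEBLOCK_NR_PAGES 8UL	/* hypothetical; far smaller than a real pageblock */

/* Same fields as struct memory_isolate_notify in the patch. */
struct isolate_arg {
	unsigned long start_pfn;
	unsigned int nr_pages;
	unsigned int pages_found;
};

/* Hypothetical balloon: the pfns the driver currently holds. */
static const unsigned long balloon_pfns[] = { 16, 17, 19, 40 };
#define BALLOON_NR (sizeof(balloon_pfns) / sizeof(balloon_pfns[0]))

/* Stand-in for page_count(): one reference count per pfn. */
static int refcount[NR_PFNS];

/* Give every balloon page a reference, as an allocated page would have. */
static void mark_balloon_pages(void)
{
	size_t i;

	for (i = 0; i < BALLOON_NR; i++)
		refcount[balloon_pfns[i]] = 1;
}

/*
 * Stand-in for a driver's MEM_ISOLATE_COUNT handler: add the number of
 * balloon pages inside [start_pfn, start_pfn + nr_pages) to pages_found.
 */
static void balloon_isolate_count(struct isolate_arg *arg)
{
	size_t i;

	for (i = 0; i < BALLOON_NR; i++)
		if (balloon_pfns[i] >= arg->start_pfn &&
		    balloon_pfns[i] < arg->start_pfn + arg->nr_pages)
			arg->pages_found++;
}

/*
 * Model of the non-movable path: the block may be isolated only if every
 * page with a nonzero reference count was claimed by the callback.
 */
static bool can_isolate(unsigned long pfn)
{
	struct isolate_arg arg = { pfn, PAGEBLOCK_NR_PAGES, 0 };
	unsigned long iter, immobile = 0;

	assert(pfn + PAGEBLOCK_NR_PAGES <= NR_PFNS);

	balloon_isolate_count(&arg);
	if (!arg.pages_found)
		return false;

	for (iter = pfn; iter < pfn + PAGEBLOCK_NR_PAGES; iter++)
		if (refcount[iter])
			immobile++;

	return arg.pages_found == immobile;
}
```

In this model, after mark_balloon_pages() the block starting at pfn 16
holds three referenced pages, all owned by the balloon, so
can_isolate(16) succeeds.  Giving pfn 18 a reference from some other
owner makes the counts differ and the block is refused, which is the
-EBUSY case in the patch.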