page eviction from the buddy cache

Mel,


On Apr 25, 2013, at 17:30, Mel Gorman wrote:

> On Wed, Apr 24, 2013 at 10:26:50AM -0400, Theodore Ts'o wrote:
>> On Tue, Apr 23, 2013 at 03:00:08PM -0700, Andrew Morton wrote:
>>> That should fix things for now.  Although it might be better to just do
>>> 
>>> 	mark_page_accessed(page);	/* to SetPageReferenced */
>>> 	lru_add_drain();		/* to SetPageLRU */
>>> 
>>> Because a) this was too early to decide that the page is
>>> super-important and b) the second touch of this page should have a
>>> mark_page_accessed() in it already.
>> 
>> The question is, do we really want to put lru_add_drain() into the ext4
>> file system code?  That seems to be pushing some fairly mm-specific
>> knowledge into file system code.  I'll do this if I have to, but
>> wouldn't it be better if this was pushed into mark_page_accessed(), or
>> some other new API was exported by the mm subsystem?
>> 
> 
> I don't think we want to push lru_add_drain() into the ext4 code. It's
> too specific of knowledge just to work around pagevecs. Before we rework
> how pagevecs select what LRU to place a page, can we make sure that fixing
> that will fix the problem?
> 
What is "that"? Putting lru_add_drain() into the ext4 code? Sure, that fixes the problem with many small reads during a large write.
Originally I put shake_page() in the ext4 code, but that calls lru_add_drain_all(), which is excessive.


Additionally, i_state = I_NEW is needed to prevent the page cache from being killed by sysctl -w vm.drop_caches=3.

> Andrew, can you try the following patch please? Also, is there any chance
> you can describe in more detail what the workload does?
A Lustre OSS node plus IOR, with a file size twice as large as the OSS memory.

> If it fails to boot,
> remove the second that calls lru_add_drain_all() and try again.
OK, I will try.

> 
> The patch looks deceptively simple, a downside is that workloads that
> call mark_page_accessed() frequently will contend more on the zone->lru_lock
> than they did previously. Moving lru_add_drain() to the ext4 code would
> suffer the same contention problem.
No, it would not. We call lru_add_drain() only in the new page allocation case, whereas mark_page_accessed() is called regardless of whether the page is already in the page cache or newly allocated, so we get very little zone->lru_lock contention.
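
As a rough illustration of the point, here is a toy userspace model in C. It is not kernel code: the toy_* types and helpers are hypothetical stand-ins that only mirror the names in mm/swap.c, the pagevec size of 14 is an assumption, and the drain counter is just a crude proxy for how often zone->lru_lock would be taken. It shows a freshly allocated page staying inactive after two touches while it sits in a pagevec, and the same page being activated when a single drain is done at allocation time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: simplified stand-ins for the PageReferenced, PageLRU
 * and PageActive flags. */
struct toy_page {
	bool referenced;
	bool on_lru;
	bool active;
};

#define TOY_PAGEVEC_SIZE 14	/* assumed batch size for this sketch */

struct toy_pagevec {
	size_t nr;
	struct toy_page *pages[TOY_PAGEVEC_SIZE];
};

/* Number of drains that did real work (proxy for lru_lock acquisitions). */
static unsigned int toy_drain_count;

/* Models lru_add_drain(): move every buffered page onto the LRU. */
static void toy_lru_add_drain(struct toy_pagevec *pvec)
{
	if (pvec->nr == 0)
		return;
	toy_drain_count++;
	for (size_t i = 0; i < pvec->nr; i++)
		pvec->pages[i]->on_lru = true;
	pvec->nr = 0;
}

/* Models adding a new page: it is buffered, and only reaches the LRU
 * when the pagevec fills up or somebody drains it. */
static void toy_lru_cache_add(struct toy_pagevec *pvec, struct toy_page *page)
{
	assert(pvec->nr < TOY_PAGEVEC_SIZE);
	pvec->pages[pvec->nr++] = page;
	if (pvec->nr == TOY_PAGEVEC_SIZE)
		toy_lru_add_drain(pvec);
}

/* Models the unpatched mark_page_accessed(): promotion needs the page
 * to be referenced AND already on the LRU. */
static void toy_mark_page_accessed(struct toy_page *page)
{
	if (!page->active && page->referenced && page->on_lru) {
		page->active = true;
		page->referenced = false;
	} else if (!page->referenced) {
		page->referenced = true;
	}
}

/* New page touched twice while still in the pagevec: never promoted. */
static bool toy_two_touches_without_drain(void)
{
	struct toy_pagevec pvec = { 0 };
	struct toy_page page = { 0 };

	toy_lru_cache_add(&pvec, &page);
	toy_mark_page_accessed(&page);	/* sets referenced */
	toy_mark_page_accessed(&page);	/* no effect: page is not on the LRU */
	return page.active;
}

/* Same page, but with one drain done at allocation time only. */
static bool toy_two_touches_with_drain_at_alloc(void)
{
	struct toy_pagevec pvec = { 0 };
	struct toy_page page = { 0 };

	toy_lru_cache_add(&pvec, &page);
	toy_mark_page_accessed(&page);	/* sets referenced */
	toy_lru_add_drain(&pvec);	/* puts the page on the LRU */
	toy_mark_page_accessed(&page);	/* second touch promotes the page */
	return page.active;
}
```

In this model the drain happens once per newly allocated page, and later touches of pages that are already cached never drain at all.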


> 
> Thanks.
> 
> ---8<---
> mm: pagevec: Move inactive pages to active lists even if on a pagevec
> 
> If a page is on a pagevec aimed at the inactive list then two subsequent
> calls to mark_page_accessed() will still not move it to the active list.
> This can cause a page to be reclaimed sooner than is expected. This
> patch detects if an inactive page is not on the LRU and drains the
> pagevec before promoting it.
> 
> Not-signed-off
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 8a529a0..eac64fe 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -437,7 +437,18 @@ void activate_page(struct page *page)
> void mark_page_accessed(struct page *page)
> {
> 	if (!PageActive(page) && !PageUnevictable(page) &&
> -			PageReferenced(page) && PageLRU(page)) {
> +			PageReferenced(page)) {
> +		/* Page could be in pagevec */
> +		if (!PageLRU(page))
> +			lru_add_drain();
> +
> +		/*
> +		 * Weeeee, using in_atomic() like this is a hand-grenade.
> +		 * Patch is for debugging purposes only, do not merge this.
> +		 */
> +		if (!PageLRU(page) && !in_atomic())
> +			lru_add_drain_all();
> +
> 		activate_page(page);
> 		ClearPageReferenced(page);
> 	} else if (!PageReferenced(page)) {

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html