jffs2: force the jffs2 GC daemon to behave a bit better

Message ID

20081102130449.0aa74adb@fred

State

Accepted

Headers

To: linux-mtd@lists.infradead.org
From: Andres Salomon <dilinger@queued.net>
Subject: [PATCH] jffs2: force the jffs2 GC daemon to behave a bit better
Date: Sun, 2 Nov 2008 13:04:49 -0500
Lines: 50
Message-ID: <20081102130449.0aa74adb@fred>
Mime-Version: 1.0
Cc: linux-kernel@vger.kernel.org
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: linux-mtd-bounces@lists.infradead.org
Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org

Commit Message

Andres Salomon Nov. 2, 2008, 6:04 p.m. UTC

I've noticed some pretty poor behavior on OLPC machines after bootup, when
gdm/X are starting.  The GCD monopolizes the scheduler (which in turn means
it gets to do more NAND I/O), which results in processes taking much, much
longer than they should to start.

As an example, on an OLPC machine going from OFW to a usable X (via auto-login
gdm) takes 2m 30s.  The majority of this time is consumed by the switch into
graphical mode.  With this patch, we cut a full 60s off of bootup time.  After
bootup, things are much snappier as well.

Note that we have seen a CRC node error with this patch that causes the machine
to fail to boot, but we've also seen that problem without this patch.  

Signed-off-by: Andres Salomon <dilinger@debian.org>
---
 fs/jffs2/background.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

Comments

Andrew Morton Nov. 5, 2008, 3:49 a.m. UTC | #1

On Sun, 2 Nov 2008 13:04:49 -0500 Andres Salomon <dilinger@queued.net> wrote:

> 
> I've noticed some pretty poor behavior on OLPC machines after bootup, when
> gdm/X are starting.  The GCD monopolizes the scheduler (which in turn means
> it gets to do more NAND I/O), which results in processes taking much, much
> longer than they should to start.
> 
> As an example, on an OLPC machine going from OFW to a usable X (via auto-login
> gdm) takes 2m 30s.  The majority of this time is consumed by the switch into
> graphical mode.  With this patch, we cut a full 60s off of bootup time.  After
> bootup, things are much snappier as well.
> 
> Note that we have seen a CRC node error with this patch that causes the machine
> to fail to boot, but we've also seen that problem without this patch.  
> 

Well you've observed one of the problems with yield().  The other
problem is that this thread can be starved for fantastic amounts of
time when the system is busy.

yield() is so unpredictable that basically any use of it should be
viewed as a bug.

> ---
>  fs/jffs2/background.c |   18 +++++++++++-------
>  1 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c
> index 8adebd3..f38d557 100644
> --- a/fs/jffs2/background.c
> +++ b/fs/jffs2/background.c
> @@ -95,13 +95,17 @@ static int jffs2_garbage_collect_thread(void *_c)
>  			schedule();
>  		}
>  
> -		/* This thread is purely an optimisation. But if it runs when
> -		   other things could be running, it actually makes things a
> -		   lot worse. Use yield() and put it at the back of the runqueue
> -		   every time. Especially during boot, pulling an inode in
> -		   with read_inode() is much preferable to having the GC thread
> -		   get there first. */
> -		yield();
> +		/* Problem - immediately after bootup, the GCD spends a lot
> +		 * of time in places like jffs2_kill_fragtree(); so much so
> +		 * that userspace processes (like gdm and X) are starved
> +		 * despite plenty of cond_resched()s and renicing.  Yield()
> +		 * doesn't help, either (presumably because userspace and GCD
> +		 * are generally competing for a higher latency resource -
> +		 * disk).
> +		 * This forces the GCD to slow the hell down.   Pulling an
> +		 * inode in with read_inode() is much preferable to having
> +		 * the GC thread get there first. */
> +		schedule_timeout_interruptible(msecs_to_jiffies(50));
>  
>  		/* Put_super will send a SIGKILL and then wait on the sem.
>  		 */

Yeah.  It doesn't matter much - almost any change you can make in there
will improve the current code.

David, I do think we should fix this in 2.6.28 (at least).

Perhaps this is an application for SCHED_IDLE?

Patch

diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c
index 8adebd3..f38d557 100644
--- a/fs/jffs2/background.c
+++ b/fs/jffs2/background.c
@@ -95,13 +95,17 @@  static int jffs2_garbage_collect_thread(void *_c)
 			schedule();
 		}
 
-		/* This thread is purely an optimisation. But if it runs when
-		   other things could be running, it actually makes things a
-		   lot worse. Use yield() and put it at the back of the runqueue
-		   every time. Especially during boot, pulling an inode in
-		   with read_inode() is much preferable to having the GC thread
-		   get there first. */
-		yield();
+		/* Problem - immediately after bootup, the GCD spends a lot
+		 * of time in places like jffs2_kill_fragtree(); so much so
+		 * that userspace processes (like gdm and X) are starved
+		 * despite plenty of cond_resched()s and renicing.  Yield()
+		 * doesn't help, either (presumably because userspace and GCD
+		 * are generally competing for a higher latency resource -
+		 * disk).
+		 * This forces the GCD to slow the hell down.   Pulling an
+		 * inode in with read_inode() is much preferable to having
+		 * the GC thread get there first. */
+		schedule_timeout_interruptible(msecs_to_jiffies(50));
 
 		/* Put_super will send a SIGKILL and then wait on the sem.
 		 */
