
mtd: nand: fix shutdown/reboot for multi-chip systems

Message ID 1447115848-92621-1-git-send-email-computersforpeace@gmail.com
State Accepted

Commit Message

Brian Norris Nov. 10, 2015, 12:37 a.m. UTC
If multiple NAND chips are registered to the same controller, then when
rebooting the system, the first one will grab the controller lock, while
the second will wait forever for the first one to release it, i.e., a
classic deadlock.

This problem was solved for a similar case (suspend/resume) back in
commit 6b0d9a841249 ("mtd: nand: fix multi-chip suspend problem"), and
the shutdown state really isn't much different for us, so rather than
adding a new special case to nand_get_device(), we can just overload the
FL_PM_SUSPENDED state.

Now, multiple chips can "get" the same controller lock (preventing
further I/O), while we still allow other chips to pass through
nand_shutdown().

Original report:
http://thread.gmane.org/gmane.linux.drivers.mtd/59726
http://lists.infradead.org/pipermail/linux-mtd/2015-July/059992.html

Fixes: 72ea403669c7 ("mtd: nand: added nand_shutdown")
Reported-by: Andrew E. Mileski <andrewm@isoar.ca>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: Andrew E. Mileski <andrewm@isoar.ca>
---
I only compile-tested

If we get proper tests, this is probably 4.4 material

 drivers/mtd/nand/nand_base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Scott Branden Nov. 13, 2015, 11:49 p.m. UTC | #1
On 15-11-09 04:37 PM, Brian Norris wrote:
> If multiple NAND chips are registered to the same controller, then when
> rebooting the system, the first one will grab the controller lock, while
> the second will wait forever for the first one to release it. i.e., a
> classic deadlock.
>
> This problem was solved for a similar case (suspend/resume) back in
> commit 6b0d9a841249 ("mtd: nand: fix multi-chip suspend problem"), and
> the shutdown state really isn't much different for us, so rather than
> adding a new special case to nand_get_device(), we can just overload the
> FL_PM_SUSPENDED state.
>
> Now, multiple chips can "get" the same controller lock (preventing
> further I/O), while we still allow other chips to pass through
> nand_shutdown().
>
> Original report:
> http://thread.gmane.org/gmane.linux.drivers.mtd/59726
> http://lists.infradead.org/pipermail/linux-mtd/2015-July/059992.html
>
> Fixes: 72ea403669c7 ("mtd: nand: added nand_shutdown")
> Reported-by: Andrew E. Mileski <andrewm@isoar.ca>
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> Cc: Scott Branden <sbranden@broadcom.com>
> Cc: Andrew E. Mileski <andrewm@isoar.ca>
> ---
> I only compile-tested
>
> If we get proper tests, this is probably 4.4 material

I reviewed the code in nand_get_device() and it looks sane whether
FL_SHUTDOWN or FL_PM_SUSPENDED is passed.

Acked-by: Scott Branden <sbranden@broadcom.com>

>
>   drivers/mtd/nand/nand_base.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index cc74142938b0..ece544efccc3 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3110,7 +3110,7 @@ static void nand_resume(struct mtd_info *mtd)
>    */
>   static void nand_shutdown(struct mtd_info *mtd)
>   {
> -	nand_get_device(mtd, FL_SHUTDOWN);
> +	nand_get_device(mtd, FL_PM_SUSPENDED);
>   }
>
>   /* Set default functions */
>
Boris Brezillon Nov. 16, 2015, 10:09 a.m. UTC | #2
On Mon,  9 Nov 2015 16:37:28 -0800
Brian Norris <computersforpeace@gmail.com> wrote:

> If multiple NAND chips are registered to the same controller, then when
> rebooting the system, the first one will grab the controller lock, while
> the second will wait forever for the first one to release it. i.e., a
> classic deadlock.
> 
> This problem was solved for a similar case (suspend/resume) back in
> commit 6b0d9a841249 ("mtd: nand: fix multi-chip suspend problem"), and
> the shutdown state really isn't much different for us, so rather than
> adding a new special case to nand_get_device(), we can just overload the
> FL_PM_SUSPENDED state.
> 
> Now, multiple chips can "get" the same controller lock (preventing
> further I/O), while we still allow other chips to pass through
> nand_shutdown().
> 
> Original report:
> http://thread.gmane.org/gmane.linux.drivers.mtd/59726
> http://lists.infradead.org/pipermail/linux-mtd/2015-July/059992.html
> 
> Fixes: 72ea403669c7 ("mtd: nand: added nand_shutdown")
> Reported-by: Andrew E. Mileski <andrewm@isoar.ca>
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> Cc: Scott Branden <sbranden@broadcom.com>
> Cc: Andrew E. Mileski <andrewm@isoar.ca>

Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>

> ---
> I only compile-tested
> 
> If we get proper tests, this is probably 4.4 material
> 
>  drivers/mtd/nand/nand_base.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index cc74142938b0..ece544efccc3 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3110,7 +3110,7 @@ static void nand_resume(struct mtd_info *mtd)
>   */
>  static void nand_shutdown(struct mtd_info *mtd)
>  {
> -	nand_get_device(mtd, FL_SHUTDOWN);
> +	nand_get_device(mtd, FL_PM_SUSPENDED);
>  }
>  
>  /* Set default functions */
Brian Norris Nov. 16, 2015, 6:52 p.m. UTC | #3
On Mon, Nov 09, 2015 at 04:37:28PM -0800, Brian Norris wrote:
> If multiple NAND chips are registered to the same controller, then when
> rebooting the system, the first one will grab the controller lock, while
> the second will wait forever for the first one to release it. i.e., a
> classic deadlock.
> 
> This problem was solved for a similar case (suspend/resume) back in
> commit 6b0d9a841249 ("mtd: nand: fix multi-chip suspend problem"), and
> the shutdown state really isn't much different for us, so rather than
> adding a new special case to nand_get_device(), we can just overload the
> FL_PM_SUSPENDED state.
> 
> Now, multiple chips can "get" the same controller lock (preventing
> further I/O), while we still allow other chips to pass through
> nand_shutdown().
> 
> Original report:
> http://thread.gmane.org/gmane.linux.drivers.mtd/59726
> http://lists.infradead.org/pipermail/linux-mtd/2015-July/059992.html
> 
> Fixes: 72ea403669c7 ("mtd: nand: added nand_shutdown")
> Reported-by: Andrew E. Mileski <andrewm@isoar.ca>
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> Cc: Scott Branden <sbranden@broadcom.com>
> Cc: Andrew E. Mileski <andrewm@isoar.ca>

Pushed to linux-mtd.git

Patch

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index cc74142938b0..ece544efccc3 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -3110,7 +3110,7 @@  static void nand_resume(struct mtd_info *mtd)
  */
 static void nand_shutdown(struct mtd_info *mtd)
 {
-	nand_get_device(mtd, FL_SHUTDOWN);
+	nand_get_device(mtd, FL_PM_SUSPENDED);
 }
 
 /* Set default functions */