mtd: spinand: Wait after erase in spinand_markbad

Message ID	20181221115831.9595-1-emil.lenngren@gmail.com
State	Changes Requested
Delegated to:	Miquel Raynal
Headers	show Return-Path: <linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org> From: Emil Lenngren <emil.lenngren@gmail.com> To: linux-mtd@lists.infradead.org Subject: [PATCH] mtd: spinand: Wait after erase in spinand_markbad Date: Fri, 21 Dec 2018 12:58:14 +0100 Message-Id: <20181221115831.9595-1-emil.lenngren@gmail.com> summary: Content analysis details: (-0.2 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [2a00:1450:4864:20:0:0:0:141 listed in] [list.dnswl.org] 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (emil.lenngren[at]gmail.com) -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain Precedence: list Cc: Marek Vasut <marek.vasut@gmail.com>, Richard Weinberger <richard@nod.at>, Emil Lenngren <emil.lenngren@gmail.com>, Boris Brezillon <boris.brezillon@bootlin.com>, Miquel Raynal <miquel.raynal@bootlin.com>, Brian Norris <computersforpeace@gmail.com>, David Woodhouse <dwmw2@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-mtd" <linux-mtd-bounces@lists.infradead.org> Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org
Series	mtd: spinand: Wait after erase in spinand_markbad \| expand mtd: spinand: Wait after erase in spinand_markbad

Emil Lenngren Dec. 21, 2018, 11:58 a.m. UTC

SPI NAND flashes don't accept new commands while an erase is ongoing.
Make sure to wait until the device is ready before writing the marker.

Just as with the erase op, no error check is performed since we want
to continue writing the marker even if the erase fails.

Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
---
 drivers/mtd/nand/spi/core.c | 2 ++
 1 file changed, 2 insertions(+)

Boris Brezillon Jan. 5, 2019, 1:58 p.m. UTC | #1

On Fri, 21 Dec 2018 12:58:14 +0100
Emil Lenngren <emil.lenngren@gmail.com> wrote:

> SPI NAND flashes don't accept new commands while an erase is ongoing.
> Make sure to wait until the device is ready before writing the marker.
> 
> Just as with the erase op, no error check is performed since we want
> to continue writing the marker even if the erase fails.
> 
> Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
> ---
>  drivers/mtd/nand/spi/core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> index 479c2f2cf1..c2724d34e6 100644
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -685,6 +685,8 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
>  
>  	spinand_erase_op(spinand, pos);
>  
> +	spinand_wait(spinand, NULL);
> +

After thinking a bit more about it, I think we should simply write the
BBM and skip the erase operation. Marking a block bad is just about
writing 0 to the first 2 bytes of the OOB area, and we don't need the
block to be erased to do that.

>  	memset(spinand->oobbuf, 0, 2);
>  	return spinand_write_page(spinand, &req);
>  }

Emil Lenngren Jan. 6, 2019, 6:01 a.m. UTC | #2

Hi,

Den lör 5 jan. 2019 kl 05:59 skrev Boris Brezillon <bbrezillon@kernel.org>:
>
> On Fri, 21 Dec 2018 12:58:14 +0100
> Emil Lenngren <emil.lenngren@gmail.com> wrote:
>
> > SPI NAND flashes don't accept new commands while an erase is ongoing.
> > Make sure to wait until the device is ready before writing the marker.
> >
> > Just as with the erase op, no error check is performed since we want
> > to continue writing the marker even if the erase fails.
> >
> > Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
> > ---
> >  drivers/mtd/nand/spi/core.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> > index 479c2f2cf1..c2724d34e6 100644
> > --- a/drivers/mtd/nand/spi/core.c
> > +++ b/drivers/mtd/nand/spi/core.c
> > @@ -685,6 +685,8 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
> >
> >       spinand_erase_op(spinand, pos);
> >
> > +     spinand_wait(spinand, NULL);
> > +
>
> After thinking a bit more about it, I think we should simply write the
> BBM and skip the erase operation. Marking a block bad is just about
> writing 0 to the first 2 bytes of the OOB area, and we don't need the
> block to be erased to do that.
>

I compared with the raw and implementation in
nand_block_markbad_lowlevel, it also erases first, ignoring a
potential error.

On the other hand, a common spi flash chip MX35LF1GE4AB states in the
datasheet that it's not recommended to erase a bad block, but no
reason why. At the same time, it's generally disallowed to write the
same page twice...

But in the end I also think the best way is to avoid the erase
operation and simply write 0 0 as a raw write.

> >       memset(spinand->oobbuf, 0, 2);
> >       return spinand_write_page(spinand, &req);
> >  }
>

/Emil

Boris Brezillon Jan. 6, 2019, 8:10 a.m. UTC | #3

On Sat, 5 Jan 2019 22:01:44 -0800
Emil Lenngren <emil.lenngren@gmail.com> wrote:

> Hi,
> 
> Den lör 5 jan. 2019 kl 05:59 skrev Boris Brezillon <bbrezillon@kernel.org>:
> >
> > On Fri, 21 Dec 2018 12:58:14 +0100
> > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> >  
> > > SPI NAND flashes don't accept new commands while an erase is ongoing.
> > > Make sure to wait until the device is ready before writing the marker.
> > >
> > > Just as with the erase op, no error check is performed since we want
> > > to continue writing the marker even if the erase fails.
> > >
> > > Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
> > > ---
> > >  drivers/mtd/nand/spi/core.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> > > index 479c2f2cf1..c2724d34e6 100644
> > > --- a/drivers/mtd/nand/spi/core.c
> > > +++ b/drivers/mtd/nand/spi/core.c
> > > @@ -685,6 +685,8 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
> > >
> > >       spinand_erase_op(spinand, pos);
> > >
> > > +     spinand_wait(spinand, NULL);
> > > +  
> >
> > After thinking a bit more about it, I think we should simply write the
> > BBM and skip the erase operation. Marking a block bad is just about
> > writing 0 to the first 2 bytes of the OOB area, and we don't need the
> > block to be erased to do that.
> >  
> 
> I compared with the raw and implementation in
> nand_block_markbad_lowlevel, it also erases first, ignoring a
> potential error.

Yes, and I think this implementation was inspired by the rawnand one,
but I'm not sure the rawnand implem is correct.

> 
> On the other hand, a common spi flash chip MX35LF1GE4AB states in the
> datasheet that it's not recommended to erase a bad block, but no
> reason why.

Because the erase might succeed and reset the BBM to 0xff, thus marking
the block good even if it's unreliable.

> At the same time, it's generally disallowed to write the
> same page twice...

That's only if you care about the data you write to the page. Marking a
block bad is just about setting the BBM to 0x0, which should always work
even if the page you're writing to has already been written, simply
because a 1 -> 0 cell transition does not require an erase (only a 0 ->
1 transition does).

> 
> But in the end I also think the best way is to avoid the erase
> operation and simply write 0 0 as a raw write.

I don't know why the erase op was added to
nand_block_markbad_lowlevel() in the first place but I don't want to
risk breaking platforms that might depend on this behavior. It's
different for SPI NAND: the subsystem has just been created and I think
we should get rid of this erase call until someone explicitly asks for
it with a good explanation on why this is needed.

> 
> > >       memset(spinand->oobbuf, 0, 2);
> > >       return spinand_write_page(spinand, &req);
> > >  }  
> >  
> 
> /Emil

Emil Lenngren Feb. 18, 2019, 11:27 a.m. UTC | #4

Hi,

Den sön 6 jan. 2019 kl 09:11 skrev Boris Brezillon <bbrezillon@kernel.org>:
>
> On Sat, 5 Jan 2019 22:01:44 -0800
> Emil Lenngren <emil.lenngren@gmail.com> wrote:
>
> > Hi,
> >
> > Den lör 5 jan. 2019 kl 05:59 skrev Boris Brezillon <bbrezillon@kernel.org>:
> > >
> > > On Fri, 21 Dec 2018 12:58:14 +0100
> > > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> > >
> > > > SPI NAND flashes don't accept new commands while an erase is ongoing.
> > > > Make sure to wait until the device is ready before writing the marker.
> > > >
> > > > Just as with the erase op, no error check is performed since we want
> > > > to continue writing the marker even if the erase fails.
> > > >
> > > > Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
> > > > ---
> > > >  drivers/mtd/nand/spi/core.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> > > > index 479c2f2cf1..c2724d34e6 100644
> > > > --- a/drivers/mtd/nand/spi/core.c
> > > > +++ b/drivers/mtd/nand/spi/core.c
> > > > @@ -685,6 +685,8 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
> > > >
> > > >       spinand_erase_op(spinand, pos);
> > > >
> > > > +     spinand_wait(spinand, NULL);
> > > > +
> > >
> > > After thinking a bit more about it, I think we should simply write the
> > > BBM and skip the erase operation. Marking a block bad is just about
> > > writing 0 to the first 2 bytes of the OOB area, and we don't need the
> > > block to be erased to do that.
> > >
> >
> > I compared with the raw and implementation in
> > nand_block_markbad_lowlevel, it also erases first, ignoring a
> > potential error.
>
> Yes, and I think this implementation was inspired by the rawnand one,
> but I'm not sure the rawnand implem is correct.
>
> >
> > On the other hand, a common spi flash chip MX35LF1GE4AB states in the
> > datasheet that it's not recommended to erase a bad block, but no
> > reason why.
>
> Because the erase might succeed and reset the BBM to 0xff, thus marking
> the block good even if it's unreliable.
>
> > At the same time, it's generally disallowed to write the
> > same page twice...
>
> That's only if you care about the data you write to the page. Marking a
> block bad is just about setting the BBM to 0x0, which should always work
> even if the page you're writing to has already been written, simply
> because a 1 -> 0 cell transition does not require an erase (only a 0 ->
> 1 transition does).

Should the BBM be written with or without ECC? Now it uses whatever
mode was used in the last operation. Also for reading, I see the
"spinand_isbad" function sets .mode to MTD_OPS_RAW but that field
doesn't seem to be inspected, again using the same ECC mode as in the
last operation.
Isn't it best to use the non-ECC mode for both reading and writing the
BBM? If we would use ECC mode and overwrite the page when writing the
BBM marker, I guess the controller needs to store a second error
correction code (overwriting the previous one), which will probably
lead to ECC failure if later read in ECC-mode (and who knows if it
will "correct" the zero bits to ones...).

/Emil

Boris Brezillon Feb. 20, 2019, 7:59 a.m. UTC | #5

On Mon, 18 Feb 2019 12:27:25 +0100
Emil Lenngren <emil.lenngren@gmail.com> wrote:
  
> > > At the same time, it's generally disallowed to write the
> > > same page twice...  
> >
> > That's only if you care about the data you write to the page. Marking a
> > block bad is just about setting the BBM to 0x0, which should always work
> > even if the page you're writing to has already been written, simply
> > because a 1 -> 0 cell transition does not require an erase (only a 0 ->
> > 1 transition does).  
> 
> Should the BBM be written with or without ECC? Now it uses whatever
> mode was used in the last operation. Also for reading, I see the
> "spinand_isbad" function sets .mode to MTD_OPS_RAW but that field
> doesn't seem to be inspected, again using the same ECC mode as in the
> last operation.
> Isn't it best to use the non-ECC mode for both reading and writing the
> BBM?

We should definitely disable the ECC engine when reading/writing the
BBM.

> If we would use ECC mode and overwrite the page when writing the
> BBM marker, I guess the controller needs to store a second error
> correction code (overwriting the previous one), which will probably
> lead to ECC failure if later read in ECC-mode (and who knows if it
> will "correct" the zero bits to ones...).

Well, you could just ignore the ECC result, but that's better to simply
read the BBM in raw mode.

Miquel Raynal March 4, 2019, 11:23 a.m. UTC | #6

Hi Emil,

Emil Lenngren <emil.lenngren@gmail.com> wrote on Mon, 18 Feb 2019
12:27:25 +0100:

> Hi,
> 
> Den sön 6 jan. 2019 kl 09:11 skrev Boris Brezillon <bbrezillon@kernel.org>:
> >
> > On Sat, 5 Jan 2019 22:01:44 -0800
> > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> >  
> > > Hi,
> > >
> > > Den lör 5 jan. 2019 kl 05:59 skrev Boris Brezillon <bbrezillon@kernel.org>:  
> > > >
> > > > On Fri, 21 Dec 2018 12:58:14 +0100
> > > > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> > > >  
> > > > > SPI NAND flashes don't accept new commands while an erase is ongoing.
> > > > > Make sure to wait until the device is ready before writing the marker.
> > > > >
> > > > > Just as with the erase op, no error check is performed since we want
> > > > > to continue writing the marker even if the erase fails.
> > > > >
> > > > > Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
> > > > > ---
> > > > >  drivers/mtd/nand/spi/core.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> > > > > index 479c2f2cf1..c2724d34e6 100644
> > > > > --- a/drivers/mtd/nand/spi/core.c
> > > > > +++ b/drivers/mtd/nand/spi/core.c
> > > > > @@ -685,6 +685,8 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
> > > > >
> > > > >       spinand_erase_op(spinand, pos);
> > > > >
> > > > > +     spinand_wait(spinand, NULL);
> > > > > +  
> > > >
> > > > After thinking a bit more about it, I think we should simply write the
> > > > BBM and skip the erase operation. Marking a block bad is just about
> > > > writing 0 to the first 2 bytes of the OOB area, and we don't need the
> > > > block to be erased to do that.
> > > >  
> > >
> > > I compared with the raw and implementation in
> > > nand_block_markbad_lowlevel, it also erases first, ignoring a
> > > potential error.  
> >
> > Yes, and I think this implementation was inspired by the rawnand one,
> > but I'm not sure the rawnand implem is correct.
> >  
> > >
> > > On the other hand, a common spi flash chip MX35LF1GE4AB states in the
> > > datasheet that it's not recommended to erase a bad block, but no
> > > reason why.  
> >
> > Because the erase might succeed and reset the BBM to 0xff, thus marking
> > the block good even if it's unreliable.
> >  
> > > At the same time, it's generally disallowed to write the
> > > same page twice...  
> >
> > That's only if you care about the data you write to the page. Marking a
> > block bad is just about setting the BBM to 0x0, which should always work
> > even if the page you're writing to has already been written, simply
> > because a 1 -> 0 cell transition does not require an erase (only a 0 ->
> > 1 transition does).  
> 
> Should the BBM be written with or without ECC? Now it uses whatever
> mode was used in the last operation. Also for reading, I see the
> "spinand_isbad" function sets .mode to MTD_OPS_RAW but that field
> doesn't seem to be inspected, again using the same ECC mode as in the
> last operation.
> Isn't it best to use the non-ECC mode for both reading and writing the
> BBM? If we would use ECC mode and overwrite the page when writing the
> BBM marker, I guess the controller needs to store a second error
> correction code (overwriting the previous one), which will probably
> lead to ECC failure if later read in ECC-mode (and who knows if it
> will "correct" the zero bits to ones...).
> 

I suppose writing the BBM without correction is what we want.

Thanks,
Miquèl

Emil Lenngren March 4, 2019, 11:55 a.m. UTC | #7

Hi,

Den mån 4 mars 2019 kl 12:23 skrev Miquel Raynal <miquel.raynal@bootlin.com>:
>
> Hi Emil,
>
> Emil Lenngren <emil.lenngren@gmail.com> wrote on Mon, 18 Feb 2019
> 12:27:25 +0100:
>
> > Hi,
> >
> > Den sön 6 jan. 2019 kl 09:11 skrev Boris Brezillon <bbrezillon@kernel.org>:
> > >
> > > On Sat, 5 Jan 2019 22:01:44 -0800
> > > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > Den lör 5 jan. 2019 kl 05:59 skrev Boris Brezillon <bbrezillon@kernel.org>:
> > > > >
> > > > > On Fri, 21 Dec 2018 12:58:14 +0100
> > > > > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> > > > >
> > > > > > SPI NAND flashes don't accept new commands while an erase is ongoing.
> > > > > > Make sure to wait until the device is ready before writing the marker.
> > > > > >
> > > > > > Just as with the erase op, no error check is performed since we want
> > > > > > to continue writing the marker even if the erase fails.
> > > > > >
> > > > > > Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
> > > > > > ---
> > > > > >  drivers/mtd/nand/spi/core.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> > > > > > index 479c2f2cf1..c2724d34e6 100644
> > > > > > --- a/drivers/mtd/nand/spi/core.c
> > > > > > +++ b/drivers/mtd/nand/spi/core.c
> > > > > > @@ -685,6 +685,8 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
> > > > > >
> > > > > >       spinand_erase_op(spinand, pos);
> > > > > >
> > > > > > +     spinand_wait(spinand, NULL);
> > > > > > +
> > > > >
> > > > > After thinking a bit more about it, I think we should simply write the
> > > > > BBM and skip the erase operation. Marking a block bad is just about
> > > > > writing 0 to the first 2 bytes of the OOB area, and we don't need the
> > > > > block to be erased to do that.
> > > > >
> > > >
> > > > I compared with the raw and implementation in
> > > > nand_block_markbad_lowlevel, it also erases first, ignoring a
> > > > potential error.
> > >
> > > Yes, and I think this implementation was inspired by the rawnand one,
> > > but I'm not sure the rawnand implem is correct.
> > >
> > > >
> > > > On the other hand, a common spi flash chip MX35LF1GE4AB states in the
> > > > datasheet that it's not recommended to erase a bad block, but no
> > > > reason why.
> > >
> > > Because the erase might succeed and reset the BBM to 0xff, thus marking
> > > the block good even if it's unreliable.
> > >
> > > > At the same time, it's generally disallowed to write the
> > > > same page twice...
> > >
> > > That's only if you care about the data you write to the page. Marking a
> > > block bad is just about setting the BBM to 0x0, which should always work
> > > even if the page you're writing to has already been written, simply
> > > because a 1 -> 0 cell transition does not require an erase (only a 0 ->
> > > 1 transition does).
> >
> > Should the BBM be written with or without ECC? Now it uses whatever
> > mode was used in the last operation. Also for reading, I see the
> > "spinand_isbad" function sets .mode to MTD_OPS_RAW but that field
> > doesn't seem to be inspected, again using the same ECC mode as in the
> > last operation.
> > Isn't it best to use the non-ECC mode for both reading and writing the
> > BBM? If we would use ECC mode and overwrite the page when writing the
> > BBM marker, I guess the controller needs to store a second error
> > correction code (overwriting the previous one), which will probably
> > lead to ECC failure if later read in ECC-mode (and who knows if it
> > will "correct" the zero bits to ones...).
> >
>
> I suppose writing the BBM without correction is what we want.

I was also thinking, is it a good or bad idea to move the BBM
writing/reading to NAND core? I mean, as I understand, both parallel
nand and spi nand use the same logic to handle the BBM (at the
beginning of the OOB area). Or is there some specific treatment that
must be done individually for spi and parallel flash chips?

After reading the BBM info of some datasheets, it appears the variance
lies between different manufacturers, not the technology used
(parallel or spi). For example, Toshiba fills a whole bad block with
00h and instructs to read any byte in any page and check if it
contains 00h. Macronix on the other hand instructs to check the first
byte of the OOB area in the first and second page to see they contain
00h. SmartMedia instructs to check the 6th byte of the OOB area of the
first page and see if it contains at least two zero bits.

/Emil

Miquel Raynal March 4, 2019, 12:50 p.m. UTC | #8

Hi Emil,

Emil Lenngren <emil.lenngren@gmail.com> wrote on Mon, 4 Mar 2019
12:55:30 +0100:

> Hi,
> 
> Den mån 4 mars 2019 kl 12:23 skrev Miquel Raynal <miquel.raynal@bootlin.com>:
> >
> > Hi Emil,
> >
> > Emil Lenngren <emil.lenngren@gmail.com> wrote on Mon, 18 Feb 2019
> > 12:27:25 +0100:
> >  
> > > Hi,
> > >
> > > Den sön 6 jan. 2019 kl 09:11 skrev Boris Brezillon <bbrezillon@kernel.org>:  
> > > >
> > > > On Sat, 5 Jan 2019 22:01:44 -0800
> > > > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> > > >  
> > > > > Hi,
> > > > >
> > > > > Den lör 5 jan. 2019 kl 05:59 skrev Boris Brezillon <bbrezillon@kernel.org>:  
> > > > > >
> > > > > > On Fri, 21 Dec 2018 12:58:14 +0100
> > > > > > Emil Lenngren <emil.lenngren@gmail.com> wrote:
> > > > > >  
> > > > > > > SPI NAND flashes don't accept new commands while an erase is ongoing.
> > > > > > > Make sure to wait until the device is ready before writing the marker.
> > > > > > >
> > > > > > > Just as with the erase op, no error check is performed since we want
> > > > > > > to continue writing the marker even if the erase fails.
> > > > > > >
> > > > > > > Signed-off-by: Emil Lenngren <emil.lenngren@gmail.com>
> > > > > > > ---
> > > > > > >  drivers/mtd/nand/spi/core.c | 2 ++
> > > > > > >  1 file changed, 2 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> > > > > > > index 479c2f2cf1..c2724d34e6 100644
> > > > > > > --- a/drivers/mtd/nand/spi/core.c
> > > > > > > +++ b/drivers/mtd/nand/spi/core.c
> > > > > > > @@ -685,6 +685,8 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
> > > > > > >
> > > > > > >       spinand_erase_op(spinand, pos);
> > > > > > >
> > > > > > > +     spinand_wait(spinand, NULL);
> > > > > > > +  
> > > > > >
> > > > > > After thinking a bit more about it, I think we should simply write the
> > > > > > BBM and skip the erase operation. Marking a block bad is just about
> > > > > > writing 0 to the first 2 bytes of the OOB area, and we don't need the
> > > > > > block to be erased to do that.
> > > > > >  
> > > > >
> > > > > I compared with the raw and implementation in
> > > > > nand_block_markbad_lowlevel, it also erases first, ignoring a
> > > > > potential error.  
> > > >
> > > > Yes, and I think this implementation was inspired by the rawnand one,
> > > > but I'm not sure the rawnand implem is correct.
> > > >  
> > > > >
> > > > > On the other hand, a common spi flash chip MX35LF1GE4AB states in the
> > > > > datasheet that it's not recommended to erase a bad block, but no
> > > > > reason why.  
> > > >
> > > > Because the erase might succeed and reset the BBM to 0xff, thus marking
> > > > the block good even if it's unreliable.
> > > >  
> > > > > At the same time, it's generally disallowed to write the
> > > > > same page twice...  
> > > >
> > > > That's only if you care about the data you write to the page. Marking a
> > > > block bad is just about setting the BBM to 0x0, which should always work
> > > > even if the page you're writing to has already been written, simply
> > > > because a 1 -> 0 cell transition does not require an erase (only a 0 ->
> > > > 1 transition does).  
> > >
> > > Should the BBM be written with or without ECC? Now it uses whatever
> > > mode was used in the last operation. Also for reading, I see the
> > > "spinand_isbad" function sets .mode to MTD_OPS_RAW but that field
> > > doesn't seem to be inspected, again using the same ECC mode as in the
> > > last operation.
> > > Isn't it best to use the non-ECC mode for both reading and writing the
> > > BBM? If we would use ECC mode and overwrite the page when writing the
> > > BBM marker, I guess the controller needs to store a second error
> > > correction code (overwriting the previous one), which will probably
> > > lead to ECC failure if later read in ECC-mode (and who knows if it
> > > will "correct" the zero bits to ones...).
> > >  
> >
> > I suppose writing the BBM without correction is what we want.  
> 
> I was also thinking, is it a good or bad idea to move the BBM
> writing/reading to NAND core? I mean, as I understand, both parallel
> nand and spi nand use the same logic to handle the BBM (at the
> beginning of the OOB area). Or is there some specific treatment that
> must be done individually for spi and parallel flash chips?

I don't think there is a difference. Moving it to the NAND core is
probably a good choice. However, as far as I remember, there is no BBT
support in the SPI-NAND core yet.

> 
> After reading the BBM info of some datasheets, it appears the variance
> lies between different manufacturers, not the technology used
> (parallel or spi). For example, Toshiba fills a whole bad block with
> 00h and instructs to read any byte in any page and check if it
> contains 00h. Macronix on the other hand instructs to check the first
> byte of the OOB area in the first and second page to see they contain
> 00h. SmartMedia instructs to check the 6th byte of the OOB area of the
> first page and see if it contains at least two zero bits.

I think the BBM was supposed to be at the 6th bytes of the OOB area for
small page NAND devices (probably the case of the SmartMedia datasheet
you read), otherwise it is supposed to be located at the 1st (and 2nd?)
byte(s) which would work for both Toshiba and Macronix parts, for
instance.


Thanks,
Miquèl

mtd: spinand: Wait after erase in spinand_markbad

Commit Message

Comments

Patch