Message ID | 20230717194221.229778-2-miquel.raynal@bootlin.com |
---|---|
State | Accepted |
Series | [1/3] mtd: rawnand: marvell: Ensure program page operations are successful |
Hi Michal,

miquel.raynal@bootlin.com wrote on Mon, 17 Jul 2023 21:42:20 +0200:

> The NAND core complies with the ONFI specification, which itself
> mentions that after any program or erase operation, a status check
> should be performed to see whether the operation was finished *and*
> successful.
>
> The NAND core offers helpers to finish a page write (sending the
> "PAGE PROG" command, waiting for the NAND chip to be ready again, and
> checking the operation status). But in some cases, advanced controller
> drivers might want to optimize this and craft their own page write
> helper to leverage additional hardware capabilities, thus not always
> using the core facilities.
>
> Some drivers, like this one, do not use the core helper to finish a page
> write because the final cycles are automatically managed by the
> hardware. In this case, the additional care must be taken to manually
> perform the final status check.
>
> Let's read the NAND chip status at the end of the page write helper and
> return -EIO upon error.
>
> Cc: Michal Simek <michal.simek@amd.com>
> Cc: stable@vger.kernel.org
> Fixes: 88ffef1b65cf ("mtd: rawnand: arasan: Support the hardware BCH ECC engine")
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
>
> ---
>
> Hello Michal,
>
> I have not tested this, but based on a report on another driver, I
> believe the status check is also missing here and could sometimes
> lead to unnoticed partial writes.
>
> Please test on your side that everything still works and let me
> know how it goes.

Any news from the testing team about patches 2/3 and 3/3?

Thanks,
Miquèl
Hi Miquel,

On 9/11/23 17:52, Miquel Raynal wrote:
> [...]
> Any news from the testing team about patches 2/3 and 3/3?

I asked Amit to test and he didn't get back to me, even though I asked for it a couple of times.
Can you please tell me how to test it? I will set up the HW myself, test it and get back to you.

Thanks,
Michal
Hi Michal,

michal.simek@amd.com wrote on Tue, 12 Sep 2023 15:55:23 +0200:

> [...]
> I asked Amit to test and he didn't get back to me, even though I asked for it a couple of times.

Ok.

> Can you please tell me how to test it? I will set up the HW myself, test it and get back to you.

I believe setting up the board to use the hardware BCH engine and
performing basic erase/write/read testing with a known file and check
it still behaves correctly would work. You can also run

	nandbiterrs -i /dev/mtdx

as a second step and verify there is no difference with and without the
patch and finally check the impact:

	flash_speed -d -c 10 /dev/mtdx
	(be careful: this is a destructive operation)

Thanks,
Miquèl
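For the erase/write/read step, one way to decide whether the read-back "still behaves correctly" is a byte-for-byte comparison against the known file. The following is only a minimal sketch: it assumes both images are already in memory, however they were obtained, and `first_bad_page` is a hypothetical helper name, not part of mtd-utils or the kernel.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from mtd-utils): compare the image that was
 * written with the image that was read back.
 *
 * Returns the index of the first page whose content differs, or -1 if
 * the two buffers are identical over 'len' bytes. 'page_size' must be
 * non-zero (2048 for the chip shown later in this thread).
 */
long first_bad_page(const unsigned char *expected,
		    const unsigned char *readback,
		    size_t len, size_t page_size)
{
	for (size_t off = 0; off < len; off++) {
		if (expected[off] != readback[off])
			return (long)(off / page_size);
	}

	return -1;
}
```

Reporting a page index rather than a byte offset is a design choice of this sketch: a program failure affects a whole page, so the page number is what you would look up afterwards.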
Hi Miquel,

On 9/12/23 16:17, Miquel Raynal wrote:
> [...]
> I believe setting up the board to use the hardware BCH engine and
> performing basic erase/write/read testing with a known file and check
> it still behaves correctly would work. You can also run
>
> 	nandbiterrs -i /dev/mtdx
>
> as a second step and verify there is no difference with and without the
> patch and finally check the impact:
>
> 	flash_speed -d -c 10 /dev/mtdx
> 	(be careful: this is a destructive operation)

I ran this myself. Here is the pl353 test log before the patch:

# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 10000000 00020000 "pl35x-nand-controller"
# nandbiterrs -i /dev/mtd0
incremental biterrors test
Successfully corrected 0 bit errors per subpage
Inserted biterror @ 0/5
Read reported 1 corrected bit errors
Successfully corrected 1 bit errors per subpage
Inserted biterror @ 0/2
Failed to recover 1 bitflips
Read error after 2 bit errors per page
# flash_speed -d -c 10 /dev/mtd0
scanning for bad eraseblocks
scanned 10 eraseblocks, 0 are bad
testing eraseblock write speed
eraseblock write speed is 4555 KiB/s
testing eraseblock read speed
eraseblock read speed is 5765 KiB/s
testing page write speed
page write speed is 4383 KiB/s
testing page read speed
page read speed is 5614 KiB/s
testing 2 page write speed
2 page write speed is 4444 KiB/s
testing 2 page read speed
2 page read speed is 5688 KiB/s
Testing erase speed
erase speed is 320000 KiB/s
Testing 2x multi-block erase speed
2x multi-block erase speed is 320000 KiB/s
Testing 4x multi-block erase speed
4x multi-block erase speed is 320000 KiB/s
Testing 8x multi-block erase speed
8x multi-block erase speed is 320000 KiB/s
Testing 16x multi-block erase speed
16x multi-block erase speed is 320000 KiB/s
Testing 32x multi-block erase speed
32x multi-block erase speed is 320000 KiB/s
Testing 64x multi-block erase speed
64x multi-block erase speed is 320000 KiB/s
finished
# dmesg | grep nand
[    2.876719] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
[    2.883130] nand: Micron MT29F2G08ABAEAWP
[    2.887230] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
#

With the patch applied:

# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 10000000 00020000 "pl35x-nand-controller"
# nandbiterrs -i /dev/mtd0
incremental biterrors test
Successfully corrected 0 bit errors per subpage
Inserted biterror @ 0/5
Read reported 1 corrected bit errors
Successfully corrected 1 bit errors per subpage
Inserted biterror @ 0/2
Failed to recover 1 bitflips
Read error after 2 bit errors per page
# flash_speed -d -c 10 /dev/mtd0
scanning for bad eraseblocks
scanned 10 eraseblocks, 0 are bad
testing eraseblock write speed
eraseblock write speed is 4522 KiB/s
testing eraseblock read speed
eraseblock read speed is 5765 KiB/s
testing page write speed
page write speed is 4383 KiB/s
testing page read speed
page read speed is 5638 KiB/s
testing 2 page write speed
2 page write speed is 4444 KiB/s
testing 2 page read speed
2 page read speed is 5714 KiB/s
Testing erase speed
erase speed is 320000 KiB/s
Testing 2x multi-block erase speed
2x multi-block erase speed is 320000 KiB/s
Testing 4x multi-block erase speed
4x multi-block erase speed is 320000 KiB/s
Testing 8x multi-block erase speed
8x multi-block erase speed is 320000 KiB/s
Testing 16x multi-block erase speed
16x multi-block erase speed is 320000 KiB/s
Testing 32x multi-block erase speed
32x multi-block erase speed is 320000 KiB/s
Testing 64x multi-block erase speed
64x multi-block erase speed is 320000 KiB/s
finished
# dmesg | grep nand
[    2.896206] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
[    2.902648] nand: Micron MT29F2G08ABAEAWP
[    2.906667] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64

Behavior is the same. Speed is changing on every run.
I don't have a zynqmp board here but will try to get data asap.

Thanks,
Michal
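To put the flash_speed figures above side by side, the relative change per metric can be computed as (after - before) / before. The sketch below hard-codes the read/write numbers from the two logs; the struct and function names are made up for illustration.

```c
#include <stddef.h>

/* One flash_speed metric, before and after the patch (KiB/s). */
struct speed_sample {
	const char *name;
	double before;
	double after;
};

/* Values copied from the two pl353 logs in this thread. */
const struct speed_sample pl353_samples[] = {
	{ "eraseblock write", 4555, 4522 },
	{ "eraseblock read",  5765, 5765 },
	{ "page write",       4383, 4383 },
	{ "page read",        5614, 5638 },
	{ "2 page write",     4444, 4444 },
	{ "2 page read",      5688, 5714 },
};

/* Relative change from 'before' to 'after', in percent. */
double rel_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Largest absolute relative change over all samples, in percent. */
double max_abs_change_pct(const struct speed_sample *s, size_t n)
{
	double max = 0.0;

	for (size_t i = 0; i < n; i++) {
		double d = rel_change_pct(s[i].before, s[i].after);

		if (d < 0)
			d = -d;
		if (d > max)
			max = d;
	}

	return max;
}
```

With these inputs every metric moves by less than 1% in either direction, which fits the remark that the speed changes on every run.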
On 9/12/23 16:17, Miquel Raynal wrote:
> [...]
> I believe setting up the board to use the hardware BCH engine and
> performing basic erase/write/read testing with a known file and check
> it still behaves correctly would work. You can also run
>
> 	nandbiterrs -i /dev/mtdx
>
> as a second step and verify there is no difference with and without the
> patch and finally check the impact:
>
> 	flash_speed -d -c 10 /dev/mtdx
> 	(be careful: this is a destructive operation)

The testing team didn't see any issue, so feel free to add my
Acked-by: Michal Smek <michal.simek@amd.com>

Thanks,
Michal
Hi Michal,

michal.simek@amd.com wrote on Thu, 21 Sep 2023 12:25:10 +0200:

> [...]
> The testing team didn't see any issue, so feel free to add my
> Acked-by: Michal Smek <michal.simek@amd.com>

I think you told me in the last e-mail you tested the pl353 patch, not
the one for the Arasan controller. Shall I add your Acked-by here and
your Tested-by in the other?

Thanks,
Miquèl
On 9/22/23 11:14, Miquel Raynal wrote:
> [...]
> I think you told me in the last e-mail you tested the pl353 patch, not
> the one for the Arasan controller. Shall I add your Acked-by here and
> your Tested-by in the other?

Yes, exactly.
I tested pl353 myself. If that log looks good, feel free to add my Tested-by tag.
And I got information from the testing team that they tested the Arasan one, hence only an Ack for that one.

Thanks,
Michal
Hi Michal,

michal.simek@amd.com wrote on Fri, 22 Sep 2023 11:16:20 +0200:

> [...]
> Yes, exactly.
> I tested pl353 myself. If that log looks good, feel free to add my Tested-by tag.
> And I got information from the testing team that they tested the Arasan one, hence only an Ack for that one.

Perfect. Thanks a lot!
Miquèl
On Mon, 2023-07-17 at 19:42:20 UTC, Miquel Raynal wrote:
> [...]
> Let's read the NAND chip status at the end of the page write helper and
> return -EIO upon error.
>
> Cc: Michal Simek <michal.simek@amd.com>
> Cc: stable@vger.kernel.org
> Fixes: 88ffef1b65cf ("mtd: rawnand: arasan: Support the hardware BCH ECC engine")
> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
> Acked-by: Michal Smek <michal.simek@amd.com>

Applied to https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git mtd/fixes.

Miquel
diff --git a/drivers/mtd/nand/raw/arasan-nand-controller.c b/drivers/mtd/nand/raw/arasan-nand-controller.c
index 906eef70cb6d..487c139316fe 100644
--- a/drivers/mtd/nand/raw/arasan-nand-controller.c
+++ b/drivers/mtd/nand/raw/arasan-nand-controller.c
@@ -515,6 +515,7 @@ static int anfc_write_page_hw_ecc(struct nand_chip *chip, const u8 *buf,
 	struct mtd_info *mtd = nand_to_mtd(chip);
 	unsigned int len = mtd->writesize + (oob_required ? mtd->oobsize : 0);
 	dma_addr_t dma_addr;
+	u8 status;
 	int ret;
 	struct anfc_op nfc_op = {
 		.pkt_reg =
@@ -561,10 +562,21 @@ static int anfc_write_page_hw_ecc(struct nand_chip *chip, const u8 *buf,
 	}
 
 	/* Spare data is not protected */
-	if (oob_required)
+	if (oob_required) {
 		ret = nand_write_oob_std(chip, page);
+		if (ret)
+			return ret;
+	}
 
-	return ret;
+	/* Check write status on the chip side */
+	ret = nand_status_op(chip, &status);
+	if (ret)
+		return ret;
+
+	if (status & NAND_STATUS_FAIL)
+		return -EIO;
+
+	return 0;
 }
 
 static int anfc_sel_write_page_hw_ecc(struct nand_chip *chip, const u8 *buf,
The NAND core complies with the ONFI specification, which itself
mentions that after any program or erase operation, a status check
should be performed to see whether the operation was finished *and*
successful.

The NAND core offers helpers to finish a page write (sending the
"PAGE PROG" command, waiting for the NAND chip to be ready again, and
checking the operation status). But in some cases, advanced controller
drivers might want to optimize this and craft their own page write
helper to leverage additional hardware capabilities, thus not always
using the core facilities.

Some drivers, like this one, do not use the core helper to finish a page
write because the final cycles are automatically managed by the
hardware. In this case, the additional care must be taken to manually
perform the final status check.

Let's read the NAND chip status at the end of the page write helper and
return -EIO upon error.

Cc: Michal Simek <michal.simek@amd.com>
Cc: stable@vger.kernel.org
Fixes: 88ffef1b65cf ("mtd: rawnand: arasan: Support the hardware BCH ECC engine")
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---

Hello Michal,

I have not tested this, but based on a report on another driver, I
believe the status check is also missing here and could sometimes
lead to unnoticed partial writes.

Please test on your side that everything still works and let me
know how it goes.

Thanks a lot.
Miquèl
---
 drivers/mtd/nand/raw/arasan-nand-controller.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
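The status check described in the commit message can be tried outside the kernel with a small mock. This is only a sketch under stated assumptions: `struct mock_chip`, `mock_status_op()` and `finish_page_write()` are hypothetical stand-ins for `struct nand_chip`, `nand_status_op()` and the tail of the driver's page write helper, and `NAND_STATUS_FAIL` is redefined locally with the kernel's value so the block builds in user space.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Bit 0 of the NAND status register reports a failed program or erase
 * operation. The kernel defines NAND_STATUS_FAIL as 0x01; it is
 * repeated here only so this sketch is self-contained.
 */
#define NAND_STATUS_FAIL 0x01

/* Hypothetical stand-in for struct nand_chip. */
struct mock_chip {
	int bus_error;		/* non-zero: the status read itself fails */
	uint8_t status;		/* value returned by the status read */
};

/*
 * Hypothetical stand-in for nand_status_op(): returns 0 and fills
 * *status on success, or a negative errno if the read could not be
 * performed.
 */
int mock_status_op(const struct mock_chip *chip, uint8_t *status)
{
	if (chip->bus_error)
		return -ETIMEDOUT;

	*status = chip->status;
	return 0;
}

/*
 * Tail of a custom page write helper: the hardware has already issued
 * the final program cycles, so the status has to be checked by hand.
 */
int finish_page_write(const struct mock_chip *chip)
{
	uint8_t status;
	int ret;

	/* A failure to read the status is propagated unchanged. */
	ret = mock_status_op(chip, &status);
	if (ret)
		return ret;

	/* The chip answered, but reports that the program failed. */
	if (status & NAND_STATUS_FAIL)
		return -EIO;

	return 0;
}
```

The two error paths are kept distinct on purpose: a negative return from the status read means the controller could not talk to the chip, while -EIO means the chip answered and reported a failed program.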