diff mbox

[2/3] MTD: bitflip_threshold added to mtd_info and sysfs

Message ID 1331832353-15569-3-git-send-email-mikedunn@newsguy.com
State New, archived
Headers show

Commit Message

Mike Dunn March 15, 2012, 5:25 p.m. UTC
The element 'bitflip_threshold' is added to struct mtd_info.  If the driver
leaves this uninitialized, mtd sets it to ecc_strength when the device or
partition is registered.  This element is also exposed for reading and writing
from userspace through sysfs.

Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
---
 Documentation/ABI/testing/sysfs-class-mtd |   31 +++++++++++++++++++++++++++
 drivers/mtd/mtdcore.c                     |   33 +++++++++++++++++++++++++++++
 include/linux/mtd/mtd.h                   |    9 ++++++++
 3 files changed, 73 insertions(+), 0 deletions(-)

Comments

Ivan Djelic March 16, 2012, 4:31 p.m. UTC | #1
On Thu, Mar 15, 2012 at 05:25:52PM +0000, Mike Dunn wrote:
> +
> +What:		/sys/class/mtd/mtdX/bitflip_threshold
> +Date:		March 2012
> +KernelVersion:	3.3.1
> +Contact:	linux-mtd@lists.infradead.org
> +Description:
> +		This allows the user to examine and adjust the criteria by which
> +		mtd returns -EUCLEAN from mtd_read() and mtd_read_oob().  If the
> +		maximum number of bit errors that were corrected on any single
> +		writesize region (as reported by the driver) equals or exceeds
> +		this value, -EUCLEAN is returned.  Otherwise, absent an error, 0
> +		is returned.  Higher layers (e.g., UBI) use this return code as
> +		an indication that an erase block may be degrading and should be
> +		scrutinized as a candidate for being marked as bad.
> +
> +		The initial value may be specified by the flash device driver.
> +		If not, then the default value is ecc_strength.  Users who wish
> +		to be more paranoid about data integrity can lower the value.
> +		If the value exceeds ecc_strength, -EUCLEAN is never returned by
> +		the read functions.

Hmmm. I don't think it's a good idea to say "Users who wish to be more paranoid
about data integrity can lower the value"; because this is not exactly true.

Lowering the value is very dangerous, and can have devastating effects: on NAND
devices where sticky bitflips appear (we have plenty of those devices), a low
threshold (say 1) triggers block torture by UBI, then bad block retirement,
quickly reducing the number of valid blocks; the other "sane" blocks
with intermittent bitflips keep being scrubbed, thrashing the whole device.
Even worse: if enough bad blocks appear, UBI runs out of replacement blocks
and stops working.

IMHO the value of 'bitflip_threshold' should be carefully chosen:

- low enough to ensure ecc correction has a safety margin and manufacturer
  requirements are met
- high enough to avoid the effects described above

In some cases, controlling bitflip_threshold can be interesting for other
reasons; for instance, on a specific board, I have used a NAND device
requiring 4-bit ecc, but I implemented 8-bit protection through hardware BCH for
extra safety (and future 8-bit NAND upgrades).
In that particular setup, I would set bitflip_threshold to 3 or 4 instead of the
value derived from the ecc strength (8).

So in practice, setting bitflip_threshold is tricky and requires a good
knowledge of the NAND (or Doc/whatever) device your are using, and of how mtd/UBI
will use the threshold.
I suggest we warn about the dangers and discourage people from messing with this
knob unless they know what they are doing.

BR,
diff mbox

Patch

diff --git a/Documentation/ABI/testing/sysfs-class-mtd b/Documentation/ABI/testing/sysfs-class-mtd
index 6a218e6..e2f24c8 100644
--- a/Documentation/ABI/testing/sysfs-class-mtd
+++ b/Documentation/ABI/testing/sysfs-class-mtd
@@ -134,3 +134,34 @@  Description:
 		always be a non-negative integer.
 
 		In the case of devices lacking any ECC capability, it is 0.
+
+What:		/sys/class/mtd/mtdX/bitflip_threshold
+Date:		March 2012
+KernelVersion:	3.3.1
+Contact:	linux-mtd@lists.infradead.org
+Description:
+		This allows the user to examine and adjust the criteria by which
+		mtd returns -EUCLEAN from mtd_read() and mtd_read_oob().  If the
+		maximum number of bit errors that were corrected on any single
+		writesize region (as reported by the driver) equals or exceeds
+		this value, -EUCLEAN is returned.  Otherwise, absent an error, 0
+		is returned.  Higher layers (e.g., UBI) use this return code as
+		an indication that an erase block may be degrading and should be
+		scrutinized as a candidate for being marked as bad.
+
+		The initial value may be specified by the flash device driver.
+		If not, then the default value is ecc_strength.  Users who wish
+		to be more paranoid about data integrity can lower the value.
+		If the value exceeds ecc_strength, -EUCLEAN is never returned by
+		the read functions.
+
+		Note that the introduction of this feature brings a subtle
+		change to the meaning of the -EUCLEAN return code.  Previously,
+		it was interpreted to mean simply "one or more bit errors were
+		corrected".  Its new interpretation can be phrased as "a
+		dangerously high number of bit errors were corrected on one or
+		more writesize regions".  The precise definition of "dangerously
+		high" can be adjusted by the user with bitflip_threshold.
+
+		This is generally applicable only to NAND flash devices with ECC
+		capability.
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 1573644..7e20335 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -259,6 +259,34 @@  static ssize_t mtd_ecc_strength_show(struct device *dev,
 }
 static DEVICE_ATTR(ecc_strength, S_IRUGO, mtd_ecc_strength_show, NULL);
 
+static ssize_t mtd_bitflip_threshold_show(struct device *dev,
+					  struct device_attribute *attr,
+					  char *buf)
+{
+	struct mtd_info *mtd = dev_get_drvdata(dev);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", mtd->bitflip_threshold);
+}
+
+static ssize_t mtd_bitflip_threshold_store(struct device *dev,
+					   struct device_attribute *attr,
+					   const char *buf, size_t count)
+{
+	struct mtd_info *mtd = dev_get_drvdata(dev);
+	unsigned int bitflip_threshold;
+	int retval;
+
+	retval = kstrtouint(buf, 0, &bitflip_threshold);
+	if (retval)
+		return retval;
+
+	mtd->bitflip_threshold = bitflip_threshold;
+	return count;
+}
+static DEVICE_ATTR(bitflip_threshold, S_IRUGO | S_IWUSR,
+		   mtd_bitflip_threshold_show,
+		   mtd_bitflip_threshold_store);
+
 static struct attribute *mtd_attrs[] = {
 	&dev_attr_type.attr,
 	&dev_attr_flags.attr,
@@ -270,6 +298,7 @@  static struct attribute *mtd_attrs[] = {
 	&dev_attr_numeraseregions.attr,
 	&dev_attr_name.attr,
 	&dev_attr_ecc_strength.attr,
+	&dev_attr_bitflip_threshold.attr,
 	NULL,
 };
 
@@ -332,6 +361,10 @@  int add_mtd_device(struct mtd_info *mtd)
 	mtd->index = i;
 	mtd->usecount = 0;
 
+	/* default value if not set by driver */
+	if (mtd->bitflip_threshold == 0)
+		mtd->bitflip_threshold = mtd->ecc_strength;
+
 	if (is_power_of_2(mtd->erasesize))
 		mtd->erasesize_shift = ffs(mtd->erasesize) - 1;
 	else
diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index cf5ea8c..1c3706e 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -157,6 +157,15 @@  struct mtd_info {
 	unsigned int erasesize_mask;
 	unsigned int writesize_mask;
 
+	/*
+	 * read ops return -EUCLEAN if max number of bitflips corrected on any
+	 * one writesize region equals or exceeds this value.  Settable by
+	 * driver, else defaults to ecc_strength.  User can override in sysfs.
+	 * N.B. The meaning of the -EUCLEAN return code has changed; see
+	 * Documentation/ABI/testing/sysfs-class-mtd for more detail.
+	 */
+	unsigned int bitflip_threshold;
+
 	// Kernel-only stuff starts here.
 	const char *name;
 	int index;