Message ID | cover.1711048433.git.daniel@makrotopia.org |
---|---|
Series | block: implement NVMEM provider |
On 3/21/24 12:34, Daniel Golle wrote: > On embedded devices using an eMMC it is common that one or more partitions > on the eMMC are used to store MAC addresses and Wi-Fi calibration EEPROM > data. Allow referencing the partition in device tree for the kernel and > Wi-Fi drivers accessing it via the NVMEM layer. Why to store calibration data in a partition instead of in a file on a filesystem? > diff --git a/MAINTAINERS b/MAINTAINERS > index 8c88f362feb55..242a0a139c00a 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -3662,6 +3662,11 @@ L: linux-mtd@lists.infradead.org > S: Maintained > F: drivers/mtd/devices/block2mtd.c > > +BLOCK NVMEM DRIVER > +M: Daniel Golle <daniel@makrotopia.org> > +S: Maintained > +F: block/blk-nvmem.c Why to add this functionality to the block layer instead of somewhere in the drivers/ directory? Thanks, Bart.
Hi Bart,

thank you for looking at the patches!

On Thu, Mar 21, 2024 at 12:44:19PM -0700, Bart Van Assche wrote:
> On 3/21/24 12:34, Daniel Golle wrote:
> > On embedded devices using an eMMC it is common that one or more partitions
> > on the eMMC are used to store MAC addresses and Wi-Fi calibration EEPROM
> > data. Allow referencing the partition in device tree for the kernel and
> > Wi-Fi drivers accessing it via the NVMEM layer.
>
> Why to store calibration data in a partition instead of in a file on a
> filesystem?

First of all, it's just how it is already in the practical world out there. The same methods for mass-production are used independently of the type of flash memory, so vendors don't care whether the flash ends up in Linux as an MMC/block device (in case of an eMMC) or as an MTD device (in case of SPI-NOR, for example).

I can name countless devices from numerous vendors following this very common practice (and then ending up extracting the data using ugly custom drivers, or poking around in the block devices in early userland, ... none of it is nice, which is the motivation for this series). Adtran, GL-iNet, Netgear, ... to name just a few very popular vendors.

The devices are already out there, and the way they store those details is considered part of the low-level firmware, which will never change. Yet it would be nice to run vanilla Linux (or OpenWrt) on them and make sure things like NFS root can work, and for that the MAC address needs to be in place already, i.e. extracting it in userland would be too late.

However, I also believe there is nothing wrong with that, and using a filesystem comes with many additional pitfalls, such as possibly not being cleanly unmounted, or the file being renamed or deleted by the user, ... None of that should result in a device no longer having its proper MAC address. Why have all that complexity for something as simple as storing the 6 bytes of a MAC address?
I will not reiterate that whole discussion now; you may look at the list archives, where this was also explained and discussed for the first run of the RFC series last year.

>
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 8c88f362feb55..242a0a139c00a 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -3662,6 +3662,11 @@ L: linux-mtd@lists.infradead.org
> >  S: Maintained
> >  F: drivers/mtd/devices/block2mtd.c
> > +BLOCK NVMEM DRIVER
> > +M: Daniel Golle <daniel@makrotopia.org>
> > +S: Maintained
> > +F: block/blk-nvmem.c
>
> Why to add this functionality to the block layer instead of somewhere
> in the drivers/ directory?

Simply because we need notifications about appearing and disappearing block devices, or a way to iterate over all block devices in a system. For both, there currently isn't any interface other than a class_interface, and that requires access to &block_class, which is considered a block subsystem internal.

Also note that the same is true for the MTD NVMEM provider (in drivers/mtd/mtdcore.c) as well as the UBI NVMEM provider (in drivers/mtd/ubi/nvmem.c); both are considered integral parts of their corresponding subsystems -- despite the fact that in those cases this wouldn't even be strictly needed, as for MTD we have register_mtd_user() and for UBI we have ubi_register_volume_notifier().

Doing it differently for block devices would hence not only complicate things unnecessarily, it would also be inconsistent.
On 3/21/24 12:33, Daniel Golle wrote: > Add new flag to destinguish block devices which may act as an NVMEM > provider. > > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > --- > include/linux/blkdev.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index c3e8f7cf96be9..f2c4f280d7619 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -81,11 +81,13 @@ struct partition_meta_info { > * ``GENHD_FL_NO_PART``: partition support is disabled. The kernel will not > * scan for partitions from add_disk, and users can't add partitions manually. > * > + * ``GENHD_FL_NVMEM``: the block device should be considered as NVMEM provider. > */ > enum { > GENHD_FL_REMOVABLE = 1 << 0, > GENHD_FL_HIDDEN = 1 << 1, > GENHD_FL_NO_PART = 1 << 2, > + GENHD_FL_NVMEM = 1 << 3, > }; What would break if this flag wouldn't exist? Thanks, Bart.
On 3/21/24 12:31, Daniel Golle wrote: > On embedded devices using an eMMC it is common that one or more (hw/sw) > partitions on the eMMC are used to store MAC addresses and Wi-Fi > calibration EEPROM data. > > Implement an NVMEM provider backed by a block device as typically the > NVMEM framework is used to have kernel drivers read and use binary data > from EEPROMs, efuses, flash memory (MTD), ... > > In order to be able to reference hardware partitions on an eMMC, add code > to bind each hardware partition to a specific firmware subnode. > > Overall, this enables uniform handling across practially all flash > storage types used for this purpose (MTD, UBI, and now also MMC). > > As part of this series it was necessary to define a device tree schema > for block devices and partitions on them, which (similar to how it now > works also for UBI volumes) can be matched by one or more properties. Since this patch series adds code that opens partitions and reads from partitions, can that part of the functionality be implemented in user space? There is already a mechanism for notifying user space about block device changes, namely udev. Thanks, Bart.
On 3/21/24 13:22, Daniel Golle wrote: > On Thu, Mar 21, 2024 at 12:44:19PM -0700, Bart Van Assche wrote: >> Why to add this functionality to the block layer instead of somewhere >> in the drivers/ directory? > > Simply because we need notifications about appearing and disappearing > block devices, or a way to iterate over all block devices in a system. > For both there isn't currently any other interface than using a > class_interface for that, and that requires access to &block_class > which is considered a block subsystem internal. That's an argument for adding an interface to the block layer that implements this functionality but not for adding this code in the block layer. Thanks, Bart.
On Fri, Mar 22, 2024 at 10:52:17AM -0700, Bart Van Assche wrote:
> On 3/21/24 12:31, Daniel Golle wrote:
> > On embedded devices using an eMMC it is common that one or more (hw/sw)
> > partitions on the eMMC are used to store MAC addresses and Wi-Fi
> > calibration EEPROM data.
> >
> > Implement an NVMEM provider backed by a block device as typically the
> > NVMEM framework is used to have kernel drivers read and use binary data
> > from EEPROMs, efuses, flash memory (MTD), ...
> >
> > In order to be able to reference hardware partitions on an eMMC, add code
> > to bind each hardware partition to a specific firmware subnode.
> >
> > Overall, this enables uniform handling across practially all flash
> > storage types used for this purpose (MTD, UBI, and now also MMC).
> >
> > As part of this series it was necessary to define a device tree schema
> > for block devices and partitions on them, which (similar to how it now
> > works also for UBI volumes) can be matched by one or more properties.
>
> Since this patch series adds code that opens partitions and reads
> from partitions, can that part of the functionality be implemented in
> user space? There is already a mechanism for notifying user space about
> block device changes, namely udev.

No. It has to happen before userland is started (e.g. for nfsroot to work): without an Ethernet MAC address (which is often stored at some raw offset on a partition or hw-partition of an eMMC), we don't have a way to use nfsroot (because that requires functional Ethernet), hence userland won't come up.

It's a circular dependency problem which can only be addressed by making sure that everything needed for Ethernet to come up is provided by the kernel **before** the rootfs (which can be nfsroot) is mounted.
On Fri, Mar 22, 2024 at 10:49:48AM -0700, Bart Van Assche wrote:
> On 3/21/24 12:33, Daniel Golle wrote:
> > Add new flag to destinguish block devices which may act as an NVMEM
> > provider.
> >
> > Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> > ---
> > include/linux/blkdev.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> > index c3e8f7cf96be9..f2c4f280d7619 100644
> > --- a/include/linux/blkdev.h
> > +++ b/include/linux/blkdev.h
> > @@ -81,11 +81,13 @@ struct partition_meta_info {
> >   * ``GENHD_FL_NO_PART``: partition support is disabled. The kernel will not
> >   * scan for partitions from add_disk, and users can't add partitions manually.
> >   *
> > + * ``GENHD_FL_NVMEM``: the block device should be considered as NVMEM provider.
> >   */
> >  enum {
> >  	GENHD_FL_REMOVABLE = 1 << 0,
> >  	GENHD_FL_HIDDEN = 1 << 1,
> >  	GENHD_FL_NO_PART = 1 << 2,
> > +	GENHD_FL_NVMEM = 1 << 3,
> >  };
>
> What would break if this flag wouldn't exist?

As both MTD and UBI already act as NVMEM providers themselves, once the user creates a ubiblock device or has CONFIG_MTD_BLOCK=y set in their kernel configuration, we would run into problems because both the block layer and MTD or UBI would try to be an NVMEM provider for the same device tree node.

I initially suggested the inverse of this flag, GENHD_FL_NO_NVMEM, which would be set only for mtdblock and ubiblock devices to opt out of acting as NVMEM providers. However, in a previous comment [1] on the RFC it was requested to make this opt-in instead.

[1]: https://patchwork.kernel.org/comment/25432948/
On Fri, Mar 22, 2024 at 10:52:36AM -0700, Bart Van Assche wrote: > On 3/21/24 13:22, Daniel Golle wrote: > > On Thu, Mar 21, 2024 at 12:44:19PM -0700, Bart Van Assche wrote: > > > Why to add this functionality to the block layer instead of somewhere > > > in the drivers/ directory? > > > > Simply because we need notifications about appearing and disappearing > > block devices, or a way to iterate over all block devices in a system. > > For both there isn't currently any other interface than using a > > class_interface for that, and that requires access to &block_class > > which is considered a block subsystem internal. > > That's an argument for adding an interface to the block layer that > implements this functionality but not for adding this code in the block > layer. Fine with me. I can implement such an interface, similar to how it is implemented for MTD devices or UBI volumes for the block layer. I would basically add a subscription and callback interface utilizing a class_interface inside the block subsystem similar to how the same is done in this series for registering block-device-backed NVMEM providers. However, given that this is a bigger task, I'd like to know from more than one block subsystem maintainer that this approach would be agreeable before spending time and effort in this direction. Also note that obviously it would be much more intrusive and affect *all* users of the block subsystem, while the current approach would only affect those users who got CONFIG_BLOCK_NVMEM enabled.
On 3/22/24 11:02, Daniel Golle wrote: > On Fri, Mar 22, 2024 at 10:52:17AM -0700, Bart Van Assche wrote: >> On 3/21/24 12:31, Daniel Golle wrote: >>> On embedded devices using an eMMC it is common that one or more (hw/sw) >>> partitions on the eMMC are used to store MAC addresses and Wi-Fi >>> calibration EEPROM data. >>> >>> Implement an NVMEM provider backed by a block device as typically the >>> NVMEM framework is used to have kernel drivers read and use binary data >>> from EEPROMs, efuses, flash memory (MTD), ... >>> >>> In order to be able to reference hardware partitions on an eMMC, add code >>> to bind each hardware partition to a specific firmware subnode. >>> >>> Overall, this enables uniform handling across practially all flash >>> storage types used for this purpose (MTD, UBI, and now also MMC). >>> >>> As part of this series it was necessary to define a device tree schema >>> for block devices and partitions on them, which (similar to how it now >>> works also for UBI volumes) can be matched by one or more properties. >> >> Since this patch series adds code that opens partitions and reads >> from partitions, can that part of the functionality be implemented in >> user space? There is already a mechanism for notifying user space about >> block device changes, namely udev. > > No. Because it has to happen (e.g. for nfsroot to work) before > userland gets initiated: Without Ethernet MAC address (which if often > stored at some raw offset on a partition or hw-partition of an eMMC), > we don't have a way to use nfsroot (because that requires functional > Ethernet), hence userland won't come up. It's a circular dependency > problem which can only be addressed by making sure that everything > needed for Ethernet to come up is provided by the kernel **before** > rootfs (which can be nfsroot) is mounted. How about the initial RAM disk? I think that's where code should occur that reads calibration data from local storage. Thanks, Bart.
On 3/22/24 11:07, Daniel Golle wrote: > On Fri, Mar 22, 2024 at 10:49:48AM -0700, Bart Van Assche wrote: >> On 3/21/24 12:33, Daniel Golle wrote: >>> enum { >>> GENHD_FL_REMOVABLE = 1 << 0, >>> GENHD_FL_HIDDEN = 1 << 1, >>> GENHD_FL_NO_PART = 1 << 2, >>> + GENHD_FL_NVMEM = 1 << 3, >>> }; >> >> What would break if this flag wouldn't exist? > > As both, MTD and UBI already act as NVMEM providers themselves, once > the user creates a ubiblock device or got CONFIG_MTD_BLOCK=y set in their > kernel configuration, we would run into problems because both, the block > layer as well as MTD or UBI would try to be an NVMEM provider for the same > device tree node. Why would both MTD and UBI try to be an NVMEM provider for the same device tree node? Why can't this patch series be implemented such that a partition UUID occurs in the device tree and such that other code scans for that partition UUID? Thanks, Bart.
On Thu, Mar 21, 2024 at 07:31:48PM +0000, Daniel Golle wrote: > On embedded devices using an eMMC it is common that one or more (hw/sw) > partitions on the eMMC are used to store MAC addresses and Wi-Fi > calibration EEPROM data. > > Implement an NVMEM provider backed by a block device as typically the > NVMEM framework is used to have kernel drivers read and use binary data > from EEPROMs, efuses, flash memory (MTD), ... > > In order to be able to reference hardware partitions on an eMMC, add code > to bind each hardware partition to a specific firmware subnode. > > Overall, this enables uniform handling across practially all flash > storage types used for this purpose (MTD, UBI, and now also MMC). > > As part of this series it was necessary to define a device tree schema > for block devices and partitions on them, which (similar to how it now > works also for UBI volumes) can be matched by one or more properties. > > --- > This series has previously been submitted as RFC on July 19th 2023[1] > and most of the basic idea did not change since. Another round of RFC > was submitted on March 5th 2024[2] which has received overall positive > feedback and only minor corrections have been done since (see > changelog below). I don't recall giving positive feedback. I still think this should use offsets rather than partition specific information. Not wanting to have to update the offsets if they change is not reason enough to not use them. Rob
On Thu, Mar 21, 2024 at 07:31:48PM +0000, Daniel Golle wrote: > On embedded devices using an eMMC it is common that one or more (hw/sw) > partitions on the eMMC are used to store MAC addresses and Wi-Fi > calibration EEPROM data. > > Implement an NVMEM provider backed by a block device as typically the > NVMEM framework is used to have kernel drivers read and use binary data > from EEPROMs, efuses, flash memory (MTD), ... > > In order to be able to reference hardware partitions on an eMMC, add code > to bind each hardware partition to a specific firmware subnode. > > Overall, this enables uniform handling across practially all flash > storage types used for this purpose (MTD, UBI, and now also MMC). > > As part of this series it was necessary to define a device tree schema > for block devices and partitions on them, which (similar to how it now > works also for UBI volumes) can be matched by one or more properties. > > --- > This series has previously been submitted as RFC on July 19th 2023[1] > and most of the basic idea did not change since. Another round of RFC > was submitted on March 5th 2024[2] which has received overall positive > feedback and only minor corrections have been done since (see > changelog below). Also, please version your patches. 'RFC' is a tag, not a version. v1 was July. v2 was March 5th. This is v3. Rob
On Mon, Mar 25, 2024 at 10:10:46AM -0500, Rob Herring wrote: > On Thu, Mar 21, 2024 at 07:31:48PM +0000, Daniel Golle wrote: > > On embedded devices using an eMMC it is common that one or more (hw/sw) > > partitions on the eMMC are used to store MAC addresses and Wi-Fi > > calibration EEPROM data. > > > > Implement an NVMEM provider backed by a block device as typically the > > NVMEM framework is used to have kernel drivers read and use binary data > > from EEPROMs, efuses, flash memory (MTD), ... > > > > In order to be able to reference hardware partitions on an eMMC, add code > > to bind each hardware partition to a specific firmware subnode. > > > > Overall, this enables uniform handling across practially all flash > > storage types used for this purpose (MTD, UBI, and now also MMC). > > > > As part of this series it was necessary to define a device tree schema > > for block devices and partitions on them, which (similar to how it now > > works also for UBI volumes) can be matched by one or more properties. > > > > --- > > This series has previously been submitted as RFC on July 19th 2023[1] > > and most of the basic idea did not change since. Another round of RFC > > was submitted on March 5th 2024[2] which has received overall positive > > feedback and only minor corrections have been done since (see > > changelog below). > > I don't recall giving positive feedback. > > I still think this should use offsets rather than partition specific > information. Not wanting to have to update the offsets if they change is > not reason enough to not use them. Using raw offsets on the block device (rather than the partition) won't work for most existing devices and boot firmware out there. They always reference the partition, usually by the name of a GPT partition (but sometimes also PARTUUID or even PARTNO) which is then used in the exact same way as an MTD partition or UBI volume would be on devices with NOR or NAND flash. 
Just on eMMC we usually use a GPT or MBR partition table rather than defining partitions in DT or cmdline, which is rather rare (for historic reasons, I suppose, but it is what it is now).

Depending on the eMMC chip used, that partition may not even be at the same offset for different batches of the same device, and hence I'd like to just do it in the same way vendor firmware does it as well. Chad of Adtran has previously confirmed that [1], which was the positive feedback I was referring to. Other vendors like GL-iNet or Netgear are doing the exact same thing.

As of now, we support this in OpenWrt by adding a lot of board-specific knowledge to userland, which is ugly and also prevents using things like PXE-initiated nfsroot on those devices.

The purpose of this series is to be able to properly support such devices (i.e. practically all consumer-grade routers out there using an eMMC for storing firmware).

Also, those devices have enough resources to run a general purpose distribution like Debian instead of OpenWrt, and all the userland hacks to set MAC addresses and extract WiFi-EEPROM-data in board-specific ways will most certainly never find their way into Debian. It's just not how embedded Linux works, unless you are looking only at the RaspberryPi, which has that data stored in a text file shipped by the distribution -- something very weird and very different from literally all off-the-shelf routers, access-points or switches I have ever seen (and I've seen many). Maybe Felix, who has seen even more of them, can tell us more about that.

[1]: https://patchwork.kernel.org/project/linux-block/patch/f70bb480aef6f55228a25ce20ff0e88e670e1b70.1709667858.git.daniel@makrotopia.org/#25756072
On Mon, Mar 25, 2024 at 10:12:59AM -0500, Rob Herring wrote: > On Thu, Mar 21, 2024 at 07:31:48PM +0000, Daniel Golle wrote: > > On embedded devices using an eMMC it is common that one or more (hw/sw) > > partitions on the eMMC are used to store MAC addresses and Wi-Fi > > calibration EEPROM data. > > > > Implement an NVMEM provider backed by a block device as typically the > > NVMEM framework is used to have kernel drivers read and use binary data > > from EEPROMs, efuses, flash memory (MTD), ... > > > > In order to be able to reference hardware partitions on an eMMC, add code > > to bind each hardware partition to a specific firmware subnode. > > > > Overall, this enables uniform handling across practially all flash > > storage types used for this purpose (MTD, UBI, and now also MMC). > > > > As part of this series it was necessary to define a device tree schema > > for block devices and partitions on them, which (similar to how it now > > works also for UBI volumes) can be matched by one or more properties. > > > > --- > > This series has previously been submitted as RFC on July 19th 2023[1] > > and most of the basic idea did not change since. Another round of RFC > > was submitted on March 5th 2024[2] which has received overall positive > > feedback and only minor corrections have been done since (see > > changelog below). > > Also, please version your patches. 'RFC' is a tag, not a version. v1 was > July. v2 was March 5th. This is v3. According to "Submitting patches: the essential guide to getting your code into the kernel" [1] a version is also a tag. Quote: Common tags might include a version descriptor if the [sic] multiple versions of the patch have been sent out in response to comments (i.e., “v1, v2, v3”), or “RFC” to indicate a request for comments. Maybe this should be clarified, exclusive or inclusive "or" is up to the reader to interpret at this point, and I've often seen RFC, RFCv2, v1, v2, ... 
as a sequence of tags applied for the same series, which is why I followed what I used to believe was the most common interpretation of the guidelines. In any way, thank you for pointing it out, I assume the next iteration should then be v4. [1]: https://docs.kernel.org/process/submitting-patches.html
+boot-architecture list On Mon, Mar 25, 2024 at 03:38:19PM +0000, Daniel Golle wrote: > On Mon, Mar 25, 2024 at 10:10:46AM -0500, Rob Herring wrote: > > On Thu, Mar 21, 2024 at 07:31:48PM +0000, Daniel Golle wrote: > > > On embedded devices using an eMMC it is common that one or more (hw/sw) > > > partitions on the eMMC are used to store MAC addresses and Wi-Fi > > > calibration EEPROM data. > > > > > > Implement an NVMEM provider backed by a block device as typically the > > > NVMEM framework is used to have kernel drivers read and use binary data > > > from EEPROMs, efuses, flash memory (MTD), ... > > > > > > In order to be able to reference hardware partitions on an eMMC, add code > > > to bind each hardware partition to a specific firmware subnode. > > > > > > Overall, this enables uniform handling across practially all flash > > > storage types used for this purpose (MTD, UBI, and now also MMC). > > > > > > As part of this series it was necessary to define a device tree schema > > > for block devices and partitions on them, which (similar to how it now > > > works also for UBI volumes) can be matched by one or more properties. > > > > > > --- > > > This series has previously been submitted as RFC on July 19th 2023[1] > > > and most of the basic idea did not change since. Another round of RFC > > > was submitted on March 5th 2024[2] which has received overall positive > > > feedback and only minor corrections have been done since (see > > > changelog below). > > > > I don't recall giving positive feedback. > > > > I still think this should use offsets rather than partition specific > > information. Not wanting to have to update the offsets if they change is > > not reason enough to not use them. > > Using raw offsets on the block device (rather than the partition) > won't work for most existing devices and boot firmware out there. 
They > always reference the partition, usually by the name of a GPT > partition (but sometimes also PARTUUID or even PARTNO) which is then > used in the exact same way as an MTD partition or UBI volume would be > on devices with NOR or NAND flash. MTD normally uses offsets hence why I'd like some alignment. UBI is special because raw NAND is, well, special. > Just on eMMC we usually use a GPT > or MBR partition table rather than defining partitions in DT or cmdline, > which is rather rare (for historic reasons, I suppose, but it is what it > is now). Yes, I understand how eMMC works. I don't understand why if you have part #, uuid, or name you can't get to the offset or vice-versa. You need only 1 piece of identification to map partition table entries to DT nodes. Sure, offsets can change, but surely the firmware can handle adjusting the DT? An offset would also work for the case of random firmware data on the disk that may or may not have a partition associated with it. There are certainly cases of that. I don't think we have much of a solution for that other than trying to educate vendors to not do that or OS installers only supporting installing to something other than eMMC. This is something EBBR[1] is trying to address. > Depending on the eMMC chip used, that partition may not even be at the > same offset for different batches of the same device and hence I'd > like to just do it in the same way vendor firmware does it as well. Often vendor firmware is not a model to follow... > Chad of Adtran has previously confirmed that [1], which was the > positive feedback I was refering to. Other vendors like GL-iNet or > Netgear are doing the exact same thing. > > As of now, we support this in OpenWrt by adding a lot of > board-specific knowledge to userland, which is ugly and also prevents > using things like PXE-initiated nfsroot on those devices. > > The purpose of this series is to be able to properly support such devices > (ie. 
practially all consumer-grade routers out there using an eMMC for > storing firmware). > > Also, those devices have enough resources to run a general purpose > distribution like Debian instead of OpenWrt, and all the userland > hacks to set MAC addresses and extract WiFi-EEPROM-data in a > board-specific ways will most certainly never find their way into > Debian. It's just not how embedded Linux works, unless you are looking > only at the RaspberryPi which got that data stored in a textfile > which is shipped by the distribution -- something very weird and very > different from literally all of-the-shelf routers, access-points or > switches I have ever seen (and I've seen many). Maybe Felix who has > seen even more of them can tell us more about that. General purpose distros want to partition the disk themselves. Adding anything to the DT for disk partitions would require the installer to be aware of it. There's various distro folks on the boot-arch list, so maybe one of them can comment. Rob [1] https://arm-software.github.io/ebbr/index.html#document-chapter4-firmware-media
Hi Rob, On Tue, Mar 26, 2024 at 03:24:49PM -0500, Rob Herring wrote: > +boot-architecture list Good idea, thank you :) > > On Mon, Mar 25, 2024 at 03:38:19PM +0000, Daniel Golle wrote: > > On Mon, Mar 25, 2024 at 10:10:46AM -0500, Rob Herring wrote: > > > On Thu, Mar 21, 2024 at 07:31:48PM +0000, Daniel Golle wrote: > > > > On embedded devices using an eMMC it is common that one or more (hw/sw) > > > > partitions on the eMMC are used to store MAC addresses and Wi-Fi > > > > calibration EEPROM data. > > > > > > > > Implement an NVMEM provider backed by a block device as typically the > > > > NVMEM framework is used to have kernel drivers read and use binary data > > > > from EEPROMs, efuses, flash memory (MTD), ... > > > > > > > > In order to be able to reference hardware partitions on an eMMC, add code > > > > to bind each hardware partition to a specific firmware subnode. > > > > > > > > Overall, this enables uniform handling across practially all flash > > > > storage types used for this purpose (MTD, UBI, and now also MMC). > > > > > > > > As part of this series it was necessary to define a device tree schema > > > > for block devices and partitions on them, which (similar to how it now > > > > works also for UBI volumes) can be matched by one or more properties. > > > > > > > > --- > > > > This series has previously been submitted as RFC on July 19th 2023[1] > > > > and most of the basic idea did not change since. Another round of RFC > > > > was submitted on March 5th 2024[2] which has received overall positive > > > > feedback and only minor corrections have been done since (see > > > > changelog below). > > > > > > I don't recall giving positive feedback. > > > > > > I still think this should use offsets rather than partition specific > > > information. Not wanting to have to update the offsets if they change is > > > not reason enough to not use them. 
> >
> > Using raw offsets on the block device (rather than the partition)
> > won't work for most existing devices and boot firmware out there. They
> > always reference the partition, usually by the name of a GPT
> > partition (but sometimes also PARTUUID or even PARTNO) which is then
> > used in the exact same way as an MTD partition or UBI volume would be
> > on devices with NOR or NAND flash.
>
> MTD normally uses offsets hence why I'd like some alignment. UBI is
> special because raw NAND is, well, special.

I get the point, and in a way this is already intended and supported by this series. You can already add an 'nvmem-layout' node directly to a disk device rather than to a partition and define a layout in this way.

Making this useful in practice will require some improvements to the nvmem system in Linux though, because it currently uses signed 32-bit integers as addresses, which is not sufficient for the size of the user-part of an eMMC. However, that needs to be done then and should of course not be read as an excuse.

>
> > Just on eMMC we usually use a GPT
> > or MBR partition table rather than defining partitions in DT or cmdline,
> > which is rather rare (for historic reasons, I suppose, but it is what it
> > is now).
>
> Yes, I understand how eMMC works. I don't understand why if you have
> part #, uuid, or name you can't get to the offset or vice-versa. You
> need only 1 piece of identification to map partition table entries to DT
> nodes.

Yes, either of them (or a combination) is fine. In practice I've mostly seen PARTNAME used as the identifier in userland scripts, and only adding this for now will probably cover most devices (and existing boot firmware) out there. Notable exceptions are devices which use MBR partitions because the BootROM expects the bootloader to be at the same block where we would usually have the primary GPT. In this case we can only use the PARTNO, of course, and it stinks.
MediaTek's MT7623A/N is such an example, but it's a slightly outdated and pretty weird niche SoC I admit. > Sure, offsets can change, but surely the firmware can handle > adjusting the DT? Future firmware may be able to do this, of course. Existing firmware already out there on devices such as the quite popular GL.iNet MT-6000, Netgear's Orbi and Orbi Pro series as well as all Adtran SmartRG devices does not. Updating or changing the boot firmware of devices already out there is not intended and quite challenging, and will make the device incompatible with its vendor firmware. Hence it would be better to support replacing only the Linux-based firmware (e.g. with OpenWrt or even Debian or any general-purpose Linux, the eMMC is large enough...) while not having to touch the boot firmware (and risking bricking the device if that goes wrong). Personally, I'm rather burdened and unhappy with vendor attempts to have the boot firmware mess around too much in (highly customized, downstream) DT. It may look like a good solution at the moment, but can totally become an obstacle in an unpredictable future (no offense ASUS...) > > An offset would also work for the case of random firmware data on the > disk that may or may not have a partition associated with it. There are > certainly cases of that. I don't think we have much of a solution for > that other than trying to educate vendors to not do that or OS > installers only supporting installing to something other than eMMC. This > is something EBBR[1] is trying to address. Absolutely. Actually *early* GL-iNet devices did exactly that: Use the eMMC boot hw-partitions to store boot firmware as well as MAC addresses and potentially also Wi-Fi calibration data. The MT-2500 is the example I'm aware of and have sitting on my desk for testing with this very series (which also allows referencing eMMC hardware partitions, see "[7/8] mmc: block: set fwnode of disk devices").
Unfortunately later devices such as the flagship MT-6000 moved MAC addresses and WiFi-EEPROMs into a GPT partition on the user-part of the eMMC. > > > Depending on the eMMC chip used, that partition may not even be at the > > same offset for different batches of the same device and hence I'd > > like to just do it in the same way vendor firmware does it as well. > > Often vendor firmware is not a model to follow... I totally agree. However, I don't see a good reason for not supporting those network-appliance-type embedded devices which even ship with (outdated, downstream) Linux by default while going to great lengths for things like broken ACPI tables in many laptops which require lots of work-arounds to have features like suspend-to-disk working, or even be able to run Linux at all. > > > Chad of Adtran has previously confirmed that [1], which was the > > positive feedback I was refering to. Other vendors like GL-iNet or > > Netgear are doing the exact same thing. > > > > As of now, we support this in OpenWrt by adding a lot of > > board-specific knowledge to userland, which is ugly and also prevents > > using things like PXE-initiated nfsroot on those devices. > > > > The purpose of this series is to be able to properly support such devices > > (ie. practially all consumer-grade routers out there using an eMMC for > > storing firmware). > > > > Also, those devices have enough resources to run a general purpose > > distribution like Debian instead of OpenWrt, and all the userland > > hacks to set MAC addresses and extract WiFi-EEPROM-data in a > > board-specific ways will most certainly never find their way into > > Debian. It's just not how embedded Linux works, unless you are looking > > only at the RaspberryPi which got that data stored in a textfile > > which is shipped by the distribution -- something very weird and very > > different from literally all of-the-shelf routers, access-points or > > switches I have ever seen (and I've seen many).
Maybe Felix who has > > seen even more of them can tell us more about that. > > General purpose distros want to partition the disk themselves. Adding > anything to the DT for disk partitions would require the installer to be > aware of it. There's various distro folks on the boot-arch list, so > maybe one of them can comment. Usually the installers are already aware to not touch partitions when unaware of their purpose. Repartitioning the disk from scratch is not what (modern) distributions are doing, at least the EFI System partition is kept, as well as typical rescue/recovery partitions many vendors put on their (Windows, Mac) laptops to allow to "factory reset" them. Installers usually offer to replace (or resize) the "large" partition used by the currently installed OS instead. And well, the DT reference to a partition holding e.g. MAC addresses does make the installer aware of it, obviously. Thank you for the constructive debate! Cheers Daniel > > Rob > > [1] https://arm-software.github.io/ebbr/index.html#document-chapter4-firmware-media
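To make the offset-versus-partition point above concrete, here is a minimal userspace sketch in C. It assumes a simplified partition entry (the struct and both function names are hypothetical, not taken from the series or from the kernel) and only illustrates two things said in the thread: a partition reference can be turned into a whole-disk byte offset once the partition table has been read, and such an offset can exceed what a signed 32-bit address is able to hold on a multi-gigabyte eMMC user area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one partition table entry. */
struct part_entry {
	uint64_t start_lba;	/* first sector of the partition */
	uint64_t num_sectors;	/* length of the partition in sectors */
};

/* Byte offset of the partition start, relative to the whole disk. */
uint64_t part_byte_offset(const struct part_entry *p, uint32_t sector_size)
{
	assert(sector_size > 0);
	return p->start_lba * (uint64_t)sector_size;
}

/* True if the offset is representable as a signed 32-bit integer. */
bool fits_in_s32(uint64_t offset)
{
	return offset <= (uint64_t)INT32_MAX;
}
```

With 512-byte sectors, a partition starting at LBA 2048 sits at 1 MiB and fits easily, while one starting at LBA 8388608 sits at 4 GiB and does not fit a signed 32-bit address.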
On Tue, Mar 26, 2024 at 4:29 PM Daniel Golle <daniel@makrotopia.org> wrote: > > Hi Rob, > > On Tue, Mar 26, 2024 at 03:24:49PM -0500, Rob Herring wrote: > > +boot-architecture list > > Good idea, thank you :) Now really adding it. :( Will reply to rest later.
On Fri, Mar 22, 2024 at 12:22:32PM -0700, Bart Van Assche wrote: > On 3/22/24 11:07, Daniel Golle wrote: > > On Fri, Mar 22, 2024 at 10:49:48AM -0700, Bart Van Assche wrote: > > > On 3/21/24 12:33, Daniel Golle wrote: > > > > enum { > > > > GENHD_FL_REMOVABLE = 1 << 0, > > > > GENHD_FL_HIDDEN = 1 << 1, > > > > GENHD_FL_NO_PART = 1 << 2, > > > > + GENHD_FL_NVMEM = 1 << 3, > > > > }; > > > > > > What would break if this flag wouldn't exist? > > > > As both, MTD and UBI already act as NVMEM providers themselves, once > > the user creates a ubiblock device or got CONFIG_MTD_BLOCK=y set in their > > kernel configuration, we would run into problems because both, the block > > layer as well as MTD or UBI would try to be an NVMEM provider for the same > > device tree node. > > Why would both MTD and UBI try to be an NVMEM provider for the same > device tree node? I didn't mean that both MTD and UBI would **simultaneously** try to act as NVMEM providers for the same device tree node. What I meant was that either of them can act as an NVMEM provider while at the same time also providing an emulated block device (mtdblock xor ubiblock). Hence those emulated block devices will have to be excluded from acting as NVMEM providers. In this patch I suggest doing this by having block drivers which should potentially provide NVMEM (typically mmcblk) opt in. I apologize for the confusion and assume that wasn't clear from the wording I've used. I hope it's clearer now. Alternatively it could also be solved via opt-out of ubiblock and mtdblock devices using the inverted flag (GENHD_FL_NO_NVMEM) -- however, this has previously been criticized and I was asked to rather make it opt-in.[1] > Why can't this patch series be implemented such that > a partition UUID occurs in the device tree and such that other code > scans for that partition UUID? This is actually one way this very series allows one to handle this: by identifying a partition using its partuuid.
However, it's also quite common that the MMC boot **hardware** partitions are used to store MAC addresses and/or Wi-Fi calibration data. In this case there is no partition table and the NVMEM provider has to act directly on the whole disk device (which is only a few megabytes in size in case of those mmcblkXbootY devices and never has a partition table). [1]: https://patchwork.kernel.org/comment/25432948/
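The opt-in idea discussed above can be sketched roughly as follows. This is a userspace mock, not the code from the patch: the flag values are copied from the quoted hunk, while struct mock_disk and disk_may_provide_nvmem() are hypothetical names standing in for the kernel's disk structure and for whatever check the block layer would perform.

```c
#include <stdbool.h>

/* Flag values as in the quoted hunk; GENHD_FL_NVMEM is the proposed one. */
enum {
	GENHD_FL_REMOVABLE = 1 << 0,
	GENHD_FL_HIDDEN    = 1 << 1,
	GENHD_FL_NO_PART   = 1 << 2,
	GENHD_FL_NVMEM     = 1 << 3,
};

/* Hypothetical userspace stand-in for the kernel's struct gendisk. */
struct mock_disk {
	const char *name;
	unsigned int flags;
};

/* Only disks whose driver opted in may be registered as NVMEM providers. */
bool disk_may_provide_nvmem(const struct mock_disk *disk)
{
	return (disk->flags & GENHD_FL_NVMEM) != 0;
}
```

In this model a driver such as mmcblk would set GENHD_FL_NVMEM on its disks, while mtdblock and ubiblock would leave it unset, so the block layer would never compete with MTD or UBI for the same device tree node.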